developer_uid: chai_evaluation_service
submission_id: evelyn777-chai-sft-3b-v2_v1
model_name: evelyn777-chai-sft-3b-v2_v1
model_group: evelyn777/chai-sft-3b-v2
status: inactive
timestamp: 2026-02-08T03:17:06+00:00
num_battles: 10490
num_wins: 3632
celo_rating: 1197.62
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: evelyn777/chai-sft-3b-v2
model_architecture: Qwen2ForCausalLM
model_num_parameters: 3397011456
best_of: 8
max_input_tokens: 2048
max_output_tokens: 64
reward_model: default
display_name: evelyn777-chai-sft-3b-v2_v1
is_internal_developer: True
language_model: evelyn777/chai-sft-3b-v2
model_size: 3B
ranking_group: single
us_pacific_date: 2026-02-07
win_ratio: 0.34623450905624403
generation_params: {'temperature': 0.85, 'top_p': 0.9, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n', 'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
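The formatter above defines ChatML-style templates for each message role. As a minimal sketch of how such templates could be assembled into a prompt (assuming simple str.format substitution; the conversation content below is invented for illustration):

```python
# Sketch: assemble a ChatML-style prompt from the formatter templates above.
# Assumes plain str.format substitution; the messages themselves are invented.
formatter = {
    "memory_template": "<|im_start|>system\n{memory}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{user_name}: {message}<|im_end|>\n",
    "bot_template": "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n{bot_name}:",
}

prompt = (
    formatter["memory_template"].format(memory="You are a friendly companion.")
    + formatter["user_template"].format(user_name="User", message="Hi there!")
    + formatter["bot_template"].format(bot_name="Bot", message="Hello!")
    + formatter["user_template"].format(user_name="User", message="How are you?")
    + formatter["response_template"].format(bot_name="Bot")
)
print(prompt)
```

The prompt ends with the open response template (`<|im_start|>assistant\nBot:`), so the model's completion continues the bot's turn; generation then stops at the first `\n` per the stopping_words above.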
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage VLLMUploader
Starting job with name evelyn777-chai-sft-3b-v2-v1-uploader
Waiting for job on evelyn777-chai-sft-3b-v2-v1-uploader to finish
evelyn777-chai-sft-3b-v2-v1-uploader: Using quantization_mode: none
evelyn777-chai-sft-3b-v2-v1-uploader: Downloading snapshot of evelyn777/chai-sft-3b-v2...
evelyn777-chai-sft-3b-v2-v1-uploader: Fetching 13 files: 100%|██████████| 13/13 [00:04<00:00, 3.14it/s]
evelyn777-chai-sft-3b-v2-v1-uploader: Downloaded in 4.415s
evelyn777-chai-sft-3b-v2-v1-uploader: Processed model evelyn777/chai-sft-3b-v2 in 6.744s
evelyn777-chai-sft-3b-v2-v1-uploader: creating bucket guanaco-vllm-models
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v2-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
evelyn777-chai-sft-3b-v2-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v2-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v2-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v2-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v2-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
evelyn777-chai-sft-3b-v2-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
evelyn777-chai-sft-3b-v2-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
evelyn777-chai-sft-3b-v2-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
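The SyntaxWarning spam above comes from regex escapes written inside plain (non-raw) Python string literals in the system s3cmd package; the warnings are harmless here but fixable. A short sketch of the fix, using one of the patterns from the log:

```python
import re

# s3cmd's sources write regex escapes inside plain string literals, e.g.
#     re.compile("([^a-z0-9\.-])")
# '\.' is not a valid Python string escape, so recent Python versions emit
# a SyntaxWarning at compile time. Prefixing the literal with r makes it a
# raw string, leaving the backslash intact for the regex engine and
# silencing the warning:
pattern_fixed = re.compile(r"([^a-z0-9\.-])")  # bucket-name validity check

# Same behaviour, no warning: flags the first character that is not
# lowercase alphanumeric, a dot, or a hyphen.
print(pattern_fixed.search("My_Bucket").group(1))  # -> M
```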
evelyn777-chai-sft-3b-v2-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
evelyn777-chai-sft-3b-v2-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/.gitattributes
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/config.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/special_tokens_map.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/generation_config.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/model.safetensors.index.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/added_tokens.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/tokenizer_config.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/chat_template.jinja
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/merges.txt
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/vocab.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/tokenizer.json
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/model-00002-of-00002.safetensors
evelyn777-chai-sft-3b-v2-v1-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v2-v1/model-00001-of-00002.safetensors
Job evelyn777-chai-sft-3b-v2-v1-uploader completed after 62.82s with status: succeeded
Stopping job with name evelyn777-chai-sft-3b-v2-v1-uploader
Pipeline stage VLLMUploader completed in 63.72s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
Running pipeline stage VLLMDeployer
Creating inference service evelyn777-chai-sft-3b-v2-v1
Waiting for inference service evelyn777-chai-sft-3b-v2-v1 to be ready
Inference service evelyn777-chai-sft-3b-v2-v1 ready after 171.37324285507202s
Pipeline stage VLLMDeployer completed in 171.78s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.2049269676208496s
Received healthy response to inference request in 0.40386366844177246s
Received healthy response to inference request in 0.5359532833099365s
Received healthy response to inference request in 0.6057980060577393s
Received healthy response to inference request in 1.3156394958496094s
Received healthy response to inference request in 0.4224388599395752s
Received healthy response to inference request in 0.43027472496032715s
Received healthy response to inference request in 0.40160322189331055s
Received healthy response to inference request in 1.221550703048706s
Received healthy response to inference request in 0.5694904327392578s
Received healthy response to inference request in 0.48583436012268066s
Received healthy response to inference request in 0.7652857303619385s
Received healthy response to inference request in 0.708122968673706s
Received healthy response to inference request in 0.4273533821105957s
Received healthy response to inference request in 1.033994436264038s
Received healthy response to inference request in 0.9101884365081787s
Received healthy response to inference request in 0.632554292678833s
Received healthy response to inference request in 0.5113465785980225s
Received healthy response to inference request in 1.8983163833618164s
Received healthy response to inference request in 0.8038809299468994s
Received healthy response to inference request in 0.46926379203796387s
Received healthy response to inference request in 0.5519230365753174s
Received healthy response to inference request in 0.570284366607666s
Received healthy response to inference request in 0.48685741424560547s
Received healthy response to inference request in 0.9450385570526123s
Received healthy response to inference request in 0.5561373233795166s
Received healthy response to inference request in 0.4637482166290283s
Received healthy response to inference request in 0.5994908809661865s
Received healthy response to inference request in 0.7126648426055908s
Received healthy response to inference request in 0.5300374031066895s
30 requests
0 failed requests
5th percentile: 0.4122225046157837
10th percentile: 0.42686192989349364
20th percentile: 0.46816067695617675
30th percentile: 0.5039998292922974
40th percentile: 0.5455351352691651
50th percentile: 0.5698873996734619
60th percentile: 0.6165005207061767
70th percentile: 0.728451108932495
80th percentile: 0.9171584606170655
90th percentile: 1.2065893411636353
95th percentile: 1.2732995390892026
99th percentile: 1.7293400859832768
mean time: 0.705795423189799
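Percentile figures like those above are typically computed by linear interpolation between order statistics of the sorted latency sample (the default scheme in numpy, for instance). A minimal sketch under that assumption, with a small invented sample rather than the 30 latencies above:

```python
import math

def percentile(samples, p):
    """Linear-interpolation percentile over a sorted sample
    (the default scheme numpy.percentile uses)."""
    xs = sorted(samples)
    k = (len(xs) - 1) * p / 100.0      # fractional rank into the sorted sample
    lo, hi = math.floor(k), math.ceil(k)
    if lo == hi:
        return xs[int(k)]
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)  # interpolate between neighbours

latencies = [0.40, 0.55, 0.61, 1.20]    # invented sample, seconds
print(percentile(latencies, 50))        # midpoint of the two middle values
print(sum(latencies) / len(latencies))  # mean time
```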
Pipeline stage StressChecker completed in 24.08s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.95s
Shutdown handler de-registered
evelyn777-chai-sft-3b-v2_v1 status is now deployed due to DeploymentManager action
evelyn777-chai-sft-3b-v2_v1 status is now inactive due to auto-deactivation of underperforming models