developer_uid: rirv938
submission_id: chaiml-reward-dpo-434a-_51521_v3
model_name: chaiml-reward-dpo-434a-_51521_v3
model_group: ChaiML/reward-dpo-434a-c
status: inactive
timestamp: 2026-03-09T21:48:53+00:00
num_battles: 13914
num_wins: 7439
celo_rating: 1307.35
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1
model_architecture: Glm4MoeForCausalLM
model_num_parameters: 9073971200.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-reward-dpo-434a-_51521_v3
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1
model_size: 9B
ranking_group: single
us_pacific_date: 2026-03-09
win_ratio: 0.5346413684059221
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 60, 'presence_penalty': 0.1, 'frequency_penalty': 0.0, 'stopping_words': ['You:', '<|im_end|>', '</s>', '###', '<|im_start|>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '<|assistant|>\n<think></think>\n{bot_name}:', 'truncate_by_message': False}
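The formatter entry above lists five templates but not how they are combined. The sketch below shows one plausible assembly order (memory, prompt, conversation turns, then the response prefix). The function name `build_prompt`, the `turns` structure, the concatenation order, and the example names "Aria" and "Sam" are all assumptions for illustration and are not taken from the serving code.

```python
# Template strings copied from the formatter entry; truncate_by_message is omitted.
formatter = {
    "memory_template": "[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "<|assistant|>\n<think></think>\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the conversation as the bot.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


prompt = build_prompt(formatter, "Aria", "Sam", [("user", "Hi"), ("bot", "Hello")])
print(prompt)
```

Under these assumptions the empty `<think></think>` pair in the response template pre-fills the reasoning section, so generation would start directly at the bot's reply.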
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-reward-dpo-434a-51521-v3-uploader
Waiting for job on chaiml-reward-dpo-434a-51521-v3-uploader to finish
chaiml-reward-dpo-434a-51521-v3-uploader: Using quantization_mode: none
chaiml-reward-dpo-434a-51521-v3-uploader: Downloading snapshot of ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1...
chaiml-reward-dpo-434a-51521-v3-uploader: Downloaded in 87.398s
chaiml-reward-dpo-434a-51521-v3-uploader: Processed model ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1 in 165.857s
chaiml-reward-dpo-434a-51521-v3-uploader: creating bucket guanaco-vllm-models
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-reward-dpo-434a-51521-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-reward-dpo-434a-51521-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-reward-dpo-434a-51521-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-reward-dpo-434a-51521-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-reward-dpo-434a-51521-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-reward-dpo-434a-51521-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/.gitattributes
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/chat_template.jinja
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/args.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/tokenizer_config.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/README.md
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/config.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model.safetensors.index.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/special_tokens_map.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/tokenizer.json
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00010-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00010-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00017-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00017-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00043-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00043-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00012-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00012-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00013-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00013-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00042-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00042-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00041-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00041-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00021-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00021-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00004-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00004-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00028-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00028-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00039-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00039-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00020-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00020-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00008-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00008-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00009-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00009-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00025-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00025-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00030-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00030-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00032-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00032-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00011-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00011-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00035-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00035-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00029-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00029-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00015-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00015-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00036-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00036-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00002-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00002-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00023-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00023-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00038-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00038-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00027-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00027-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00026-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00026-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00024-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00024-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00034-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00034-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00037-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00037-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00006-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00006-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00018-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00018-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00019-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00019-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00033-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00033-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00001-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00001-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00031-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00031-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00007-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00007-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00022-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00022-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00016-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00016-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00040-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00040-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00005-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00005-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v3-uploader: cp /dev/shm/model_output/model-00003-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v3/default/model-00003-of-00043.safetensors
Job chaiml-reward-dpo-434a-51521-v3-uploader completed after 236.93s with status: succeeded
Stopping job with name chaiml-reward-dpo-434a-51521-v3-uploader
Pipeline stage VLLMUploader completed in 238.22s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.91s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-reward-dpo-434a-51521-v3
Waiting for inference service chaiml-reward-dpo-434a-51521-v3 to be ready
Inference service chaiml-reward-dpo-434a-51521-v3 ready after 294.20774126052856s
Pipeline stage VLLMDeployer completed in 295.26s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.23626708984375s
Received healthy response to inference request in 2.7010278701782227s
Received healthy response to inference request in 2.409679889678955s
Received healthy response to inference request in 2.293898820877075s
Received healthy response to inference request in 2.494403600692749s
Received healthy response to inference request in 2.423074722290039s
Received healthy response to inference request in 2.632021903991699s
Received healthy response to inference request in 2.251234292984009s
Received healthy response to inference request in 2.4638636112213135s
Received healthy response to inference request in 2.5368306636810303s
Received healthy response to inference request in 2.2213854789733887s
Received healthy response to inference request in 2.179374933242798s
Received healthy response to inference request in 2.403385639190674s
Received healthy response to inference request in 2.388150691986084s
Received healthy response to inference request in 2.3153979778289795s
Received healthy response to inference request in 2.4912221431732178s
Received healthy response to inference request in 2.380303144454956s
Received healthy response to inference request in 2.189296007156372s
Received healthy response to inference request in 2.613694190979004s
Received healthy response to inference request in 2.529881238937378s
Received healthy response to inference request in 2.308821439743042s
Received healthy response to inference request in 2.2294845581054688s
Received healthy response to inference request in 3.063305377960205s
Received healthy response to inference request in 2.337890386581421s
Received healthy response to inference request in 2.461939573287964s
Received healthy response to inference request in 2.25300669670105s
Received healthy response to inference request in 2.2254960536956787s
Received healthy response to inference request in 2.286345958709717s
Received healthy response to inference request in 2.3172457218170166s
Received healthy response to inference request in 2.596599578857422s
30 requests
0 failed requests
5th percentile: 2.2037362694740295
10th percentile: 2.22508499622345
20th percentile: 2.2526522159576414
30th percentile: 2.304344654083252
40th percentile: 2.329632520675659
50th percentile: 2.395768165588379
60th percentile: 2.438620662689209
70th percentile: 2.492176580429077
80th percentile: 2.5487844467163088
90th percentile: 2.638922500610352
95th percentile: 2.900280499458312
99th percentile: 3.186108193397522
mean time: 2.441150975227356
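The summary figures above can be reproduced from the 30 logged response times. The sketch below assumes linear interpolation between order statistics (the default in many numerical libraries), which matches the logged values. The helper name `percentile` is hypothetical and is not claimed to be the StressChecker's own code.

```python
import math

# Response times in seconds, copied from the StressChecker log lines above.
latencies = [
    3.23626708984375, 2.7010278701782227, 2.409679889678955,
    2.293898820877075, 2.494403600692749, 2.423074722290039,
    2.632021903991699, 2.251234292984009, 2.4638636112213135,
    2.5368306636810303, 2.2213854789733887, 2.179374933242798,
    2.403385639190674, 2.388150691986084, 2.3153979778289795,
    2.4912221431732178, 2.380303144454956, 2.189296007156372,
    2.613694190979004, 2.529881238937378, 2.308821439743042,
    2.2294845581054688, 3.063305377960205, 2.337890386581421,
    2.461939573287964, 2.25300669670105, 2.2254960536956787,
    2.286345958709717, 2.3172457218170166, 2.596599578857422,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
mean_time = sum(latencies) / len(latencies)
print(len(latencies), "requests")
print("5th percentile:", p5)
print("50th percentile:", p50)
print("mean time:", mean_time)
```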
Pipeline stage StressChecker completed in 79.97s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.38s
Shutdown handler de-registered
chaiml-reward-dpo-434a-_51521_v3 status is now deployed due to DeploymentManager action
chaiml-reward-dpo-434a-_51521_v3 status is now inactive due to auto deactivation removed underperforming models