developer_uid: rirv938
submission_id: chaiml-reward-dpo-434a-_51521_v5
model_name: chaiml-reward-dpo-434a-_51521_v5
model_group: ChaiML/reward-dpo-434a-c
status: inactive
timestamp: 2026-04-06T20:57:12+00:00
num_battles: 1027324
num_wins: 542518
celo_rating: 1322.19
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1
model_architecture: Glm4MoeForCausalLM
model_num_parameters: 9073971200.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-reward-dpo-434a-_51521_v5
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1
model_size: 9B
ranking_group: single
us_pacific_date: 2026-04-06
win_ratio: 0.5280885095646554
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 60, 'presence_penalty': 0.1, 'frequency_penalty': 0.0, 'stopping_words': ['###', '</s>', '<|im_end|>', 'You:', '<|im_start|>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '<|assistant|>\n<think></think>\n{bot_name}:', 'truncate_by_message': False}
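Note on the formatter above: the record does not show how the serving stack combines these templates, so the following is only a minimal sketch. It assumes the prompt is the plain concatenation of memory_template, prompt_template, one formatted line per chat turn, and finally response_template. The function name build_prompt, the turn tuples, and the example names "Sam" and "Aria" are hypothetical.

```python
# Formatter values copied from the record above.
FORMATTER = {
    "memory_template": "[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "<|assistant|>\n<think></think>\n{bot_name}:",
    "truncate_by_message": False,
}


def build_prompt(formatter, bot_name, turns):
    """Assemble a prompt string (assumed ordering, not confirmed by the log).

    turns is a list of (role, speaker_name, message) tuples, where role is
    either "user" or "bot".
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for role, speaker_name, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=speaker_name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=speaker_name, message=message))
    # The response template leaves the bot's name open for the model to continue.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    example = build_prompt(FORMATTER, "Aria", [("user", "Sam", "Hi"), ("bot", "Aria", "Hello")])
    print(example)
```

Under this assumption the empty `<think></think>` pair in response_template sits immediately before the bot name, so generation starts directly at the bot's reply. Truncation to max_input_tokens is not modelled here.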
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-reward-dpo-434a-51521-v5-uploader
Waiting for job on chaiml-reward-dpo-434a-51521-v5-uploader to finish
chaiml-reward-dpo-434a-51521-v5-uploader: Using quantization_mode: none
chaiml-reward-dpo-434a-51521-v5-uploader: Downloading snapshot of ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1...
chaiml-reward-dpo-434a-51521-v5-uploader: Processed model ChaiML/reward-dpo-434a-chaiml-glm-air-4-5-sft-_62396_v1 in 154.831s
chaiml-reward-dpo-434a-51521-v5-uploader: creating bucket guanaco-vllm-models
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-reward-dpo-434a-51521-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-434a-51521-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-reward-dpo-434a-51521-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-reward-dpo-434a-51521-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-reward-dpo-434a-51521-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-reward-dpo-434a-51521-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-reward-dpo-434a-51521-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/chat_template.jinja
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/README.md
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/args.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/config.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/.gitattributes
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/special_tokens_map.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/tokenizer_config.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model.safetensors.index.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/tokenizer.json
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00043-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00043-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00007-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00007-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00037-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00037-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00003-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00003-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00022-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00022-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00005-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00005-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00033-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00033-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00039-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00039-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00009-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00009-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00012-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00012-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00013-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00013-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00034-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00034-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00001-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00001-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00010-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00010-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00036-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00036-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00006-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00006-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00038-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00038-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00002-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00002-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00041-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00041-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00004-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00004-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00032-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00032-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00035-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00035-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00020-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00020-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00031-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00031-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00030-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00030-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00024-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00024-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00021-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00021-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00040-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00040-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00016-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00016-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00028-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00028-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00027-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00027-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00026-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00026-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00015-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00015-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00014-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00014-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00023-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00023-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00029-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00029-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00018-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00018-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00025-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00025-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00017-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00017-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00011-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00011-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00042-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00042-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00019-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00019-of-00043.safetensors
chaiml-reward-dpo-434a-51521-v5-uploader: cp /dev/shm/model_output/model-00008-of-00043.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-434a-51521-v5/default/model-00008-of-00043.safetensors
Job chaiml-reward-dpo-434a-51521-v5-uploader completed after 225.92s with status: succeeded
Stopping job with name chaiml-reward-dpo-434a-51521-v5-uploader
Pipeline stage VLLMUploader completed in 227.16s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.63s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-reward-dpo-434a-51521-v5
Waiting for inference service chaiml-reward-dpo-434a-51521-v5 to be ready
Inference service chaiml-reward-dpo-434a-51521-v5 ready after 273.88972640037537s
Pipeline stage VLLMDeployer completed in 274.94s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.7045819759368896s
Received healthy response to inference request in 2.1165406703948975s
Received healthy response to inference request in 2.2408041954040527s
Received healthy response to inference request in 2.1528983116149902s
Received healthy response to inference request in 2.3623573780059814s
Received healthy response to inference request in 2.1836955547332764s
Received healthy response to inference request in 2.247235059738159s
Received healthy response to inference request in 2.249321460723877s
Received healthy response to inference request in 2.1764163970947266s
Received healthy response to inference request in 2.12288236618042s
Received healthy response to inference request in 2.821547746658325s
Received healthy response to inference request in 2.1540403366088867s
Received healthy response to inference request in 2.2787671089172363s
Received healthy response to inference request in 2.2051873207092285s
Received healthy response to inference request in 2.357527256011963s
Received healthy response to inference request in 2.207740068435669s
Received healthy response to inference request in 2.1195812225341797s
Received healthy response to inference request in 2.2530150413513184s
Received healthy response to inference request in 2.287503480911255s
Received healthy response to inference request in 2.3843812942504883s
Received healthy response to inference request in 2.139456033706665s
Received healthy response to inference request in 2.2838947772979736s
Received healthy response to inference request in 2.711512327194214s
Received healthy response to inference request in 2.191019058227539s
Received healthy response to inference request in 2.136230230331421s
Received healthy response to inference request in 2.1458523273468018s
Received healthy response to inference request in 2.1851325035095215s
Received healthy response to inference request in 2.268540620803833s
Received healthy response to inference request in 2.243255376815796s
Received healthy response to inference request in 2.1459779739379883s
30 requests
0 failed requests
5th percentile: 2.1210667371749876
10th percentile: 2.134895443916321
20th percentile: 2.145952844619751
30th percentile: 2.1697035789489747
40th percentile: 2.188664436340332
50th percentile: 2.224272131919861
60th percentile: 2.2480696201324464
70th percentile: 2.271608567237854
80th percentile: 2.3015082359313968
90th percentile: 2.416401362419129
95th percentile: 2.708393669128418
99th percentile: 2.789637475013733
mean time: 2.269229849179586
Pipeline stage StressChecker completed in 86.14s
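Note on the StressChecker summary above: the reported percentiles are consistent with linear interpolation between the sorted latencies (the default behaviour of numpy.percentile), although the log does not say which tool produced them. A minimal standard-library sketch that recomputes the summary from the 30 logged response times follows; the helper name percentile is hypothetical.

```python
import math

# Response times in seconds, copied from the 30 "healthy response" lines above.
LATENCIES = [
    2.7045819759368896, 2.1165406703948975, 2.2408041954040527,
    2.1528983116149902, 2.3623573780059814, 2.1836955547332764,
    2.247235059738159, 2.249321460723877, 2.1764163970947266,
    2.12288236618042, 2.821547746658325, 2.1540403366088867,
    2.2787671089172363, 2.2051873207092285, 2.357527256011963,
    2.207740068435669, 2.1195812225341797, 2.2530150413513184,
    2.287503480911255, 2.3843812942504883, 2.139456033706665,
    2.2838947772979736, 2.711512327194214, 2.191019058227539,
    2.136230230331421, 2.1458523273468018, 2.1851325035095215,
    2.268540620803833, 2.243255376815796, 2.1459779739379883,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


if __name__ == "__main__":
    print(f"{len(LATENCIES)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {math.fsum(LATENCIES) / len(LATENCIES)}")
```

With only 30 samples, the 95th and 99th percentiles are determined almost entirely by the three slowest requests (about 2.70 s, 2.71 s, and 2.82 s), so they are best read as rough indicators rather than stable tail estimates.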
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.37s
Shutdown handler de-registered
chaiml-reward-dpo-434a-_51521_v5 status is now deployed due to DeploymentManager action
chaiml-reward-dpo-434a-_51521_v5 status is now inactive due to auto deactivation removed underperforming models