developer_uid: rirv938
submission_id: chaiml-reward-dpo-0e8a-_16483_v1
model_name: chaiml-reward-dpo-0e8a-_16483_v1
model_group: ChaiML/reward-dpo-0e8a-c
status: protected
timestamp: 2026-01-31T01:03:09+00:00
num_battles: 11332
num_wins: 6465
celo_rating: 1347.88
family_friendly_score: 0.55
family_friendly_standard_error: 0.007035623639735145
submission_type: basic
model_repo: ChaiML/reward-dpo-0e8a-chaiml-kimid-v9-opusdv_83165_v12-int4-mixed
model_architecture: Qwen3MoeForCausalLM
model_num_parameters: 18790207488.0
best_of: 4
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-reward-dpo-0e8a-_16483_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/reward-dpo-0e8a-chaiml-kimid-v9-opusdv_83165_v12-int4-mixed
model_size: 19B
ranking_group: single
us_pacific_date: 2026-01-28
win_ratio: 0.5705082950935404
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['</think>', '<|im_end|>', '<|user|>', '####', '</s>', '<|assistant|>'], 'max_input_tokens': 2048, 'best_of': 4, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
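The `formatter` entry above only lists template strings, so the following is a minimal sketch of how they might be combined into one prompt. The function name `render_prompt`, the `(speaker, message)` turn representation, and the assembly order (memory, prompt, conversation turns, response stub) are assumptions made for illustration. Truncation (`truncate_by_message`, `max_input_tokens`) is not modeled.

```python
# Template strings copied from the `formatter` entry in the metadata above.
MEMORY_TEMPLATE = "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n"
PROMPT_TEMPLATE = ""
BOT_TEMPLATE = "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n"
USER_TEMPLATE = "<|im_start|>user\n{message}<|im_end|>\n"
RESPONSE_TEMPLATE = "<|im_start|>assistant\n{bot_name}:"


def render_prompt(bot_name, memory, turns):
    """Assemble a prompt from the templates (hypothetical helper).

    `turns` is a list of (speaker, message) pairs where speaker is
    "user" or "bot". The ordering of the pieces is an assumption.
    """
    parts = [
        MEMORY_TEMPLATE.format(bot_name=bot_name, memory=memory),
        PROMPT_TEMPLATE,  # empty in this submission
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(BOT_TEMPLATE.format(bot_name=bot_name, message=message))
        else:
            parts.append(USER_TEMPLATE.format(message=message))
    # The response stub is left open so the model continues after "{bot_name}:".
    parts.append(RESPONSE_TEMPLATE.format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Illustrative values only; none of these come from the log.
    print(render_prompt("Ava", "A friendly guide.", [("user", "Hi"), ("bot", "Hello!")]))
```

Under this reading, generation would stop at one of the listed `stopping_words` such as `<|im_end|>`, which matches the closing tag used by every template.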
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-reward-dpo-0e8a-16483-v1-uploader
Waiting for job on chaiml-reward-dpo-0e8a-16483-v1-uploader to finish
chaiml-reward-dpo-0e8a-16483-v1-uploader: /root/miniconda3/envs/nvidia/lib/python3.11/site-packages/mk1/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
chaiml-reward-dpo-0e8a-16483-v1-uploader: __import__('pkg_resources').declare_namespace(__name__)
chaiml-reward-dpo-0e8a-16483-v1-uploader: [ASCII-art logo omitted]
chaiml-reward-dpo-0e8a-16483-v1-uploader: Version: 0.30.6+torch280
chaiml-reward-dpo-0e8a-16483-v1-uploader: Features: FLYWHEEL, CUDA
chaiml-reward-dpo-0e8a-16483-v1-uploader: Copyright 2023-2025 MK ONE TECHNOLOGIES Inc.
chaiml-reward-dpo-0e8a-16483-v1-uploader: https://mk1.ai
chaiml-reward-dpo-0e8a-16483-v1-uploader: The license key for the current software has been verified as belonging to:
chaiml-reward-dpo-0e8a-16483-v1-uploader: Chai Research Corp.
chaiml-reward-dpo-0e8a-16483-v1-uploader: Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f
chaiml-reward-dpo-0e8a-16483-v1-uploader: Expiration: 2028-03-31 23:59:59
chaiml-reward-dpo-0e8a-16483-v1-uploader: Downloaded to shared memory in 207.902s
chaiml-reward-dpo-0e8a-16483-v1-uploader: Processed model ChaiML/reward-dpo-0e8a-chaiml-kimid-v9-opusdv_83165_v12-int4-mixed in 264.969s
chaiml-reward-dpo-0e8a-16483-v1-uploader: creating bucket guanaco-vllm-models
chaiml-reward-dpo-0e8a-16483-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-reward-dpo-0e8a-16483-v1-uploader: uploading /dev/shm/model_cache to s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/added_tokens.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/added_tokens.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/tokenizer_config.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/chat_template.jinja s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/chat_template.jinja
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/special_tokens_map.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/config.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/config.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/quantization_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/quantization_config.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/merges.txt s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/merges.txt
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/.gitattributes s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/.gitattributes
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/vocab.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/vocab.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model.safetensors.index.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/generation_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/generation_config.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/tokenizer.json
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00027-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00010-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00002-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00009-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00015-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00014-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00016-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00012-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00004-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00008-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00001-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00023-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00017-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00022-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00020-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00021-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00024-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00026-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00018-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00011-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00003-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00019-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00006-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00007-of-00027.safetensors
chaiml-reward-dpo-0e8a-16483-v1-uploader: cp /dev/shm/model_cache/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-0e8a-16483-v1/model-00005-of-00027.safetensors
Job chaiml-reward-dpo-0e8a-16483-v1-uploader completed after 475.34s with status: succeeded
Stopping job with name chaiml-reward-dpo-0e8a-16483-v1-uploader
Pipeline stage VLLMUploader completed in 476.52s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.24s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-reward-dpo-0e8a-16483-v1
Waiting for inference service chaiml-reward-dpo-0e8a-16483-v1 to be ready
Inference service chaiml-reward-dpo-0e8a-16483-v1 ready after 456.1893229484558s
Pipeline stage VLLMDeployer completed in 457.24s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.4195210933685303s
Received healthy response to inference request in 2.3126864433288574s
Received healthy response to inference request in 2.605302572250366s
Received healthy response to inference request in 2.193502426147461s
Received healthy response to inference request in 2.1950957775115967s
Received healthy response to inference request in 2.0782222747802734s
Received healthy response to inference request in 2.2718448638916016s
Received healthy response to inference request in 2.3243393898010254s
Received healthy response to inference request in 2.4660863876342773s
Received healthy response to inference request in 2.388382911682129s
Received healthy response to inference request in 2.7228407859802246s
Received healthy response to inference request in 2.7556183338165283s
Received healthy response to inference request in 2.1219723224639893s
Received healthy response to inference request in 2.7640843391418457s
Received healthy response to inference request in 2.395376682281494s
Received healthy response to inference request in 2.609255313873291s
Received healthy response to inference request in 2.211695432662964s
Received healthy response to inference request in 2.2991697788238525s
Received healthy response to inference request in 2.2111518383026123s
Received healthy response to inference request in 2.128801107406616s
Received healthy response to inference request in 2.469968795776367s
Received healthy response to inference request in 2.1150293350219727s
Received healthy response to inference request in 2.248892307281494s
Received healthy response to inference request in 2.25408673286438s
Received healthy response to inference request in 2.278292655944824s
Received healthy response to inference request in 2.3481438159942627s
Received healthy response to inference request in 2.1192052364349365s
Received healthy response to inference request in 2.2686493396759033s
Received healthy response to inference request in 2.224749803543091s
Received healthy response to inference request in 2.1024603843688965s
30 requests
0 failed requests
5th percentile: 2.1081164121627807
10th percentile: 2.11878764629364
20th percentile: 2.180562162399292
30th percentile: 2.2115323543548584
40th percentile: 2.2520089626312254
50th percentile: 2.275068759918213
60th percentile: 2.3173476219177247
70th percentile: 2.3904810428619383
80th percentile: 2.4970355510711673
90th percentile: 2.726118540763855
95th percentile: 2.7602746367454527
99th percentile: 3.2294444346427924
mean time: 2.3634809494018554
Pipeline stage StressChecker completed in 75.93s
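The percentile and mean figures reported by the StressChecker stage can be reproduced from the 30 latencies listed above. The sketch below assumes linear interpolation between order statistics, which agrees with the logged values; the log does not say which library or method the checker actually uses. Latencies are rounded to six decimals here, and the names `LATENCIES_S` and `percentile` are illustrative.

```python
import math

# The 30 response times from the log, in seconds, rounded to 6 decimals.
LATENCIES_S = [
    3.419521, 2.312686, 2.605303, 2.193502, 2.195096, 2.078222,
    2.271845, 2.324339, 2.466086, 2.388383, 2.722841, 2.755618,
    2.121972, 2.764084, 2.395377, 2.609255, 2.211695, 2.299170,
    2.211152, 2.128801, 2.469969, 2.115029, 2.248892, 2.254087,
    2.278293, 2.348144, 2.119205, 2.268649, 2.224750, 2.102460,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


if __name__ == "__main__":
    print(f"{len(LATENCIES_S)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES_S, q):.6f}")
    print(f"mean time: {sum(LATENCIES_S) / len(LATENCIES_S):.6f}")
```

The 99th percentile (about 3.23 s) is dominated by the single 3.42 s first request, which is the only sample above 2.8 s.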
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.18s
Shutdown handler de-registered
chaiml-reward-dpo-0e8a-_16483_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2967.97s
Shutdown handler de-registered
chaiml-reward-dpo-0e8a-_16483_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-reward-dpo-0e8a-_16483_v1 status is now protected due to ABTestQueueItem