developer_uid: chai_evaluation_service
submission_id: chaiml-ocean-life25051_60844_v42
model_name: richard
model_group: ChaiML/ocean-life2505130
status: inactive
timestamp: 2025-12-03T18:45:17+00:00
num_battles: 2426
num_wins: 1134
celo_rating: 1270.65
family_friendly_score: 0.5444
family_friendly_standard_error: 0.0070431333935969155
submission_type: basic
model_repo: ChaiML/ocean-life250513094323_sft
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
display_name: richard
ineligible_reason: num_battles<5000
is_internal_developer: True
language_model: ChaiML/ocean-life250513094323_sft
model_size: 24B
ranking_group: single
us_pacific_date: 2025-12-03
win_ratio: 0.4674361088211047
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['You:', '####\n', '\n', '####', '</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}</s>\n', 'user_template': 'You: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
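
For reference, the reported win_ratio is simply num_wins / num_battles = 1134 / 2426 ≈ 0.46744. The generation_params above are vLLM-style sampling options; below is a minimal sketch of how they might map onto a SamplingParams object, assuming a vLLM version whose SamplingParams still accepts best_of (the actual deployment code is not part of this log).

# Sketch only: mapping the generation_params dict onto vLLM-style SamplingParams.
from vllm import SamplingParams

generation_params = {
    'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40,
    'presence_penalty': 0.0, 'frequency_penalty': 0.0,
    'stopping_words': ['You:', '####\n', '\n', '####', '</s>'],
    'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64,
}

sampling_params = SamplingParams(
    temperature=generation_params['temperature'],
    top_p=generation_params['top_p'],
    min_p=generation_params['min_p'],
    top_k=generation_params['top_k'],
    presence_penalty=generation_params['presence_penalty'],
    frequency_penalty=generation_params['frequency_penalty'],
    stop=generation_params['stopping_words'],            # stopping_words -> stop
    n=1,
    best_of=generation_params['best_of'],                # sample 8 candidates, keep the best
    max_tokens=generation_params['max_output_tokens'],   # 64 completion tokens
)
# max_input_tokens (1024) is a prompt-length budget enforced on the
# formatter/templater side, not by SamplingParams.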
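The formatter describes how a conversation is flattened into a single prompt: user turns use user_template, bot turns use bot_template, and the prompt ends with response_template so the model continues speaking as the bot (truncate_by_message presumably drops whole messages from the start of the history when the prompt exceeds max_input_tokens). An illustrative sketch of that layout (the actual templater code is not included in this log):

# Illustrative sketch of the prompt layout implied by the formatter above.
# bot_template:      '{bot_name}: {message}</s>\n'
# user_template:     'You: {message}\n'
# response_template: '{bot_name}:'
# memory_template and prompt_template are empty for this submission.
def build_prompt(bot_name, turns):
    """turns: list of (speaker, message) pairs, speaker in {'user', 'bot'}."""
    parts = []
    for speaker, message in turns:
        if speaker == 'user':
            parts.append(f'You: {message}\n')
        else:
            parts.append(f'{bot_name}: {message}</s>\n')
    parts.append(f'{bot_name}:')  # the model completes the bot's next turn
    return ''.join(parts)

# Example:
# build_prompt('richard', [('user', 'Hi'), ('bot', 'Hello!'), ('user', 'How are you?')])
# -> 'You: Hi\nrichard: Hello!</s>\nYou: How are you?\nrichard:'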
Shutdown handler not registered because Python interpreter is not running in the main thread
Connection pool is full, discarding connection: %s. Connection pool size: %s
run pipeline %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.49s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-ocean-life25051-60844-v42
Waiting for inference service chaiml-ocean-life25051-60844-v42 to be ready
Tearing down inference service chaiml-ocean-life25051-60844-v42
%s, retrying in %s seconds...
Creating inference service chaiml-ocean-life25051-60844-v42
Waiting for inference service chaiml-ocean-life25051-60844-v42 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service chaiml-ocean-life25051-60844-v42 ready after 291.8686044216156s
Pipeline stage VLLMDeployer completed in 564.75s
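
The VLLMDeployer above tears the first inference service down and recreates it after a transient failure, logging a "retrying in N seconds" message between attempts. A generic retry-with-delay sketch of that pattern (the attempt count, the delay, and the create_and_wait helper are illustrative assumptions, not the deployer's actual code):

import time

def retry(fn, attempts=3, delay_seconds=30, logger=print):
    # Generic form of the "%s, retrying in %s seconds..." pattern seen above.
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as error:
            if attempt == attempts:
                raise
            logger(f'{error}, retrying in {delay_seconds} seconds...')
            time.sleep(delay_seconds)

# Usage sketch (create_and_wait is hypothetical): create the inference service
# and wait for readiness, retrying if the wait fails.
# retry(lambda: create_and_wait('chaiml-ocean-life25051-60844-v42'))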
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.175173282623291s
Received healthy response to inference request in 2.46797776222229s
Received healthy response to inference request in 2.510632276535034s
Received healthy response to inference request in 2.5941827297210693s
Received healthy response to inference request in 2.1912591457366943s
Received healthy response to inference request in 2.185694932937622s
Received healthy response to inference request in 2.445712089538574s
Received healthy response to inference request in 2.189444065093994s
Received healthy response to inference request in 2.5416274070739746s
Received healthy response to inference request in 2.216601610183716s
Received healthy response to inference request in 2.4276809692382812s
Received healthy response to inference request in 2.2737951278686523s
Received healthy response to inference request in 2.1655163764953613s
Received healthy response to inference request in 2.325577735900879s
Received healthy response to inference request in 2.4898855686187744s
Received healthy response to inference request in 2.4214866161346436s
Received healthy response to inference request in 2.2715249061584473s
Received healthy response to inference request in 2.6501076221466064s
Received healthy response to inference request in 2.5604379177093506s
Received healthy response to inference request in 2.1647961139678955s
Received healthy response to inference request in 2.9900600910186768s
Received healthy response to inference request in 2.74835205078125s
Received healthy response to inference request in 2.400413751602173s
Received healthy response to inference request in 2.3023297786712646s
Received healthy response to inference request in 2.2162201404571533s
Received healthy response to inference request in 2.163818120956421s
Received healthy response to inference request in 2.346752643585205s
Received healthy response to inference request in 2.4239211082458496s
Received healthy response to inference request in 2.1882638931274414s
Received healthy response to inference request in 2.2915284633636475s
30 requests
0 failed requests
5th percentile: 2.165120232105255
10th percentile: 2.174207592010498
20th percentile: 2.1892080307006836
30th percentile: 2.216487169265747
40th percentile: 2.2844351291656495
50th percentile: 2.336165189743042
60th percentile: 2.422460412979126
70th percentile: 2.452391791343689
80th percentile: 2.516831302642822
90th percentile: 2.599775218963623
95th percentile: 2.7041420578956603
99th percentile: 2.9199647593498232
mean time: 2.3780258099238076
Pipeline stage StressChecker completed in 74.77s
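
The StressChecker summary above (30 requests, 0 failures) follows directly from the per-request latencies; the reported percentiles and mean are consistent with numpy's default linear-interpolation percentile and a plain arithmetic mean. A sketch of that computation (the regex is an assumption about the log format, not the StressChecker's actual code):

import re
import numpy as np

# Recompute the latency summary from the raw "Received healthy response..." lines.
PATTERN = re.compile(r'Received healthy response to inference request in ([\d.]+)s')

def summarize(log_lines):
    latencies = [float(m.group(1)) for line in log_lines if (m := PATTERN.search(line))]
    print(f'{len(latencies)} requests')
    for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        # np.percentile's default linear interpolation reproduces the figures
        # above, e.g. ~2.1651s at the 5th percentile and ~2.3362s at the median.
        print(f'{p}th percentile: {np.percentile(latencies, p)}')
    print(f'mean time: {np.mean(latencies)}')  # ~2.378s for the 30 timings above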
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
chaiml-ocean-life25051_60844_v42 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 5879.41s
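
The OfflineFamilyFriendlyScorer fans the evaluation out over several worker threads and retries transient failures, which is why this stage dominates the wall-clock time (about 5879s, far longer than the other stages). A generic sketch of that fan-out pattern (the thread count and the score_conversation function are hypothetical; the scorer's real implementation is not part of this log):

from concurrent.futures import ThreadPoolExecutor

def evaluate_family_friendly(conversations, score_conversation, num_threads=8):
    # "Evaluating ... Family Friendly Score with N threads": score each
    # conversation in parallel and aggregate into a single mean score.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        scores = list(pool.map(score_conversation, conversations))
    mean_score = sum(scores) / len(scores)
    # For this submission the reported family_friendly_score is 0.5444
    # (standard error ~0.0070).
    return mean_score, scores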
Shutdown handler de-registered