developer_uid: chai_backend_admin
submission_id: function_jamin_2025-07-29
model_name: function_jamin_2025-07-29
model_group:
status: torndown
timestamp: 2025-07-29T23:06:37+00:00
num_battles: 5472
num_wins: 2787
celo_rating: 1290.92
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: function_jamin_2025-07-29
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-07-29
win_ratio: 0.5093201754385965
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
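The formatter templates above can be read as pieces of a single model input. The following is a minimal sketch of how they might be combined, not the platform's actual implementation: the `build_prompt` helper, the ordering of the parts (memory, prompt, conversation turns, response header), and all example values are assumptions made for illustration. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter entry in the metadata above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(fmt, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the rendered templates into one string.

    The order (memory, prompt, turns, response header) is an assumption.
    """
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(fmt["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=user_name, message=message))
    # The response header ends with "<bot_name>:" so the model continues the bot's line.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up example values, not taken from the log.
text = build_prompt(
    formatter,
    memory="Ava is a friendly guide.",
    prompt="A chat in a library.",
    turns=[("user", "Hi!"), ("bot", "Hello.")],
    bot_name="Ava",
    user_name="Sam",
)
print(text)
```

Because `stopping_words` is `['\n']` in `generation_params`, a completion under this layout would presumably end at the first newline, giving a single-line reply.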
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Failed to get response for submission chaiml-grpo-nis-12bsftg_71959_v4: HTTPConnectionPool(host='chaiml-grpo-nis-12bsftg-71959-v4-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.663339138031006s
Received healthy response to inference request in 8.717352390289307s
Received healthy response to inference request in 4.2051966190338135s
Received healthy response to inference request in 3.353924512863159s
5 requests
1 failed requests
5th percentile: 3.52417893409729
10th percentile: 3.694433355331421
20th percentile: 4.034942197799682
30th percentile: 4.296825122833252
40th percentile: 4.4800821304321286
50th percentile: 4.663339138031006
60th percentile: 6.284944438934326
70th percentile: 7.906549739837645
80th percentile: 10.997952699661257
90th percentile: 15.559153318405151
95th percentile: 17.839753627777096
99th percentile: 19.664233875274657
mean time: 8.212033319473267
%s, retrying in %s seconds...
Received healthy response to inference request in 2.1366660594940186s
Received healthy response to inference request in 2.0951662063598633s
Received healthy response to inference request in 2.3543543815612793s
Received healthy response to inference request in 4.159519910812378s
Received healthy response to inference request in 3.0626816749572754s
5 requests
0 failed requests
5th percentile: 2.1034661769866942
10th percentile: 2.111766147613525
20th percentile: 2.1283660888671876
30th percentile: 2.1802037239074705
40th percentile: 2.267279052734375
50th percentile: 2.3543543815612793
60th percentile: 2.6376852989196777
70th percentile: 2.921016216278076
80th percentile: 3.2820493221282963
90th percentile: 3.720784616470337
95th percentile: 3.940152263641357
99th percentile: 4.115646381378173
mean time: 2.761677646636963
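The percentile and mean figures in the block above are consistent with linear interpolation between the sorted samples. The sketch below recomputes a few of them from the five logged latencies of the second run. The `percentile` helper is a hypothetical stand-in, and the log does not say which routine the StressChecker stage actually uses.

```python
import math


def percentile(samples, q):
    """q-th percentile (0-100), linearly interpolating between sorted samples."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


# Latencies (seconds) of the five healthy responses logged for the second run.
latencies = [
    2.1366660594940186,
    2.0951662063598633,
    2.3543543815612793,
    4.159519910812378,
    3.0626816749572754,
]

summary = {q: percentile(latencies, q) for q in (5, 50, 99)}
mean_time = sum(latencies) / len(latencies)

for q, value in summary.items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {mean_time}")
```

With only five samples, the high percentiles are dominated by the single slowest request, so the 99th percentile here is close to the maximum latency.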
Pipeline stage StressChecker completed in 57.24s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.72s
Shutdown handler de-registered
function_jamin_2025-07-29 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Received signal 15, running shutdown handler
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Received signal 15, running shutdown handler
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError('None: None')
Shutdown handler de-registered
function_jamin_2025-07-29 status is now inactive due to auto deactivation removed underperforming models
function_jamin_2025-07-29 status is now torndown due to DeploymentManager action