developer_uid: rirv938
submission_id: function_mutol_2026-02-20
model_name: ab_test
model_group:
status: torndown
timestamp: 2026-02-23T21:31:40+00:00
num_battles: 10847
num_wins: 6242
celo_rating: 1352.96
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: ab_test
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-02-20
win_ratio: 0.5754586521618881
generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['</s>', '\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.415735960006714s
Received healthy response to inference request in 2.371445655822754s
Received healthy response to inference request in 4.031320571899414s
Received healthy response to inference request in 3.871055841445923s
Received healthy response to inference request in 5.026165008544922s
Received healthy response to inference request in 3.701671600341797s
Received healthy response to inference request in 3.397217273712158s
Received healthy response to inference request in 3.6456170082092285s
Received healthy response to inference request in 6.1965906620025635s
Received healthy response to inference request in 3.8880763053894043s
10 requests
0 failed requests
5th percentile: 2.391376292705536
10th percentile: 2.411306929588318
20th percentile: 3.2009210109710695
30th percentile: 3.571097087860107
40th percentile: 3.6792497634887695
50th percentile: 3.78636372089386
60th percentile: 3.8778640270233153
70th percentile: 3.9310495853424072
80th percentile: 4.230289459228516
90th percentile: 5.1432075738906855
95th percentile: 5.669899117946623
99th percentile: 6.091252353191376
mean time: 3.8544895887374877
Pipeline stage StressChecker completed in 39.96s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
function_mutol_2026-02-20 status is now deployed due to DeploymentManager action
function_mutol_2026-02-20 status is now inactive due to auto deactivation removed underperforming models
function_mutol_2026-02-20 status is now torndown due to DeploymentManager action