developer_uid: rirv938
submission_id: function_pokur_2026-02-20
model_name: ab_test
model_group:
status: inactive
timestamp: 2026-02-20T20:37:37+00:00
num_battles: 10595
num_wins: 5995
celo_rating: 1341.64
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: ab_test
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-02-20
win_ratio: 0.5658329400660689
generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['</s>', '\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
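The win_ratio field above appears to be num_wins divided by num_battles. A minimal sketch of that arithmetic, assuming no smoothing or weighting is applied (the source does not state how the field is derived):

```python
# Values copied from the metadata fields above.
num_battles = 10595
num_wins = 5995

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio: {win_ratio}")
```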
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.874955177307129s
Received healthy response to inference request in 1.2269892692565918s
Received healthy response to inference request in 1.9582769870758057s
Received healthy response to inference request in 1.289156198501587s
Received healthy response to inference request in 1.5072038173675537s
Received healthy response to inference request in 1.5596778392791748s
Received healthy response to inference request in 1.308997631072998s
Received healthy response to inference request in 1.830930471420288s
Received healthy response to inference request in 1.7742066383361816s
Received healthy response to inference request in 1.3321211338043213s
10 requests
0 failed requests
5th percentile: 1.2549643874168397
10th percentile: 1.2829395055770874
20th percentile: 1.3050293445587158
30th percentile: 1.3251840829849244
40th percentile: 1.4371707439422607
50th percentile: 1.5334408283233643
60th percentile: 1.6454893589019775
70th percentile: 1.7912237882614135
80th percentile: 1.8397354125976562
90th percentile: 1.8832873582839966
95th percentile: 1.920782172679901
99th percentile: 1.9507780241966248
mean time: 1.566251516342163
Pipeline stage StressChecker completed in 16.94s
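The percentile and mean figures reported by the StressChecker stage can be recomputed from the ten logged response times. The log does not say how the stage computes them; the sketch below assumes linear interpolation between closest ranks, which agrees with the logged figures. The helper name `percentile` is hypothetical and not taken from the pipeline.

```python
import math

# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    1.874955177307129,
    1.2269892692565918,
    1.9582769870758057,
    1.289156198501587,
    1.5072038173675537,
    1.5596778392791748,
    1.308997631072998,
    1.830930471420288,
    1.7742066383361816,
    1.3321211338043213,
]


def percentile(values, q):
    """Return the q-th percentile (0 to 100) of values.

    Assumption: linear interpolation between the two closest ranks.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are interpolated between the two slowest requests, so they say little about tail latency under sustained load.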
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_pokur_2026-02-20 status is now deployed due to DeploymentManager action
function_pokur_2026-02-20 status is now inactive due to auto deactivation (removed underperforming models)