function_doruk_2025-12-19

developer_uid: chai_backend_admin

submission_id: function_doruk_2025-12-19

model_name: function_doruk_2025-12-19

model_group:

status: torndown

timestamp: 2025-12-22T23:01:27+00:00

num_battles: 7089

num_wins: 3745

celo_rating: 1312.69

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: function_doruk_2025-12-19

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-22

win_ratio: 0.5282832557483425
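The win_ratio above appears to be num_wins divided by num_battles. The following is a minimal sketch of that arithmetic, assuming no smoothing or weighting is applied. The variable names mirror the fields listed here, but the code is illustrative and not taken from the leaderboard's implementation.

```python
# Values copied from the metadata fields above.
num_battles = 7089
num_wins = 3745

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio = {win_ratio:.6f}")
```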

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
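To illustrate how the formatter templates might combine into a single model input, here is a hedged sketch. The assembly order (memory, prompt, chat turns, response header) is an assumption inferred from the template names. `build_prompt` and the sample conversation are hypothetical, and truncation (`max_input_tokens`, `truncate_by_message`) is not modeled.

```python
# Templates copied from the formatter field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (is_bot, speaker_name, message) tuples.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for is_bot, name, message in turns:
        if is_bot:
            parts.append(FORMATTER["bot_template"].format(bot_name=name, message=message))
        else:
            parts.append(FORMATTER["user_template"].format(user_name=name, message=message))
    # The response header ends with "<bot_name>:" so the model continues as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up conversation, for illustration only.
text = build_prompt(
    memory="Aria is a friendly guide.",
    prompt="A chat in a library.",
    turns=[(False, "Sam", "Hi"), (True, "Aria", "Hello!")],
    bot_name="Aria",
)
print(text)
```

Because the stopping_words list in generation_params contains only a newline, a completion generated from such a prompt would end at the first line break, which yields a single chat message per request.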

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 17.03942060470581s
Received healthy response to inference request in 8.876969575881958s
Received healthy response to inference request in 16.795081853866577s
Received healthy response to inference request in 16.565879821777344s
Received healthy response to inference request in 11.361066341400146s
Received healthy response to inference request in 18.17388415336609s
Received healthy response to inference request in 15.213181257247925s
Received healthy response to inference request in 19.246954679489136s
Received healthy response to inference request in 4.360692024230957s
Received healthy response to inference request in 2.9930436611175537s
10 requests
0 failed requests
5th percentile: 3.608485424518585
10th percentile: 4.2239271879196165
20th percentile: 7.973714065551758
30th percentile: 10.61583731174469
40th percentile: 13.672335290908814
50th percentile: 15.889530539512634
60th percentile: 16.657560634613038
70th percentile: 16.868383479118346
80th percentile: 17.266313314437866
90th percentile: 18.28119120597839
95th percentile: 18.764072942733762
99th percentile: 19.15037833213806
mean time: 13.06261739730835
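The percentile and mean figures for this first batch are consistent with linear interpolation between closest ranks over the ten response times. The sketch below recomputes them in plain Python. The `percentile` helper is illustrative, and it is an assumption that StressChecker computes its summary this way.

```python
import math


def percentile(samples, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Response times in seconds, copied from the ten log lines of the first batch.
latencies = [
    17.03942060470581,
    8.876969575881958,
    16.795081853866577,
    16.565879821777344,
    11.361066341400146,
    18.17388415336609,
    15.213181257247925,
    19.246954679489136,
    4.360692024230957,
    2.9930436611175537,
]

summary = {q: percentile(latencies, q) for q in (5, 10, 50, 90, 99)}
mean_time = sum(latencies) / len(latencies)

for q, value in summary.items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles sit very close to the single slowest request, so they are best read as rough indicators rather than stable tail estimates.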
%s, retrying in %s seconds...
Received healthy response to inference request in 3.1213321685791016s
Received healthy response to inference request in 10.749025821685791s
Received healthy response to inference request in 8.893458127975464s
Received healthy response to inference request in 6.787006616592407s
Received healthy response to inference request in 14.628560781478882s
Received healthy response to inference request in 15.490115880966187s
Received healthy response to inference request in 8.491943120956421s
Received healthy response to inference request in 2.8732924461364746s
Received healthy response to inference request in 1.451608419418335s
Received healthy response to inference request in 3.9448494911193848s
10 requests
0 failed requests
5th percentile: 2.091366231441498
10th percentile: 2.7311240434646606
20th percentile: 3.071724224090576
30th percentile: 3.6977942943572994
40th percentile: 5.650143766403199
50th percentile: 7.639474868774414
60th percentile: 8.652549123764038
70th percentile: 9.450128436088562
80th percentile: 11.52493281364441
90th percentile: 14.714716291427612
95th percentile: 15.102416086196898
99th percentile: 15.41257592201233
mean time: 7.643119287490845
%s, retrying in %s seconds...
Received healthy response to inference request in 4.778011798858643s
Received healthy response to inference request in 18.971598386764526s
Received healthy response to inference request in 9.057569742202759s
Received healthy response to inference request in 1.5787417888641357s
Received healthy response to inference request in 3.7083065509796143s
Received healthy response to inference request in 7.507156848907471s
Received healthy response to inference request in 5.417274475097656s
Received healthy response to inference request in 6.142297744750977s
Received healthy response to inference request in 2.816720485687256s
Received healthy response to inference request in 8.017261981964111s
10 requests
0 failed requests
5th percentile: 2.13583220243454
10th percentile: 2.6929226160049438
20th percentile: 3.5299893379211427
30th percentile: 4.457100224494933
40th percentile: 5.161569404602051
50th percentile: 5.779786109924316
60th percentile: 6.688241386413574
70th percentile: 7.660188388824463
80th percentile: 8.22532353401184
90th percentile: 10.048972606658932
95th percentile: 14.51028549671172
99th percentile: 18.079335808753967
mean time: 6.799493980407715
Pipeline stage StressChecker completed in 280.56s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_doruk_2025-12-19 status is now deployed due to DeploymentManager action
function_doruk_2025-12-19 status is now inactive due to auto deactivation removed underperforming models
function_doruk_2025-12-19 status is now torndown due to DeploymentManager action