developer_uid: chai_backend_admin
submission_id: function_nupot_2026-01-26
model_name: function_nupot_2026-01-26
model_group:
status: inactive
timestamp: 2026-01-26T03:26:52+00:00
num_battles: 10940
num_wins: 3727
celo_rating: 1269.87
family_friendly_score: 0.6416
family_friendly_standard_error: 0.006781584475622198
submission_type: function
display_name: function_nupot_2026-01-26
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-01-25
win_ratio: 0.3406764168190128
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
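A minimal sketch of how the formatter templates above might be combined into one model input. The assembly order (memory, prompt, chat turns, response header) and the names `build_prompt`, `turns`, and the sample values are assumptions for illustration, not the platform's confirmed behaviour. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Assemble a prompt string (hypothetical helper; ordering is assumed)."""
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is a (speaker, message) pair, where speaker is "bot" or "user".
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are made up purely to exercise the templates.
example = build_prompt(
    formatter,
    bot_name="Bot",
    user_name="User",
    memory="mem",
    prompt="scene",
    turns=[("user", "hi"), ("bot", "hello")],
)
print(example)
```

Because `stopping_words` is `['\n']`, generation would end at the first newline, so each reply is a single line continuing the final `{bot_name}:` header.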
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.0208299160003662s
Received healthy response to inference request in 1.029266357421875s
Received healthy response to inference request in 1.0557324886322021s
Received healthy response to inference request in 1.0516324043273926s
Received healthy response to inference request in 1.5264461040496826s
Received healthy response to inference request in 0.8427448272705078s
Received healthy response to inference request in 1.0091307163238525s
Received healthy response to inference request in 1.7835893630981445s
Received healthy response to inference request in 0.9863617420196533s
read tcp 127.0.0.1:50180->127.0.0.1:8080: read: connection reset by peer
Received unhealthy response to inference request!
10 requests
1 failed requests
5th percentile: 0.5938144326210022
10th percentile: 0.7974847555160522
20th percentile: 0.9576383590698242
30th percentile: 1.0023000240325928
40th percentile: 1.0161502361297607
50th percentile: 1.0250481367111206
60th percentile: 1.038212776184082
70th percentile: 1.0528624296188354
80th percentile: 1.1498752117156983
90th percentile: 1.5521604299545286
95th percentile: 1.6678748965263364
99th percentile: 1.760446469783783
mean time: 1.069587802886963
%s, retrying in %s seconds...
Received healthy response to inference request in 0.9795999526977539s
Received healthy response to inference request in 1.3694753646850586s
Received healthy response to inference request in 1.2507965564727783s
Received healthy response to inference request in 1.5108115673065186s
Received healthy response to inference request in 0.8135907649993896s
Received healthy response to inference request in 1.0566253662109375s
Received healthy response to inference request in 1.1322004795074463s
Received healthy response to inference request in 1.1017446517944336s
Received healthy response to inference request in 0.9895658493041992s
Received healthy response to inference request in 3.9855659008026123s
10 requests
0 failed requests
5th percentile: 0.8882948994636536
10th percentile: 0.9629990339279175
20th percentile: 0.9875726699829102
30th percentile: 1.036507511138916
40th percentile: 1.0836969375610352
50th percentile: 1.11697256565094
60th percentile: 1.179638910293579
70th percentile: 1.2864001989364624
80th percentile: 1.3977426052093507
90th percentile: 1.758287000656127
95th percentile: 2.871926450729368
99th percentile: 3.762838010787964
mean time: 1.4189976453781128
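The summary above is consistent with linear-interpolation percentiles over the ten logged latencies of this second run. The sketch below reproduces it under that assumption; `percentile` and `summarize` are illustrative names, not the StressChecker's actual code.

```python
import math

# Latencies (seconds) copied from the ten healthy responses of the second run.
latencies = [
    0.9795999526977539, 1.3694753646850586, 1.2507965564727783,
    1.5108115673065186, 0.8135907649993896, 1.0566253662109375,
    1.1322004795074463, 1.1017446517944336, 0.9895658493041992,
    3.9855659008026123,
]


def percentile(sorted_values, q):
    """Percentile q (0-100) using linear interpolation between closest ranks."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarize(samples):
    """Return the percentile table and mean in the same shape as the log."""
    ordered = sorted(samples)
    summary = {
        f"{q}th percentile": percentile(ordered, q)
        for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)
    }
    summary["mean time"] = sum(ordered) / len(ordered)
    return summary


summary = summarize(latencies)
for label, value in summary.items():
    print(f"{label}: {value}")
```

The single 3.99 s response accounts for the gap between the median (about 1.12 s) and the mean (about 1.42 s), and for the steep rise at the 95th and 99th percentiles.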
Pipeline stage StressChecker completed in 27.66s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_nupot_2026-01-26 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 1044.48s
Shutdown handler de-registered
function_nupot_2026-01-26 status is now inactive due to auto deactivation removed underperforming models