function_pugem_2025-07-09

developer_uid: chai_backend_admin

submission_id: function_pugem_2025-07-09

model_name: function_pugem_2025-07-09

model_group:

status: torndown

timestamp: 2025-07-09T23:41:30+00:00

num_battles: 6685

num_wins: 3381

celo_rating: 1290.11

family_friendly_score: 0.5378000000000001

family_friendly_standard_error: 0.007050832007642786

submission_type: function

display_name: function_pugem_2025-07-09

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-07-09

win_ratio: 0.5057591623036649

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
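
The formatter templates above can be read as pieces of one prompt string. The following is a minimal sketch of how they might be combined, assuming the order memory, prompt, chat turns, response header. That order, the `build_prompt` helper, and the sample persona text are all hypothetical and are not taken from the submission itself.

```python
# Templates copied from the formatter field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    turns is a list of (speaker, message) pairs, speaker being "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with the bot name so the model continues that line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Illustrative values only; none of these strings come from the submission.
example = build_prompt(
    memory="Ava is a friendly guide.",
    prompt="A chat in a museum lobby.",
    turns=[("bot", "Welcome!"), ("user", "Hi, where do I start?")],
    bot_name="Ava",
    user_name="Sam",
)
print(example)
```

Because `stopping_words` is `['\n']` and `response_template` ends with the bot name, a layout like this would yield a single-line reply of at most 64 output tokens.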

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.551666021347046s
Received healthy response to inference request in 3.3207054138183594s
Received healthy response to inference request in 5.239054441452026s
Received healthy response to inference request in 4.581224679946899s
Received healthy response to inference request in 2.8315110206604004s
5 requests
0 failed requests
5th percentile: 2.929349899291992
10th percentile: 3.027188777923584
20th percentile: 3.2228665351867676
30th percentile: 3.572809267044067
40th percentile: 4.077016973495484
50th percentile: 4.581224679946899
60th percentile: 4.84435658454895
70th percentile: 5.107488489151001
80th percentile: 5.301576757431031
90th percentile: 5.426621389389038
95th percentile: 5.489143705368042
99th percentile: 5.539161558151245
mean time: 4.304832315444946
%s, retrying in %s seconds...
Received healthy response to inference request in 3.5416526794433594s
Received healthy response to inference request in 2.9802751541137695s
Received healthy response to inference request in 3.420984983444214s
Received healthy response to inference request in 2.575442314147949s
Received healthy response to inference request in 2.21132230758667s
5 requests
0 failed requests
5th percentile: 2.284146308898926
10th percentile: 2.3569703102111816
20th percentile: 2.5026183128356934
30th percentile: 2.6564088821411134
40th percentile: 2.8183420181274412
50th percentile: 2.9802751541137695
60th percentile: 3.1565590858459474
70th percentile: 3.3328430175781247
80th percentile: 3.445118522644043
90th percentile: 3.493385601043701
95th percentile: 3.51751914024353
99th percentile: 3.5368259716033936
mean time: 2.9459354877471924
Pipeline stage StressChecker completed in 38.48s
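
The percentile lines logged by StressChecker are consistent with linear interpolation between the sorted latencies. The sketch below recomputes them from the first run's five response times. The `percentile` helper is hypothetical and is not the checker's own code; it only illustrates the arithmetic.

```python
def percentile(samples, q):
    """Linearly interpolated percentile, for q in the range 0..100."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies in seconds, copied from the first StressChecker run in the log.
first_run = [
    5.551666021347046,
    3.3207054138183594,
    5.239054441452026,
    4.581224679946899,
    2.8315110206604004,
]

for q in (5, 50, 90):
    print(f"{q}th percentile: {percentile(first_run, q)}")
print(f"mean time: {sum(first_run) / len(first_run)}")
```

With only five samples per run, the high percentiles are interpolated between the two slowest requests, so they say little about tail latency under real load.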
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.82s
Shutdown handler de-registered
function_pugem_2025-07-09 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3563.70s
Shutdown handler de-registered
function_pugem_2025-07-09 status is now inactive due to auto deactivation removed underperforming models
function_pugem_2025-07-09 status is now torndown due to DeploymentManager action
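
The summary fields at the top of this record can be cross-checked with a few lines of arithmetic. In the sketch below, `win_ratio` is recomputed from `num_wins` and `num_battles`. The standard error is recomputed under the assumption of a binomial proportion with 5000 evaluated samples. That sample count is an inference from the reported numbers and is not stated anywhere in the record.

```python
import math

# Values copied from the record's metadata.
num_battles = 6685
num_wins = 3381
family_friendly_score = 0.5378

# Win ratio is simply wins divided by battles.
win_ratio = num_wins / num_battles

# Assumption: the score is a proportion over n samples, so its standard
# error is sqrt(p * (1 - p) / n). n = 5000 is inferred, not documented.
assumed_samples = 5000
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / assumed_samples
)

print(f"win_ratio: {win_ratio}")
print(f"standard_error (assumed n={assumed_samples}): {standard_error}")
```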