function_mujim_2025-05-13

developer_uid: chai_backend_admin

submission_id: function_mujim_2025-05-13

model_name: function_mujim_2025-05-13

model_group:

status: torndown

timestamp: 2025-05-13T21:28:25+00:00

num_battles: 7502

num_wins: 3918

celo_rating: 1290.45

family_friendly_score: 0.5484

family_friendly_standard_error: 0.007037861038696345

submission_type: function

display_name: function_mujim_2025-05-13

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-05-13

win_ratio: 0.5222607304718742

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
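
For illustration, the formatter templates above could be combined into a single model input roughly as follows. This is a minimal sketch, not the platform's actual code: the `build_prompt` helper, its argument names, and the assumed ordering (memory, then prompt, then chat turns, then the response stub) are hypothetical, and truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble one model input (hypothetical helper).

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". The section ordering is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response stub ends with "<bot_name>:" so the model continues as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="M",
    prompt="P",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="Bot",
    user_name="Ann",
)
```

Because `stopping_words` is `['\n']` in the generation parameters, a completion would end at the first newline, which fits the one-message-per-line layout of the chat templates.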

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.899233341217041s
Received healthy response to inference request in 3.513913869857788s
Received healthy response to inference request in 4.1686553955078125s
Received healthy response to inference request in 2.503638744354248s
Received healthy response to inference request in 3.2388968467712402s
5 requests
0 failed requests
5th percentile: 2.6506903648376463
10th percentile: 2.797741985321045
20th percentile: 3.091845226287842
30th percentile: 3.29390025138855
40th percentile: 3.403907060623169
50th percentile: 3.513913869857788
60th percentile: 3.6680416584014894
70th percentile: 3.8221694469451903
80th percentile: 3.9531177520751952
90th percentile: 4.060886573791504
95th percentile: 4.114770984649658
99th percentile: 4.157878513336182
mean time: 3.464867639541626
%s, retrying in %s seconds...
Received healthy response to inference request in 3.5126819610595703s
Received healthy response to inference request in 3.8020989894866943s
Received healthy response to inference request in 3.7829647064208984s
Received healthy response to inference request in 2.839334726333618s
Received healthy response to inference request in 2.4896082878112793s
5 requests
0 failed requests
5th percentile: 2.5595535755157472
10th percentile: 2.6294988632202148
20th percentile: 2.76938943862915
30th percentile: 2.9740041732788085
40th percentile: 3.2433430671691896
50th percentile: 3.5126819610595703
60th percentile: 3.6207950592041014
70th percentile: 3.728908157348633
80th percentile: 3.7867915630340576
90th percentile: 3.794445276260376
95th percentile: 3.798272132873535
99th percentile: 3.8013336181640627
mean time: 3.285337734222412
%s, retrying in %s seconds...
Received healthy response to inference request in 4.123152017593384s
Received healthy response to inference request in 3.681600570678711s
Received healthy response to inference request in 2.7115819454193115s
Received healthy response to inference request in 3.4875001907348633s
Received healthy response to inference request in 3.2981293201446533s
5 requests
0 failed requests
5th percentile: 2.82889142036438
10th percentile: 2.946200895309448
20th percentile: 3.180819845199585
30th percentile: 3.3360034942626955
40th percentile: 3.4117518424987794
50th percentile: 3.4875001907348633
60th percentile: 3.5651403427124024
70th percentile: 3.6427804946899416
80th percentile: 3.7699108600616458
90th percentile: 3.9465314388275146
95th percentile: 4.034841728210449
99th percentile: 4.105489959716797
mean time: 3.4603928089141847
Pipeline stage StressChecker completed in 54.23s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.81s
Shutdown handler de-registered
function_mujim_2025-05-13 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3172.46s
Shutdown handler de-registered
function_mujim_2025-05-13 status is now inactive due to auto deactivation removed underperforming models
function_mujim_2025-05-13 status is now torndown due to DeploymentManager action
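
The percentile and mean figures in the StressChecker output above can be reproduced from the five logged latencies. The sketch below assumes linear interpolation between order statistics (the default method of NumPy's `percentile`); the log does not state which method the checker uses, and the `percentile` helper is hypothetical. Under that assumption the values match the first batch of five requests.

```python
import math


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumed method: position = (n - 1) * q / 100 in the sorted samples,
    interpolating between the two neighbouring values.
    """
    ordered = sorted(samples)
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies (seconds) from the first batch of five requests in the log.
latencies = [
    3.899233341217041,
    3.513913869857788,
    4.1686553955078125,
    2.503638744354248,
    3.2388968467712402,
]

p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
mean_time = sum(latencies) / len(latencies)
```

With only five samples per batch, the tail percentiles (95th, 99th) are interpolations between the two slowest requests and not independent measurements.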