function_lotel_2025-12-18

developer_uid: chai_evaluation_service

submission_id: function_lotel_2025-12-18

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-21T17:01:10+00:00

num_battles: 10213

num_wins: 5016

celo_rating: 1287.02

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-21

win_ratio: 0.4911387447370998
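
The win_ratio value above appears to be num_wins divided by num_battles. A minimal check of that arithmetic, using only the two counts recorded in this submission (the variable names simply mirror the field names):

```python
# Counts copied from the num_battles and num_wins fields of this record.
num_battles = 10213
num_wins = 5016

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio = {win_ratio:.4f}")  # 0.4911
```

A ratio slightly below 0.5 means the submission lost marginally more battles than it won.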

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
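
To illustrate how the formatter templates above might combine into a single model input, here is a rough sketch. The assembly order (memory, prompt, conversation turns, response header), the `build_prompt` helper, and the example names and messages are all assumptions for illustration; the record only lists the templates themselves. Truncation (`truncate_by_message`, `max_input_tokens`) is not modelled.

```python
# Templates copied verbatim from the formatter field of this record.
formatter = {
    'memory_template': '### Instruction:\n{memory}\n',
    'prompt_template': '### Input:\n{prompt}\n',
    'bot_template': '{bot_name}: {message}\n',
    'user_template': '{user_name}: {message}\n',
    'response_template': '### Response:\n{bot_name}:',
}


def build_prompt(fmt, bot_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (role, speaker_name, message) tuples,
    where role is either 'user' or 'bot'.
    """
    parts = [
        fmt['memory_template'].format(memory=memory),
        fmt['prompt_template'].format(prompt=prompt),
    ]
    for role, name, message in turns:
        if role == 'bot':
            parts.append(fmt['bot_template'].format(bot_name=name, message=message))
        else:
            parts.append(fmt['user_template'].format(user_name=name, message=message))
    # The response header ends with "<bot_name>:" so the model continues the line.
    parts.append(fmt['response_template'].format(bot_name=bot_name))
    return ''.join(parts)


example = build_prompt(
    formatter,
    bot_name='richard',
    memory='Richard is a friendly assistant.',
    prompt='A casual chat.',
    turns=[('user', 'Ann', 'hi'), ('bot', 'richard', 'hello')],
)
print(example)
```

Because the response template ends without a newline and generation_params lists '\n' as a stopping word, a layout like this would yield a single-line reply.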


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.203855514526367s
Received healthy response to inference request in 3.7339301109313965s
Received healthy response to inference request in 1.9540462493896484s
Received healthy response to inference request in 4.113287448883057s
Received healthy response to inference request in 3.019322395324707s
Received healthy response to inference request in 3.379706382751465s
Received healthy response to inference request in 3.573004961013794s
Received healthy response to inference request in 2.435417652130127s
Received healthy response to inference request in 2.7552695274353027s
Received healthy response to inference request in 2.913762331008911s
10 requests
0 failed requests
5th percentile: 2.0664604187011717
10th percentile: 2.1788745880126954
20th percentile: 2.389105224609375
30th percentile: 2.65931396484375
40th percentile: 2.8503652095794676
50th percentile: 2.966542363166809
60th percentile: 3.16347599029541
70th percentile: 3.4376959562301637
80th percentile: 3.6051899909973146
90th percentile: 3.7718658447265625
95th percentile: 3.942576646804809
99th percentile: 4.079145288467407
mean time: 3.0081602573394775
Pipeline stage StressChecker completed in 31.54s
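
The StressChecker percentiles and mean above are consistent with linear interpolation between the sorted latencies. The sketch below reproduces them from the ten response times in the log; the `percentile` helper is a hypothetical stand-in written for this check, not the pipeline's own code.

```python
import math

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    2.203855514526367, 3.7339301109313965, 1.9540462493896484,
    4.113287448883057, 3.019322395324707, 3.379706382751465,
    3.573004961013794, 2.435417652130127, 2.7552695274353027,
    2.913762331008911,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between neighbours."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q):.4f}")
print(f"mean time: {mean_time:.4f}")
```

With only ten samples, the tail percentiles (95th, 99th) are interpolated between the two slowest requests, so they are rough indications rather than stable estimates.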
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
Shutdown handler de-registered
function_lotel_2025-12-18 status is now deployed due to DeploymentManager action
function_lotel_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_lotel_2025-12-18 status is now torndown due to DeploymentManager action