function_gulam_2025-12-18

developer_uid: chai_evaluation_service

submission_id: function_gulam_2025-12-18

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-21T14:01:07+00:00

num_battles: 8435

num_wins: 4188

celo_rating: 1290.94

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-21

win_ratio: 0.4965026674570243
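
The win_ratio field appears to be num_wins divided by num_battles. The short check below reproduces it from the two counts recorded above; the function name `win_ratio` is a hypothetical helper for illustration, not part of the evaluation service.

```python
# Counts copied from the num_wins and num_battles fields of this record.
NUM_WINS = 4188
NUM_BATTLES = 8435


def win_ratio(wins: int, battles: int) -> float:
    """Return the fraction of battles won (hypothetical helper)."""
    if battles <= 0:
        raise ValueError("battles must be positive")
    return wins / battles


if __name__ == "__main__":
    print(win_ratio(NUM_WINS, NUM_BATTLES))
```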

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
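
As a rough illustration of how the formatter templates above could combine into a single prompt, here is a minimal sketch. The assembly order (memory, then prompt, then chat turns, then the response header) is an assumption, and `build_prompt`, the `turns` structure, and the sample strings are hypothetical. It does not model `truncate_by_message` or the `max_input_tokens` limit.

```python
# Templates copied from the formatter field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble one prompt string (assumed order; hypothetical helper).

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("You are Richard.", "A chat.", [("user", "Hi")], "richard", "Sam"))
```

Since the generation parameters list '\n' as a stopping word, a prompt ending in "richard:" would yield a single-line reply under this reading.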


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2740299701690674s
Received healthy response to inference request in 3.1622490882873535s
Received healthy response to inference request in 2.310783863067627s
Received healthy response to inference request in 3.1514315605163574s
Received healthy response to inference request in 1.9948272705078125s
Received healthy response to inference request in 2.0614473819732666s
Received healthy response to inference request in 4.951943397521973s
Received healthy response to inference request in 4.244047164916992s
Received healthy response to inference request in 3.147312641143799s
Received healthy response to inference request in 2.0051417350769043s
10 requests
0 failed requests
5th percentile: 1.999468779563904
10th percentile: 2.0041102886199953
20th percentile: 2.0501862525939942
30th percentile: 2.210255193710327
40th percentile: 2.296082305908203
50th percentile: 2.729048252105713
60th percentile: 3.1489602088928224
70th percentile: 3.154676818847656
80th percentile: 3.3786087036132812
90th percentile: 4.31483678817749
95th percentile: 4.633390092849731
99th percentile: 4.888232736587525
mean time: 2.9303214073181154
Pipeline stage StressChecker completed in 30.86s
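
The StressChecker summary above is consistent with percentiles computed by linear interpolation between sorted samples. The sketch below recomputes a few of the reported figures from the ten logged latencies; that the service uses this exact method is an assumption, and `percentile` is a hypothetical helper.

```python
import math
import statistics

# Latencies in seconds, copied from the ten "healthy response" log lines.
LATENCIES = [
    2.2740299701690674, 3.1622490882873535, 2.310783863067627,
    3.1514315605163574, 1.9948272705078125, 2.0614473819732666,
    4.951943397521973, 4.244047164916992, 3.147312641143799,
    2.0051417350769043,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


if __name__ == "__main__":
    for q in (5, 50, 90, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {statistics.mean(LATENCIES)}")
```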
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_gulam_2025-12-18 status is now deployed due to DeploymentManager action
function_gulam_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_gulam_2025-12-18 status is now torndown due to DeploymentManager action