function_hulir_2025-06-07

developer_uid: rirv938

submission_id: function_hulir_2025-06-07

model_name: dpo_data_collection

model_group:

status: torndown

timestamp: 2025-06-07T18:29:28+00:00

num_battles: 9922

num_wins: 5000

celo_rating: 1283.93

family_friendly_score: 0.5302

family_friendly_standard_error: 0.007058157833315999

submission_type: function

display_name: dpo_data_collection

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-06-07

win_ratio: 0.5039306591413022

generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['\n', '</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
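
The formatter templates above can be illustrated with a minimal sketch of how a prompt might be assembled from them. The assembly order (memory, then prompt, then conversation turns, then the response prefix), the `build_prompt` helper, and the sample names are assumptions made for illustration; the record does not show the actual assembly code. Truncation to `max_input_tokens` is not modeled here.

```python
# Templates copied from the formatter field of this submission.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's next line open for generation.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Sample names and messages are invented for the example.
    print(build_prompt("Ava", "Sam", "A calm guide.", "Ava greets Sam.",
                       [("user", "Hi"), ("bot", "Hello")]))
```

Because the response template ends with the bot's name and a colon, the `'\n'` entry in `stopping_words` would plausibly end generation after a single line of reply.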


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.7017605304718018s
Received healthy response to inference request in 4.288498640060425s
Received healthy response to inference request in 5.268012285232544s
Received healthy response to inference request in 4.432088375091553s
Received healthy response to inference request in 3.3286876678466797s
5 requests
0 failed requests
5th percentile: 3.403302240371704
10th percentile: 3.4779168128967286
20th percentile: 3.6271459579467775
30th percentile: 3.8191081523895263
40th percentile: 4.053803396224976
50th percentile: 4.288498640060425
60th percentile: 4.345934534072876
70th percentile: 4.403370428085327
80th percentile: 4.599273157119751
90th percentile: 4.933642721176147
95th percentile: 5.1008275032043455
99th percentile: 5.234575328826904
mean time: 4.203809499740601
%s, retrying in %s seconds...
Received healthy response to inference request in 3.2783520221710205s
Received healthy response to inference request in 2.865673780441284s
Received healthy response to inference request in 3.7700133323669434s
Received healthy response to inference request in 7.523570775985718s
Received healthy response to inference request in 3.361485004425049s
5 requests
0 failed requests
5th percentile: 2.9482094287872314
10th percentile: 3.0307450771331785
20th percentile: 3.1958163738250733
30th percentile: 3.294978618621826
40th percentile: 3.3282318115234375
50th percentile: 3.361485004425049
60th percentile: 3.524896335601807
70th percentile: 3.6883076667785644
80th percentile: 4.520724821090699
90th percentile: 6.022147798538208
95th percentile: 6.772859287261962
99th percentile: 7.3734284782409665
mean time: 4.159818983078003
Pipeline stage StressChecker completed in 43.85s
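
The percentile and mean figures reported by the StressChecker runs above are consistent with linear interpolation between sorted samples (the default behaviour of tools such as `numpy.percentile`). The sketch below reproduces them from the five latencies logged in each run. The `percentile` helper is written for illustration only; the log does not show how the pipeline itself computes these values.

```python
# Latencies in seconds, copied from the two StressChecker runs in the log.
FIRST_RUN = [
    3.7017605304718018,
    4.288498640060425,
    5.268012285232544,
    4.432088375091553,
    3.3286876678466797,
]
SECOND_RUN = [
    3.2783520221710205,
    2.865673780441284,
    3.7700133323669434,
    7.523570775985718,
    3.361485004425049,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted samples.
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def mean(values):
    """Arithmetic mean of the samples."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for name, run in (("first", FIRST_RUN), ("second", SECOND_RUN)):
        print(name, "run")
        for q in (5, 50, 95, 99):
            print(f"  {q}th percentile: {percentile(run, q)}")
        print(f"  mean time: {mean(run)}")
```

With only five samples per run, the upper percentiles are dominated by the single slowest request, which is why the second run reports a 99th percentile above 7 seconds despite a median near 3.4 seconds.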
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_hulir_2025-06-07 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 4324.00s
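
The `family_friendly_standard_error` in the metadata is numerically consistent with the standard error of a proportion, sqrt(p * (1 - p) / n), at a sample size of 5000. This is an inference from the reported numbers only: neither the formula nor the sample size is stated anywhere in the record, so the sketch below should be read as a plausibility check under that assumption.

```python
import math

# Values copied from the submission metadata.
FAMILY_FRIENDLY_SCORE = 0.5302
REPORTED_STANDARD_ERROR = 0.007058157833315999

# Assumed sample size; not stated in the record.
ASSUMED_SAMPLES = 5000


def proportion_standard_error(p, n):
    """Standard error of a sample proportion p estimated from n samples."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    estimate = proportion_standard_error(FAMILY_FRIENDLY_SCORE, ASSUMED_SAMPLES)
    print("estimated standard error:", estimate)
    print("reported standard error: ", REPORTED_STANDARD_ERROR)
    print("absolute difference:     ", abs(estimate - REPORTED_STANDARD_ERROR))
```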
Shutdown handler de-registered
function_hulir_2025-06-07 status is now inactive due to auto deactivation removed underperforming models
function_hulir_2025-06-07 status is now torndown due to DeploymentManager action