developer_uid: rirv938
submission_id: function_gifuf_2025-10-21
model_name: dpo_data_collection
model_group:
status: torndown
timestamp: 2025-10-21T17:51:07+00:00
num_battles: 5200
num_wins: 2721
celo_rating: 1289.82
family_friendly_score: 0.527
family_friendly_standard_error: 0.00706075066830716
submission_type: function
display_name: dpo_data_collection
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-10-21
win_ratio: 0.5232692307692308
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '</s>'], 'max_input_tokens': 512, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
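As a rough illustration of how the formatter templates above might combine into a single model input, here is a minimal Python sketch. The assembly order (memory, then prompt, then chat turns, then the response prefix) is an assumption inferred from the template names, and `build_prompt` along with its arguments is a hypothetical helper, not the platform's own code. Truncation to `max_input_tokens` is not modelled.

```python
# Template strings copied from the formatter entry above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, speaker being "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix leaves the bot's turn open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Ava", "Sam", "A friendly guide.", "Ava greets Sam.",
                       [("user", "Hi"), ("bot", "Hello!")]))
```

Under this assumed ordering, the `'\n'` entry in `stopping_words` would end generation at the close of a single bot line, which fits the one-line-per-message templates.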
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.7281341552734375s
Received healthy response to inference request in 4.8985559940338135s
Received healthy response to inference request in 2.1923885345458984s
Received healthy response to inference request in 3.0481505393981934s
5 requests
1 failed requests
5th percentile: 2.2995376586914062
10th percentile: 2.406686782836914
20th percentile: 2.6209850311279297
30th percentile: 2.7921374320983885
40th percentile: 2.920143985748291
50th percentile: 3.0481505393981934
60th percentile: 3.7883127212524412
70th percentile: 4.528474903106689
80th percentile: 7.940522956848147
90th percentile: 14.024456882476809
95th percentile: 17.066423845291133
99th percentile: 19.4999974155426
mean time: 6.595124006271362
%s, retrying in %s seconds...
Received healthy response to inference request in 2.2651407718658447s
Received healthy response to inference request in 2.2869811058044434s
Received healthy response to inference request in 2.460919141769409s
Received healthy response to inference request in 2.099146604537964s
Received healthy response to inference request in 3.119279384613037s
5 requests
0 failed requests
5th percentile: 2.13234543800354
10th percentile: 2.165544271469116
20th percentile: 2.2319419384002686
30th percentile: 2.2695088386535645
40th percentile: 2.278244972229004
50th percentile: 2.2869811058044434
60th percentile: 2.3565563201904296
70th percentile: 2.426131534576416
80th percentile: 2.5925911903381347
90th percentile: 2.855935287475586
95th percentile: 2.9876073360443116
99th percentile: 3.092944974899292
mean time: 2.4462934017181395
Pipeline stage StressChecker completed in 48.03s
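The percentile and mean figures logged for the second StressChecker run are consistent with linear interpolation between the sorted latencies. The following minimal Python sketch reproduces them under that assumption; the `percentile` helper is hypothetical and is not taken from the pipeline's source.

```python
# Latencies (seconds) of the five healthy responses in the second run above.
latencies = [
    2.2651407718658447,
    2.2869811058044434,
    2.460919141769409,
    2.099146604537964,
    3.119279384613037,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile by linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into sorted data
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

With only five samples, the upper percentiles are driven almost entirely by the single slowest request, which is why one timed-out request in the first run pushes its 90th to 99th percentiles far above the median.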
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.90s
Shutdown handler de-registered
function_gifuf_2025-10-21 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3077.00s
Shutdown handler de-registered
function_gifuf_2025-10-21 status is now inactive due to auto deactivation removed underperforming models
function_gifuf_2025-10-21 status is now torndown due to DeploymentManager action