developer_uid: chai_backend_admin
submission_id: function_pudub_2024-11-19
model_name: retune_with_base
status: inactive
timestamp: 2024-11-19T17:38:41+00:00
num_battles: 9627
num_wins: 5024
celo_rating: 1263.63
family_friendly_score: 0.5946
family_friendly_standard_error: 0.006943354232645775
submission_type: function
display_name: retune_with_base
is_internal_developer: True
ranking_group: single
us_pacific_date: 2024-11-19
win_ratio: 0.521865586371663
generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['\n', '</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
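The formatter templates above can be read as pieces of a single prompt string. The sketch below shows one plausible way they might be combined. The assembly order (memory, then prompt, then conversation turns, then the response prefix), the `build_prompt` helper, and the sample names are assumptions made for illustration, not the pipeline's actual code.

```python
# Templates copied from the formatter field of this record.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: render the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's turn open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    bot_name="Ava",
    user_name="Sam",
    memory="A friendly guide.",
    prompt="Ava greets Sam.",
    turns=[("user", "Hi"), ("bot", "Hello!")],
)
```

Because the stopping words include a newline, a completion of this prompt would be expected to end at the close of the bot's single line.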
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.635545492172241s
Received healthy response to inference request in 2.7782227993011475s
Failed to get response for submission chaiml-small-story-telli_1710_v3: ('', 'read tcp> read: connection reset by peer\n')
Received healthy response to inference request in 2.998302698135376s
Received healthy response to inference request in 3.8296916484832764s
Received healthy response to inference request in 4.502360820770264s
5 requests
0 failed requests
5th percentile: 2.8222387790679933
10th percentile: 2.866254758834839
20th percentile: 2.95428671836853
30th percentile: 3.125751256942749
40th percentile: 3.380648374557495
50th percentile: 3.635545492172241
60th percentile: 3.713203954696655
70th percentile: 3.7908624172210694
80th percentile: 3.964225482940674
90th percentile: 4.233293151855468
95th percentile: 4.367826986312866
99th percentile: 4.475454053878784
mean time: 3.548824691772461
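The percentile and mean figures above are consistent with linear interpolation between the sorted response times. The sketch below reproduces them from the five timings logged for this run. The `percentile` helper is written for illustration, and whether the StressChecker uses this exact method is an assumption.

```python
# Response times (seconds) from the five healthy responses in this run.
times = [
    3.635545492172241,
    2.7782227993011475,
    2.998302698135376,
    3.8296916484832764,
    4.502360820770264,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def mean(samples):
    """Arithmetic mean of the samples."""
    return sum(samples) / len(samples)


for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(times, q)}")
print(f"mean time: {mean(times)}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so a single slow response dominates the tail figures.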
%s, retrying in %s seconds...
Received healthy response to inference request in 3.6652607917785645s
Received healthy response to inference request in 2.6962549686431885s
Received healthy response to inference request in 4.214165210723877s
Received healthy response to inference request in 18.116965770721436s
Received healthy response to inference request in 6.450097322463989s
5 requests
0 failed requests
5th percentile: 2.8900561332702637
10th percentile: 3.083857297897339
20th percentile: 3.4714596271514893
30th percentile: 3.775041675567627
40th percentile: 3.9946034431457518
50th percentile: 4.214165210723877
60th percentile: 5.108538055419921
70th percentile: 6.002910900115967
80th percentile: 8.78347101211548
90th percentile: 13.450218391418458
95th percentile: 15.783592081069944
99th percentile: 17.650291032791138
mean time: 7.028548812866211
%s, retrying in %s seconds...
Received healthy response to inference request in 3.694775342941284s
Received healthy response to inference request in 3.4502370357513428s
Received healthy response to inference request in 2.576160430908203s
Received healthy response to inference request in 3.071873426437378s
Received healthy response to inference request in 3.169931650161743s
5 requests
0 failed requests
5th percentile: 2.675303030014038
10th percentile: 2.774445629119873
20th percentile: 2.972730827331543
30th percentile: 3.091485071182251
40th percentile: 3.130708360671997
50th percentile: 3.169931650161743
60th percentile: 3.282053804397583
70th percentile: 3.3941759586334226
80th percentile: 3.499144697189331
90th percentile: 3.596960020065308
95th percentile: 3.645867681503296
99th percentile: 3.6849938106536864
mean time: 3.1925955772399903
Pipeline stage StressChecker completed in 74.23s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 5.41s
Shutdown handler de-registered
function_pudub_2024-11-19 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3506.01s
Shutdown handler de-registered
function_pudub_2024-11-19 status is now inactive due to auto deactivation removed underperforming models
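The summary fields at the top of this record can be cross-checked with a few lines of arithmetic. The sketch below recomputes win_ratio from num_wins and num_battles, and shows that the reported family_friendly_standard_error matches a binomial standard error. The sample size of 5000 is an assumption inferred from the numbers and is not stated anywhere in the record.

```python
import math

# Values copied from the metadata fields of this record.
num_battles = 9627
num_wins = 5024
family_friendly_score = 0.5946

# Assumption: the record does not state how many samples were scored.
ASSUMED_SAMPLE_SIZE = 5000


def win_ratio(wins, battles):
    """Fraction of battles won."""
    return wins / battles


def binomial_standard_error(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)


print("win_ratio:", win_ratio(num_wins, num_battles))
print(
    "standard error (assumed n=5000):",
    binomial_standard_error(family_friendly_score, ASSUMED_SAMPLE_SIZE),
)
```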