submission_id: function_figar_2024-09-14
developer_uid: chai_backend_admin
alignment_samples: 16806
alignment_score: -0.8022447166012129
celo_rating: 1266.41
display_name: dpo_with_ava_reward_400k_v1
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 0.95, 'top_p': 1.0, 'min_p': 0.08, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '<|eot_id|>'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
is_internal_developer: True
model_group:
model_name: dpo_with_ava_reward_400k_v1
num_battles: 16806
num_wins: 8591
propriety_score: 0.7548882681564246
propriety_total_count: 1432.0
ranking_group: single
status: inactive
submission_type: function
timestamp: 2024-09-14T21:24:50+00:00
us_pacific_date: 2024-09-14
win_ratio: 0.5111864810186838
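The win_ratio field above matches num_wins divided by num_battles. The following is a minimal sketch of that arithmetic, assuming this is how the field is derived (the source does not state the formula); the variable names simply mirror the metadata keys.

```python
# Values copied from the metadata fields above.
num_battles = 16806
num_wins = 8591

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio: {win_ratio}")
```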
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.6551783084869385s
Received healthy response to inference request in 5.624939203262329s
Received healthy response to inference request in 5.161245107650757s
Received healthy response to inference request in 4.025627613067627s
Received healthy response to inference request in 4.911656856536865s
5 requests
0 failed requests
5th percentile: 3.7292681694030763
10th percentile: 3.803358030319214
20th percentile: 3.951537752151489
30th percentile: 4.202833461761474
40th percentile: 4.55724515914917
50th percentile: 4.911656856536865
60th percentile: 5.011492156982422
70th percentile: 5.111327457427978
80th percentile: 5.2539839267730715
90th percentile: 5.4394615650177
95th percentile: 5.5322003841400145
99th percentile: 5.606391439437866
mean time: 4.675729417800904
%s, retrying in %s seconds...
Received healthy response to inference request in 4.309017658233643s
Received healthy response to inference request in 5.174866199493408s
Received healthy response to inference request in 3.9432218074798584s
Received healthy response to inference request in 3.3936426639556885s
Received healthy response to inference request in 3.828864097595215s
5 requests
0 failed requests
5th percentile: 3.4806869506835936
10th percentile: 3.567731237411499
20th percentile: 3.7418198108673097
30th percentile: 3.8517356395721434
40th percentile: 3.897478723526001
50th percentile: 3.9432218074798584
60th percentile: 4.089540147781372
70th percentile: 4.235858488082886
80th percentile: 4.482187366485596
90th percentile: 4.828526782989502
95th percentile: 5.0016964912414545
99th percentile: 5.140232257843017
mean time: 4.129922485351562
Pipeline stage StressChecker completed in 47.22s
Shutdown handler de-registered
function_figar_2024-09-14 status is now deployed due to DeploymentManager action
function_figar_2024-09-14 status is now inactive due to auto deactivation removed underperforming models
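The percentile and mean figures reported by the StressChecker stage can be reproduced from the five logged latencies of the first run. The sketch below assumes linear interpolation between the sorted samples, which agrees with the logged values but is not confirmed by the log itself; the percentile helper is hypothetical and is not the pipeline's own code.

```python
# Latencies (seconds) from the first StressChecker run, as logged.
first_run = [
    3.6551783084869385,
    5.624939203262329,
    5.161245107650757,
    4.025627613067627,
    4.911656856536865,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Hypothetical helper: interpolates between the two nearest sorted samples.
    """
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Same percentile levels that appear in the log.
for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(first_run, q)}")

mean_time = sum(first_run) / len(first_run)
print(f"mean time: {mean_time}")
```

With only five samples per run, the upper percentiles are interpolated between the two slowest requests, so they say little about tail latency.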