submission_id: function_denef_2024-09-14
developer_uid: chai_backend_admin
alignment_samples: 78081
alignment_score: 0.4315805428096154
celo_rating: 1214.77
display_name: mixtral_with_ava_reward_100k_v1
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 50, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['\n', '</s>', '<|user|>', '###'], 'max_input_tokens': 512, 'best_of': 4, 'max_output_tokens': 64}
is_internal_developer: True
model_group:
model_name: mixtral_with_ava_reward_100k_v1
num_battles: 77953
num_wins: 41649
propriety_score: 0.7405742459396751
propriety_total_count: 6896.0
ranking_group: single
status: inactive
submission_type: function
timestamp: 2024-09-14T23:57:18+00:00
us_pacific_date: 2024-09-14
win_ratio: 0.5342834785062794
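
The `formatter` templates above can be exercised with plain `str.format`. The sketch below is illustrative only: the assembly order (memory, then prompt, then chat turns, then the response prefix), the `build_prompt` helper, and the sample names and messages are assumptions, not the platform's actual implementation. It also ignores `max_input_tokens` and `truncate_by_message`.

```python
# Template strings copied from the formatter field above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, speaker being "bot" or "user".
    No token truncation is applied here.
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix has no trailing newline, so generation continues on this line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are made up for illustration.
example = build_prompt(
    bot_name="Mia",
    user_name="Sam",
    memory="A friendly guide.",
    prompt="Mia greets a visitor.",
    turns=[("user", "Hi!"), ("bot", "Hello there."), ("user", "How are you?")],
)
print(example)
```

Because the prompt ends with `{bot_name}:` and `'\n'` is listed in `stopping_words`, generation under this layout would end at the first newline, which yields a single chat line per response.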
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1670472621917725s
Received healthy response to inference request in 2.0048418045043945s
Received healthy response to inference request in 1.7474801540374756s
Received healthy response to inference request in 2.3051302433013916s
Received healthy response to inference request in 2.457984447479248s
5 requests
0 failed requests
5th percentile: 1.7989524841308593
10th percentile: 1.8504248142242432
20th percentile: 1.9533694744110108
30th percentile: 2.0372828960418703
40th percentile: 2.1021650791168214
50th percentile: 2.1670472621917725
60th percentile: 2.2222804546356203
70th percentile: 2.2775136470794677
80th percentile: 2.335701084136963
90th percentile: 2.3968427658081053
95th percentile: 2.4274136066436767
99th percentile: 2.451870279312134
mean time: 2.1364967823028564
Pipeline stage StressChecker completed in 16.19s
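
The StressChecker percentiles and mean above are consistent with linear interpolation between sorted samples (the default behaviour of `numpy.percentile`). The sketch below reproduces them from the five logged latencies using only the standard library. The `percentile` helper is hypothetical, and whether the pipeline itself uses NumPy is an assumption.

```python
import math

# The five latencies (seconds) from the "Received healthy response" lines.
latencies = [
    2.1670472621917725,
    2.0048418045043945,
    1.7474801540374756,
    2.3051302433013916,
    2.457984447479248,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted samples.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, every percentile other than the 0th, 25th, 50th, 75th, and 100th is an interpolated value and not an observed latency, so the tail figures (95th, 99th) mostly reflect the single slowest request.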
Shutdown handler de-registered
function_denef_2024-09-14 status is now deployed due to DeploymentManager action
function_denef_2024-09-14 status is now inactive due to auto deactivation removed underperforming models