submission_id: function_fimon_2024-11-01
developer_uid: chai_backend_admin
celo_rating: 1248.87
display_name: function_fimon_2024-11-01
family_friendly_score: 0.5808
family_friendly_standard_error: 0.006978128115762852
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
is_internal_developer: True
model_group:
model_name: function_fimon_2024-11-01
num_battles: 12512
num_wins: 6319
ranking_group: single
status: inactive
submission_type: function
timestamp: 2024-11-01T21:19:26+00:00
us_pacific_date: 2024-11-01
win_ratio: 0.5050351662404092
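The formatter entry above is a set of string templates. The following is a minimal sketch of how such templates could be joined into a single model prompt. The assembly order, the build_prompt helper, and the (speaker, message) turn representation are all assumptions made for illustration; the metadata does not show how the serving code applies them. Truncation to max_input_tokens is not modeled here.

```python
# Templates copied from the formatter entry in the metadata above.
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates into one prompt string.

    turns is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". The ordering used here (memory, prompt,
    conversation turns, response prefix) is an assumption.
    """
    parts = [
        formatter["memory_template"].format(bot_name=bot_name, memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends without a newline so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Ava", "Sam", "friendly", "A chat.", [("user", "Hi")]))
```

With stopping_words set to ['\n'] in generation_params, generation would end at the first newline, which fits a prompt that ends in an open "{bot_name}:" line.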
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Failed to get response for submission blend_sadof_2024-10-11: ('http://chaiml-elo-alignment-run-3-v48-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:36552->127.0.0.1:8080: read: connection reset by peer\n')
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.8637473583221436s
{"detail":"('http://chaiml-llama-8b-pairwis-8189-v27-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:44054->127.0.0.1:8080: read: connection reset by peer\\n')"}
Received unhealthy response to inference request!
Received healthy response to inference request in 3.205599784851074s
Received healthy response to inference request in 4.035250663757324s
5 requests
2 failed requests
5th percentile: 2.9321178436279296
10th percentile: 3.0004883289337156
20th percentile: 3.137229299545288
30th percentile: 3.251374673843384
40th percentile: 3.342924451828003
50th percentile: 3.434474229812622
60th percentile: 3.674784803390503
70th percentile: 3.9150953769683836
80th percentile: 7.256474161148074
90th percentile: 13.698921155929566
95th percentile: 16.92014465332031
99th percentile: 19.49712345123291
mean time: 6.736088037490845
%s, retrying in %s seconds...
Received healthy response to inference request in 3.5982625484466553s
Received healthy response to inference request in 3.749544858932495s
Received healthy response to inference request in 3.320481538772583s
Received healthy response to inference request in 3.4003520011901855s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.5553882122039795s
5 requests
0 failed requests
5th percentile: 3.3364556312561033
10th percentile: 3.352429723739624
20th percentile: 3.384377908706665
30th percentile: 3.4313592433929445
40th percentile: 3.493373727798462
50th percentile: 3.5553882122039795
60th percentile: 3.5725379467010496
70th percentile: 3.58968768119812
80th percentile: 3.6285190105438234
90th percentile: 3.6890319347381593
95th percentile: 3.719288396835327
99th percentile: 3.7434935665130613
mean time: 3.5248058319091795
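The summary figures in this second StressChecker run can be recomputed from the five healthy response times logged just before it. The sketch below assumes percentiles are taken by linear interpolation between the closest ranks of the sorted samples; that assumption matches the logged values to within floating-point rounding, but the log does not name the library or method actually used. The percentile function is a hypothetical helper.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0 to 100) of values.

    Uses linear interpolation between the two closest ranks of the
    sorted samples (an assumed method, not confirmed by the log).
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Healthy response times (seconds) from the second stress run in the log.
latencies = [
    3.5982625484466553,
    3.749544858932495,
    3.320481538772583,
    3.4003520011901855,
    3.5553882122039795,
]

if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```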
Pipeline stage StressChecker completed in 54.44s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.56s
Shutdown handler de-registered
function_fimon_2024-11-01 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3465.34s
Shutdown handler de-registered
function_fimon_2024-11-01 status is now inactive due to auto deactivation removed underperforming models