Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-7b07-69d4-linea-27339-v10-uploader
Waiting for job on chaiml-7b07-69d4-linea-27339-v10-uploader to finish
chaiml-7b07-69d4-linea-27339-v10-uploader: Using quantization_mode: fp8
chaiml-7b07-69d4-linea-27339-v10-uploader: Repo ChaiML/7b07-69d4-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-7b07-69d4-linea-27339-v10-uploader: Checking if ChaiML/7b07-69d4-linear-w01-FP8 already exists in ChaiML
chaiml-7b07-69d4-linea-27339-v10-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-7b07-69d4-linea-27339-v10-uploader: Downloading snapshot of ChaiML/7b07-69d4-linear-w01-FP8...
chaiml-7b07-69d4-linea-27339-v10-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linea-27339-v10/default/model-00005-of-00006.safetensors
chaiml-7b07-69d4-linea-27339-v10-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linea-27339-v10/default/model-00002-of-00006.safetensors
chaiml-7b07-69d4-linea-27339-v10-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linea-27339-v10/default/model-00003-of-00006.safetensors
chaiml-7b07-69d4-linea-27339-v10-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linea-27339-v10/default/model-00001-of-00006.safetensors
chaiml-7b07-69d4-linea-27339-v10-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linea-27339-v10/default/model-00004-of-00006.safetensors
Job chaiml-7b07-69d4-linea-27339-v10-uploader completed after 80.83s with status: succeeded
Stopping job with name chaiml-7b07-69d4-linea-27339-v10-uploader
Pipeline stage VLLMUploader completed in 81.27s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.60s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-7b07-69d4-linea-27339-v10
Waiting for inference service chaiml-7b07-69d4-linea-27339-v10 to be ready
Inference service chaiml-7b07-69d4-linea-27339-v10 ready after 160.36035895347595s
Pipeline stage VLLMDeployer completed in 160.88s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.9174020290374756s
Received healthy response to inference request in 2.5510194301605225s
Received healthy response to inference request in 2.7014124393463135s
Received healthy response to inference request in 2.6108782291412354s
Received healthy response to inference request in 3.077930212020874s
Received healthy response to inference request in 2.4667389392852783s
Received healthy response to inference request in 2.5951006412506104s
Received healthy response to inference request in 2.6884608268737793s
Received healthy response to inference request in 2.8928744792938232s
Received healthy response to inference request in 2.9167208671569824s
Received healthy response to inference request in 2.5152530670166016s
Received healthy response to inference request in 3.1632678508758545s
Received healthy response to inference request in 2.61173415184021s
Received healthy response to inference request in 2.89578914642334s
Received healthy response to inference request in 2.5229334831237793s
Received healthy response to inference request in 2.672089099884033s
Received healthy response to inference request in 2.5481629371643066s
Received healthy response to inference request in 2.509087562561035s
Received healthy response to inference request in 2.869251012802124s
Received healthy response to inference request in 2.911553382873535s
Received healthy response to inference request in 2.827932119369507s
Received healthy response to inference request in 2.610434055328369s
Received healthy response to inference request in 3.7728328704833984s
Received healthy response to inference request in 2.7288553714752197s
Received healthy response to inference request in 2.537153959274292s
Received healthy response to inference request in 3.3233730792999268s
Received healthy response to inference request in 2.5303597450256348s
Received healthy response to inference request in 2.8774397373199463s
Received healthy response to inference request in 2.9221818447113037s
30 requests
1 failed requests
5th percentile: 2.51186203956604
10th percentile: 2.5221654415130614
20th percentile: 2.5459611415863037
30th percentile: 2.6058340311050414
40th percentile: 2.647947120666504
50th percentile: 2.7151339054107666
60th percentile: 2.872526502609253
70th percentile: 2.9005184173583984
80th percentile: 2.918357992172241
90th percentile: 3.179278373718262
95th percentile: 3.570575964450835
99th percentile: 15.377951152324691
mean time: 3.3628764152526855
%s, retrying in %s seconds...
Received healthy response to inference request in 2.865279197692871s
Received healthy response to inference request in 2.498220205307007s
Received healthy response to inference request in 2.515921115875244s
Received healthy response to inference request in 2.53667950630188s
Received healthy response to inference request in 2.7328693866729736s
Received healthy response to inference request in 2.568546772003174s
Received healthy response to inference request in 2.526261806488037s
Received healthy response to inference request in 2.709300994873047s
Received healthy response to inference request in 3.3611481189727783s
Received healthy response to inference request in 2.8515374660491943s
Received healthy response to inference request in 3.031888961791992s
Received healthy response to inference request in 2.54948353767395s
Received healthy response to inference request in 2.9335386753082275s
Received healthy response to inference request in 2.9785852432250977s
Received healthy response to inference request in 3.11201810836792s
Received healthy response to inference request in 3.057241201400757s
Received healthy response to inference request in 2.7057223320007324s
Received healthy response to inference request in 3.1912097930908203s
Received healthy response to inference request in 2.701306104660034s
Received healthy response to inference request in 2.5627622604370117s
Received healthy response to inference request in 2.969439744949341s
Received healthy response to inference request in 3.3265347480773926s
Received healthy response to inference request in 2.9419569969177246s
Received healthy response to inference request in 2.8489978313446045s
Received healthy response to inference request in 3.5739572048187256s
Received healthy response to inference request in 2.837299346923828s
Received healthy response to inference request in 2.5637996196746826s
Received healthy response to inference request in 2.669050455093384s
Received healthy response to inference request in 2.93383526802063s
Received healthy response to inference request in 2.564197063446045s
30 requests
0 failed requests
5th percentile: 2.520574426651001
10th percentile: 2.5356377363204956
20th percentile: 2.5635921478271486
30th percentile: 2.6388993501663207
40th percentile: 2.707869529724121
50th percentile: 2.8431485891342163
60th percentile: 2.8925829887390138
70th percentile: 2.950201821327209
80th percentile: 3.036959409713745
90th percentile: 3.2047422885894776
95th percentile: 3.3455721020698546
99th percentile: 3.512242569923401
mean time: 2.8406196355819704
Pipeline stage StressChecker completed in 193.58s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.64s
Shutdown handler de-registered
chaiml-7b07-69d4-linea_27339_v10 status is now deployed due to DeploymentManager action
chaiml-7b07-69d4-linea_27339_v10 status is now inactive due to admin request