Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-merged-qwen-35-d-39140-v9-uploader
Waiting for job on chaiml-merged-qwen-35-d-39140-v9-uploader to finish
chaiml-merged-qwen-35-d-39140-v9-uploader: Using quantization_mode: none
chaiml-merged-qwen-35-d-39140-v9-uploader: Downloading snapshot of ChaiML/merged_qwen_35_dpo_lower_lr_v...
chaiml-merged-qwen-35-d-39140-v9-uploader: Downloaded in 33.068s
2026-03-27T17:24:13.206009+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
chaiml-merged-qwen-35-d-39140-v9-uploader: Processed model ChaiML/merged_qwen_35_dpo_lower_lr_v in 52.963s
chaiml-merged-qwen-35-d-39140-v9-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-merged-qwen-35-d-39140-v9-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/README.md
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/generation_config.json
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/tokenizer_config.json
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/model.safetensors.index.json
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/config.json
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/.gitattributes
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/tokenizer.json
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/model-00002-of-00002.safetensors
2026-03-27T17:25:13.293952+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
chaiml-merged-qwen-35-d-39140-v9-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v9/default/model-00001-of-00002.safetensors
Job chaiml-merged-qwen-35-d-39140-v9-uploader completed after 148.11s with status: succeeded
Stopping job with name chaiml-merged-qwen-35-d-39140-v9-uploader
Pipeline stage VLLMUploader completed in 148.83s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.88s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-merged-qwen-35-d-39140-v9
Waiting for inference service chaiml-merged-qwen-35-d-39140-v9 to be ready
2026-03-27T17:26:13.467320+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:27:13.561616+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:28:13.656509+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:29:13.751341+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:30:13.849821+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:31:13.953731+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:32:14.050978+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:33:14.158001+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:34:14.268720+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:35:14.373989+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
2026-03-27T17:36:14.477645+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
Inference service chaiml-merged-qwen-35-d-39140-v9 ready after 632.0987343788147s
Pipeline stage VLLMDeployer completed in 632.68s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-27T17:37:14.579517+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 10.482884407043457s
2026-03-27T17:38:14.715564+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.1230950355529785s
Received healthy response to inference request in 3.7836496829986572s
Received healthy response to inference request in 3.763746500015259s
Received healthy response to inference request in 10.69713306427002s
2026-03-27T17:39:14.888463+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.902937173843384s
Received healthy response to inference request in 3.918261766433716s
Received healthy response to inference request in 4.21041202545166s
Received healthy response to inference request in 10.508628845214844s
Received healthy response to inference request in 4.070334196090698s
Received healthy response to inference request in 3.9146602153778076s
Received healthy response to inference request in 3.9157166481018066s
Received healthy response to inference request in 3.9116263389587402s
2026-03-27T17:40:14.990729+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
Received healthy response to inference request in 3.8785605430603027s
Received healthy response to inference request in 3.925642251968384s
Received healthy response to inference request in 4.0064263343811035s
Received healthy response to inference request in 3.9154226779937744s
Received healthy response to inference request in 3.9604856967926025s
Received healthy response to inference request in 4.257299900054932s
Received healthy response to inference request in 3.8922533988952637s
Received healthy response to inference request in 4.052443504333496s
Received healthy response to inference request in 3.889389991760254s
30 requests
8 failed requests
5th percentile: 3.8263595700263977
10th percentile: 3.888307046890259
20th percentile: 3.909888505935669
30th percentile: 3.915628457069397
40th percentile: 3.946548318862915
50th percentile: 4.061388850212097
60th percentile: 4.229167175292969
70th percentile: 10.565180110931395
80th percentile: 20.128717184066772
90th percentile: 20.13893766403198
95th percentile: 20.148057079315187
99th percentile: 20.169211678504944
mean time: 8.936999766031901
%s, retrying in %s seconds...
Received healthy response to inference request in 4.209232807159424s
Received healthy response to inference request in 3.7021148204803467s
Received healthy response to inference request in 3.7774388790130615s
Received healthy response to inference request in 3.6687653064727783s
2026-03-27T17:41:15.085941+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
Received healthy response to inference request in 3.760653257369995s
Received healthy response to inference request in 3.79784893989563s
Received healthy response to inference request in 3.673415422439575s
Received healthy response to inference request in 3.9059319496154785s
Received healthy response to inference request in 3.7500193119049072s
Received healthy response to inference request in 3.798844814300537s
Received healthy response to inference request in 3.8258941173553467s
Received healthy response to inference request in 3.8291916847229004s
Received healthy response to inference request in 4.024697780609131s
Received healthy response to inference request in 3.8191940784454346s
Received healthy response to inference request in 3.771583318710327s
Received healthy response to inference request in 4.161022901535034s
Received healthy response to inference request in 3.8752174377441406s
Received healthy response to inference request in 3.9064762592315674s
Received healthy response to inference request in 3.8733370304107666s
2026-03-27T17:42:15.199803+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v9
Received healthy response to inference request in 3.9081532955169678s
Received healthy response to inference request in 3.9889400005340576s
Received healthy response to inference request in 3.9649624824523926s
Received healthy response to inference request in 3.9647674560546875s
Received healthy response to inference request in 3.9403276443481445s
Received healthy response to inference request in 3.949887752532959s
Received healthy response to inference request in 3.948578119277954s
Received healthy response to inference request in 3.9356484413146973s
Received healthy response to inference request in 4.003839015960693s
Received healthy response to inference request in 3.996027708053589s
Received healthy response to inference request in 3.9670915603637695s
30 requests
0 failed requests
5th percentile: 3.6863301515579225
10th percentile: 3.7452288627624513
20th percentile: 3.7762677669525146
30th percentile: 3.813089299201965
40th percentile: 3.85567889213562
50th percentile: 3.906204104423523
60th percentile: 3.9375201225280763
70th percentile: 3.9543516635894775
80th percentile: 3.9714612483978273
90th percentile: 4.005924892425537
95th percentile: 4.0996765971183775
99th percentile: 4.1952519345283505
mean time: 3.8899701197942096
Pipeline stage StressChecker completed in 398.36s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.68s
Shutdown handler de-registered
chaiml-merged-qwen-35-d_39140_v9 status is now deployed due to DeploymentManager action
chaiml-merged-qwen-35-d_39140_v9 status is now inactive due to auto deactivation removed underperforming models