Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-1007-tl-ads-run-2-gac-v7-uploader
Waiting for job on chaiml-1007-tl-ads-run-2-gac-v7-uploader to finish
chaiml-1007-tl-ads-run-2-gac-v7-uploader: Using quantization_mode: fp8
chaiml-1007-tl-ads-run-2-gac-v7-uploader: Checking if ChaiML/1007-tl-ads-run-2-gac-FP8 already exists in ChaiML
chaiml-1007-tl-ads-run-2-gac-v7-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-1007-tl-ads-run-2-gac-v7-uploader: Downloading snapshot of ChaiML/1007-tl-ads-run-2-gac-FP8...
2026-03-20T23:05:13.051022+00:00 monitor updated for chaiml-1007-tl-ads-aggr_36615_v6
2026-03-20T23:05:15.212389+00:00 monitor updated for chaiml-1007-tl-ads-run-2-gac_v7
chaiml-1007-tl-ads-run-2-gac-v7-uploader: creating bucket guanaco-vllm-models
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-1007-tl-ads-run-2-gac-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-1007-tl-ads-run-2-gac-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-1007-tl-ads-run-2-gac-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-1007-tl-ads-run-2-gac-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/recipe.yaml
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/config.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/chat_template.jinja
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/.gitattributes
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/special_tokens_map.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/model.safetensors.index.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/generation_config.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/tokenizer_config.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/tokenizer.json
chaiml-1007-tl-ads-run-2-gac-v7-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-1007-tl-ads-run-2-gac-v7/default/model-00002-of-00003.safetensors
Job chaiml-1007-tl-ads-run-2-gac-v7-uploader completed after 73.66s with status: succeeded
Stopping job with name chaiml-1007-tl-ads-run-2-gac-v7-uploader
Pipeline stage VLLMUploader completed in 74.35s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.35s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-1007-tl-ads-run-2-gac-v7
Waiting for inference service chaiml-1007-tl-ads-run-2-gac-v7 to be ready
2026-03-20T23:06:13.189185+00:00 monitor updated for chaiml-1007-tl-ads-aggr_36615_v6
2026-03-20T23:06:15.346832+00:00 monitor updated for chaiml-1007-tl-ads-run-2-gac_v7
Inference service chaiml-1007-tl-ads-aggr-36615-v6 ready after 160.37178206443787s
Pipeline stage VLLMDeployer completed in 160.87s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.3519859313964844s
Received healthy response to inference request in 2.303105592727661s
Received healthy response to inference request in 2.3360178470611572s
Received healthy response to inference request in 2.47068190574646s
Received healthy response to inference request in 2.2431447505950928s
Received healthy response to inference request in 2.5983519554138184s
Received healthy response to inference request in 2.2674243450164795s
Received healthy response to inference request in 2.243253231048584s
Received healthy response to inference request in 2.3344945907592773s
Received healthy response to inference request in 2.3123679161071777s
2026-03-20T23:07:13.330923+00:00 monitor updated for chaiml-1007-tl-ads-aggr_36615_v6
Received healthy response to inference request in 2.290968656539917s
2026-03-20T23:07:15.481237+00:00 monitor updated for chaiml-1007-tl-ads-run-2-gac_v7
Received healthy response to inference request in 2.2181999683380127s
Received healthy response to inference request in 2.2375235557556152s
Received healthy response to inference request in 2.355177402496338s
Received healthy response to inference request in 2.27453875541687s
Received healthy response to inference request in 2.277688503265381s
Received healthy response to inference request in 2.235273838043213s
Received healthy response to inference request in 2.6124281883239746s
Received healthy response to inference request in 2.338275909423828s
Received healthy response to inference request in 2.4345953464508057s
Received healthy response to inference request in 2.2363219261169434s
Received healthy response to inference request in 2.228494167327881s
Received healthy response to inference request in 2.3223659992218018s
Received healthy response to inference request in 2.5668790340423584s
Received healthy response to inference request in 2.296765089035034s
Received healthy response to inference request in 3.0181686878204346s
Received healthy response to inference request in 2.269207239151001s
Received healthy response to inference request in 2.242203712463379s
Received healthy response to inference request in 2.3587756156921387s
Received healthy response to inference request in 2.2577638626098633s
30 requests
0 failed requests
5th percentile: 2.23154501914978
10th percentile: 2.2362171173095704
20th percentile: 2.24295654296875
30th percentile: 2.264526200294495
40th percentile: 2.2764286041259765
50th percentile: 2.2999353408813477
60th percentile: 2.327217435836792
70th percentile: 2.342388916015625
80th percentile: 2.3739395618438723
90th percentile: 2.5700263261795047
95th percentile: 2.6060938835144043
99th percentile: 2.9005039429664614
mean time: 2.351081450780233
Pipeline stage StressChecker completed in 75.35s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.80s
Shutdown handler de-registered
chaiml-1007-tl-ads-aggr_36615_v6 status is now deployed due to DeploymentManager action
Inference service chaiml-1007-tl-ads-run-2-gac-v7 ready after 160.49633884429932s
Pipeline stage VLLMDeployer completed in 161.08s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4128012657165527s
2026-03-20T23:08:15.625633+00:00 monitor updated for chaiml-1007-tl-ads-run-2-gac_v7
Received healthy response to inference request in 2.3161628246307373s
Received healthy response to inference request in 2.327624559402466s
Received healthy response to inference request in 2.248490810394287s
Received healthy response to inference request in 2.363642930984497s
Received healthy response to inference request in 2.223285675048828s
Received healthy response to inference request in 2.4376630783081055s
Received healthy response to inference request in 2.225961685180664s
Received healthy response to inference request in 2.227318048477173s
Received healthy response to inference request in 2.2451231479644775s
Received healthy response to inference request in 2.233457088470459s
Received healthy response to inference request in 2.32038950920105s
Received healthy response to inference request in 2.2495956420898438s
Received healthy response to inference request in 2.3643007278442383s
Received healthy response to inference request in 2.3265743255615234s
Received healthy response to inference request in 2.2554967403411865s
Received healthy response to inference request in 2.2494044303894043s
Received healthy response to inference request in 2.26223087310791s
Received healthy response to inference request in 2.28913950920105s
Received healthy response to inference request in 2.250086784362793s
Received healthy response to inference request in 2.2294816970825195s
Received healthy response to inference request in 2.33091139793396s
Received healthy response to inference request in 2.2725186347961426s
Received healthy response to inference request in 2.2280502319335938s
Received healthy response to inference request in 2.226530075073242s
Received healthy response to inference request in 2.3625781536102295s
Received healthy response to inference request in 2.328279972076416s
2026-03-20T23:09:15.719429+00:00 monitor updated for chaiml-1007-tl-ads-run-2-gac_v7
Received healthy response to inference request in 2.2470602989196777s
Received healthy response to inference request in 2.3788602352142334s
Received healthy response to inference request in 2.2375035285949707s
30 requests
0 failed requests
5th percentile: 2.2262174606323244
10th percentile: 2.22723925113678
20th percentile: 2.232662010192871
30th percentile: 2.2464791536331177
40th percentile: 2.249519157409668
50th percentile: 2.2588638067245483
60th percentile: 2.2999488353729247
70th percentile: 2.326889395713806
80th percentile: 2.337244749069214
90th percentile: 2.3657566785812376
95th percentile: 2.397527801990509
99th percentile: 2.4304531526565554
mean time: 2.2890174627304076
Pipeline stage StressChecker completed in 71.31s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.64s
Shutdown handler de-registered
chaiml-1007-tl-ads-run-2-gac_v7 status is now deployed due to DeploymentManager action
chaiml-1007-tl-ads-run-2-gac_v7 status is now inactive due to auto deactivation removed underperforming models