Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linear-51732-v4-uploader
Waiting for job on chaiml-4d70-fd43-linear-51732-v4-uploader to finish
chaiml-4d70-fd43-linear-51732-v4-uploader: Using quantization_mode: none
chaiml-4d70-fd43-linear-51732-v4-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01-FP8...
chaiml-4d70-fd43-linear-51732-v4-uploader:
Fetching 12 files: 0%| | 0/12 [00:00<?, ?it/s]
Fetching 12 files: 8%|▊ | 1/12 [00:00<00:04, 2.63it/s]
Fetching 12 files: 42%|████▏ | 5/12 [00:09<00:13, 1.92s/it]
Fetching 12 files: 100%|██████████| 12/12 [00:09<00:00, 1.31it/s]
chaiml-4d70-fd43-linear-51732-v4-uploader: Downloaded in 9.265s
chaiml-4d70-fd43-linear-51732-v4-uploader: Processed model ChaiML/4d70-fd43-linear-w01-FP8 in 14.374s
chaiml-4d70-fd43-linear-51732-v4-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linear-51732-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linear-51732-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linear-51732-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linear-51732-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
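Note on the SyntaxWarning lines above: they come from regular-expression patterns in the installed S3 package that are written as ordinary string literals, so Python flags escapes such as '\.' and '\w' as invalid string escapes. They are warnings, not failures; the bucket creation and upload that follow still proceed. The sketch below shows the usual remedy, raw string literals. The constant names and the helper `compiles_without_warning` are hypothetical and are not part of the S3 package.

```python
import re
import warnings

# Hypothetical stand-ins for two of the patterns quoted in the warnings.
# The r"" prefix keeps the backslashes literal, so the regex engine
# (not the string parser) interprets "\.".
BUCKET_INVALID_CHARS = re.compile(r"([^a-z0-9\.-])", re.UNICODE)
DOUBLE_DOT = re.compile(r"\.\.", re.UNICODE)


def compiles_without_warning(source: str) -> bool:
    """Return True if `source` compiles without an invalid-escape warning."""
    with warnings.catch_warnings():
        # Promote warnings to errors so the compiler reports them to us.
        warnings.simplefilter("error")
        try:
            compile(source, "<example>", "exec")
        except (SyntaxError, SyntaxWarning, DeprecationWarning):
            return False
    return True


if __name__ == "__main__":
    # Non-raw literal containing backslash-dot: flagged by the compiler.
    print(compiles_without_warning('pattern = "\\."'))
    # Raw literal with the same characters: accepted silently.
    print(compiles_without_warning('pattern = r"\\."'))
    # The bucket name from this log contains no invalid characters.
    print(BUCKET_INVALID_CHARS.search("guanaco-vllm-models"))
```

Because the files live under /usr/lib/python3/dist-packages, the warnings would normally be resolved by upgrading the packaged tool rather than editing it in place.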
chaiml-4d70-fd43-linear-51732-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linear-51732-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/generation_config.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/.gitattributes
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/config.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/chat_template.jinja
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/special_tokens_map.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/recipe.yaml
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/model.safetensors.index.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/tokenizer_config.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/tokenizer.json
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/model-00003-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/model-00002-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v4-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v4/model-00001-of-00003.safetensors
Job chaiml-4d70-fd43-linear-51732-v4-uploader completed after 155.22s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linear-51732-v4-uploader
Pipeline stage VLLMUploader completed in 155.73s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linear-51732-v4
Waiting for inference service chaiml-4d70-fd43-linear-51732-v4 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission function_kotit_2026-02-08: HTTPConnectionPool(host='guanaco-model-mesh-load-balancer.model-mesh.k2.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-4d70-fd43-linear-51732-v4 ready after 301.7867591381073s
Pipeline stage VLLMDeployer completed in 302.33s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.161088228225708s
Received healthy response to inference request in 1.8319504261016846s
Received healthy response to inference request in 2.304306745529175s
Received healthy response to inference request in 2.0492801666259766s
Received healthy response to inference request in 2.008749485015869s
Received healthy response to inference request in 1.9640636444091797s
Received healthy response to inference request in 2.0808658599853516s
Received healthy response to inference request in 1.8690130710601807s
Received healthy response to inference request in 1.8082900047302246s
Received healthy response to inference request in 1.8844363689422607s
Received healthy response to inference request in 1.9696333408355713s
Received healthy response to inference request in 1.8153386116027832s
Received healthy response to inference request in 1.9502387046813965s
Received healthy response to inference request in 1.8496696949005127s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.3169612884521484s
Received healthy response to inference request in 2.0006678104400635s
Received healthy response to inference request in 1.8963239192962646s
Received healthy response to inference request in 1.9821724891662598s
Received healthy response to inference request in 1.8801589012145996s
Received healthy response to inference request in 2.1912670135498047s
Received healthy response to inference request in 2.2690155506134033s
Received healthy response to inference request in 2.786241054534912s
Received healthy response to inference request in 2.0610427856445312s
Received healthy response to inference request in 2.060037136077881s
Received healthy response to inference request in 1.9779603481292725s
Received healthy response to inference request in 1.9013452529907227s
Received healthy response to inference request in 1.9056127071380615s
Received healthy response to inference request in 2.355018377304077s
Received healthy response to inference request in 2.025453805923462s
Received healthy response to inference request in 1.9167749881744385s
30 requests
0 failed requests
5th percentile: 1.8228139281272888
10th percentile: 1.8478977680206299
20th percentile: 1.8835808753967285
30th percentile: 1.9043324708938598
40th percentile: 1.9585336685180663
50th percentile: 1.9800664186477661
60th percentile: 2.015431213378906
70th percentile: 2.060338830947876
80th percentile: 2.1671239852905275
90th percentile: 2.305572199821472
95th percentile: 2.3378926873207093
99th percentile: 2.6611864781379704
mean time: 2.0357659260431924
Pipeline stage StressChecker completed in 64.07s
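Note on the StressChecker summary above: the reported percentiles are consistent with linear interpolation between the two closest ranks of the sorted response times. That the stage computes them this way is an inference from the numbers, not something the log states. The sketch below reproduces the summary from the 30 logged response times; `percentile` and `summarize` are hypothetical helper names.

```python
from statistics import fmean

# The 30 response times (seconds) logged by the StressChecker stage.
LATENCIES = [
    2.161088228225708, 1.8319504261016846, 2.304306745529175,
    2.0492801666259766, 2.008749485015869, 1.9640636444091797,
    2.0808658599853516, 1.8690130710601807, 1.8082900047302246,
    1.8844363689422607, 1.9696333408355713, 1.8153386116027832,
    1.9502387046813965, 1.8496696949005127, 2.3169612884521484,
    2.0006678104400635, 1.8963239192962646, 1.9821724891662598,
    1.8801589012145996, 2.1912670135498047, 2.2690155506134033,
    2.786241054534912, 2.0610427856445312, 2.060037136077881,
    1.9779603481292725, 1.9013452529907227, 1.9056127071380615,
    2.355018377304077, 2.025453805923462, 1.9167749881744385,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize(values, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Return a dict shaped like the summary block in the log."""
    summary = {f"{q}th percentile": percentile(values, q) for q in quantiles}
    summary["mean time"] = fmean(values)
    summary["requests"] = len(values)
    return summary


if __name__ == "__main__":
    for name, value in summarize(LATENCIES).items():
        print(f"{name}: {value}")
```

The 99th percentile (about 2.66s) is driven almost entirely by the single 2.786s response, which is expected with only 30 samples.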
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.32s
Shutdown handler de-registered
chaiml-4d70-fd43-linear_51732_v4 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linear_51732_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-4d70-fd43-linear_51732_v4 status is now torndown due to DeploymentManager action