Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linea-51732-v12-uploader
Waiting for job on chaiml-4d70-fd43-linea-51732-v12-uploader to finish
chaiml-4d70-fd43-linea-51732-v12-uploader: {repo_id} is already quantized
chaiml-4d70-fd43-linea-51732-v12-uploader: Using quantization_mode: none
chaiml-4d70-fd43-linea-51732-v12-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01-FP8...
chaiml-4d70-fd43-linea-51732-v12-uploader: Downloaded in 12.479s
chaiml-4d70-fd43-linea-51732-v12-uploader: Processed model ChaiML/4d70-fd43-linear-w01-FP8 in 17.496s
chaiml-4d70-fd43-linea-51732-v12-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linea-51732-v12-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linea-51732-v12-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linea-51732-v12-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linea-51732-v12-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linea-51732-v12-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linea-51732-v12-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linea-51732-v12-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linea-51732-v12-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linea-51732-v12-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-4d70-fd43-linea-51732-v12-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linea-51732-v12-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/.gitattributes
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/generation_config.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/special_tokens_map.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/recipe.yaml
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/model.safetensors.index.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/README.md
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/config.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/chat_template.jinja
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/tokenizer_config.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/tokenizer.json
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/model-00003-of-00003.safetensors
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/model-00002-of-00003.safetensors
chaiml-4d70-fd43-linea-51732-v12-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linea-51732-v12/default/model-00001-of-00003.safetensors
Job chaiml-4d70-fd43-linea-51732-v12-uploader completed after 42.09s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linea-51732-v12-uploader
Pipeline stage VLLMUploader completed in 42.56s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.08s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.02s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linea-51732-v12
Waiting for inference service chaiml-4d70-fd43-linea-51732-v12 to be ready
2026-03-28T16:10:56.280530+00:00 monitor updated for chaiml-4d70-fd43-linea_51732_v12
Failed to get response for submission chaiml-gspo-glm47-cas72_44260_v1: ('http://chaiml-gspo-glm47-cas72-44260-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
Failed to get response for submission chaiml-glm-47-bobo-v1-s_16089_v2: ('http://chaiml-glm-47-bobo-v1-s-16089-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T16:11:56.372018+00:00 monitor updated for chaiml-4d70-fd43-linea_51732_v12
2026-03-28T16:12:56.459740+00:00 monitor updated for chaiml-4d70-fd43-linea_51732_v12
Inference service chaiml-4d70-fd43-linea-51732-v12 ready after 170.58676981925964s
Pipeline stage VLLMDeployer completed in 171.12s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.5044076442718506s
Received healthy response to inference request in 3.927212715148926s
Received healthy response to inference request in 1.927926778793335s
Received healthy response to inference request in 3.4614949226379395s
Received healthy response to inference request in 1.846524953842163s
Received healthy response to inference request in 3.5266847610473633s
Received healthy response to inference request in 1.968592882156372s
2026-03-28T16:13:56.551019+00:00 monitor updated for chaiml-4d70-fd43-linea_51732_v12
Received healthy response to inference request in 3.521521806716919s
Received healthy response to inference request in 1.9931280612945557s
Received healthy response to inference request in 1.9233999252319336s
Received healthy response to inference request in 1.834101915359497s
Received healthy response to inference request in 1.840956211090088s
Received healthy response to inference request in 1.869457721710205s
Received healthy response to inference request in 1.9186758995056152s
Received healthy response to inference request in 1.9592390060424805s
Received healthy response to inference request in 2.0249063968658447s
Received healthy response to inference request in 1.9897127151489258s
Received healthy response to inference request in 1.8712527751922607s
Received healthy response to inference request in 1.8335351943969727s
Received healthy response to inference request in 1.7867541313171387s
Received healthy response to inference request in 1.8399391174316406s
Received healthy response to inference request in 2.049837827682495s
Received healthy response to inference request in 1.9772391319274902s
Received healthy response to inference request in 1.8298330307006836s
Received healthy response to inference request in 1.8603026866912842s
Received healthy response to inference request in 2.061964511871338s
Received healthy response to inference request in 1.8473403453826904s
Received healthy response to inference request in 1.9114439487457275s
Received healthy response to inference request in 1.8400723934173584s
Received healthy response to inference request in 1.9168918132781982s
30 requests
0 failed requests
5th percentile: 1.8314990043640136
10th percentile: 1.8340452432632446
20th percentile: 1.840779447555542
30th percentile: 1.856413984298706
40th percentile: 1.895367479324341
50th percentile: 1.9210379123687744
60th percentile: 1.9629805564880372
70th percentile: 1.9907373189926147
80th percentile: 2.0522631645202636
90th percentile: 3.5061190605163572
95th percentile: 3.5243614315986633
99th percentile: 3.8110596084594732
mean time: 2.188811707496643
Pipeline stage StressChecker completed in 68.58s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.93s
Shutdown handler de-registered
chaiml-4d70-fd43-linea_51732_v12 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linea_51732_v12 status is now inactive due to auto deactivation removed underperforming models
chaiml-4d70-fd43-linea_51732_v12 status is now torndown due to DeploymentManager action