Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-ca18-c13f-linea-65808-v13-uploader
Waiting for job on chaiml-ca18-c13f-linea-65808-v13-uploader to finish
chaiml-ca18-c13f-linea-65808-v13-uploader: Using quantization_mode: fp8
chaiml-ca18-c13f-linea-65808-v13-uploader: Repo ChaiML/ca18-c13f-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-ca18-c13f-linea-65808-v13-uploader: Checking if ChaiML/ca18-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-ca18-c13f-linea-65808-v13-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-ca18-c13f-linea-65808-v13-uploader: Downloading snapshot of ChaiML/ca18-c13f-linear-w01-FP8...
chaiml-ca18-c13f-linea-65808-v13-uploader: Downloaded in 12.232s
chaiml-ca18-c13f-linea-65808-v13-uploader: Processed model ChaiML/ca18-c13f-linear-w01-FP8 in 15.666s
chaiml-ca18-c13f-linea-65808-v13-uploader: creating bucket guanaco-vllm-models
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v13-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-ca18-c13f-linea-65808-v13-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v13-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v13-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v13-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v13-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-ca18-c13f-linea-65808-v13-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-ca18-c13f-linea-65808-v13-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-ca18-c13f-linea-65808-v13-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-ca18-c13f-linea-65808-v13-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-ca18-c13f-linea-65808-v13-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/.gitattributes
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/chat_template.jinja
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/README.md
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/generation_config.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/config.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/special_tokens_map.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/recipe.yaml
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/model.safetensors.index.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/tokenizer_config.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/tokenizer.json
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/model-00003-of-00003.safetensors
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/model-00001-of-00003.safetensors
chaiml-ca18-c13f-linea-65808-v13-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v13/default/model-00002-of-00003.safetensors
Job chaiml-ca18-c13f-linea-65808-v13-uploader completed after 42.22s with status: succeeded
Stopping job with name chaiml-ca18-c13f-linea-65808-v13-uploader
Pipeline stage VLLMUploader completed in 42.68s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.74s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-ca18-c13f-linea-65808-v13
Waiting for inference service chaiml-ca18-c13f-linea-65808-v13 to be ready
2026-03-24T22:16:55.103027+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v13
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-ca18-c13f-linea-65808-v14-uploader
Waiting for job on chaiml-ca18-c13f-linea-65808-v14-uploader to finish
2026-03-24T22:17:55.201435+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v13
chaiml-ca18-c13f-linea-65808-v14-uploader: Using quantization_mode: fp8
chaiml-ca18-c13f-linea-65808-v14-uploader: Repo ChaiML/ca18-c13f-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-ca18-c13f-linea-65808-v14-uploader: Checking if ChaiML/ca18-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-ca18-c13f-linea-65808-v14-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-ca18-c13f-linea-65808-v14-uploader: Downloading snapshot of ChaiML/ca18-c13f-linear-w01-FP8...
chaiml-ca18-c13f-linea-65808-v14-uploader: Downloaded in 8.638s
chaiml-ca18-c13f-linea-65808-v14-uploader: Processed model ChaiML/ca18-c13f-linear-w01-FP8 in 12.168s
2026-03-24T22:18:17.806110+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v14
chaiml-ca18-c13f-linea-65808-v14-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v14/default/tokenizer.json
chaiml-ca18-c13f-linea-65808-v14-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v14/default/model-00003-of-00003.safetensors
chaiml-ca18-c13f-linea-65808-v14-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v14/default/model-00001-of-00003.safetensors
chaiml-ca18-c13f-linea-65808-v14-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v14/default/model-00002-of-00003.safetensors
Job chaiml-ca18-c13f-linea-65808-v14-uploader completed after 71.86s with status: succeeded
Stopping job with name chaiml-ca18-c13f-linea-65808-v14-uploader
Pipeline stage VLLMUploader completed in 72.61s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.40s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-ca18-c13f-linea-65808-v14
Waiting for inference service chaiml-ca18-c13f-linea-65808-v14 to be ready
2026-03-24T22:18:55.344674+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v13
2026-03-24T22:19:17.937648+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v14
Inference service chaiml-ca18-c13f-linea-65808-v13 ready after 160.4647274017334s
Pipeline stage VLLMDeployer completed in 161.05s
run pipeline stage %s
Running pipeline stage StressChecker
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Received healthy response to inference request in 6.7891621589660645s
Starting job with name chaiml-ca18-c13f-linea-65808-v15-uploader
Waiting for job on chaiml-ca18-c13f-linea-65808-v15-uploader to finish
Received healthy response to inference request in 6.687978506088257s
Received healthy response to inference request in 1.6805343627929688s
Received healthy response to inference request in 1.6832871437072754s
Received healthy response to inference request in 2.0921335220336914s
Received healthy response to inference request in 6.877558469772339s
Received healthy response to inference request in 6.689295768737793s
2026-03-24T22:19:55.507087+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v13
Received healthy response to inference request in 1.7398872375488281s
Received healthy response to inference request in 6.826912879943848s
Received healthy response to inference request in 1.6679656505584717s
Received healthy response to inference request in 1.6692829132080078s
Received healthy response to inference request in 2.015584707260132s
Received healthy response to inference request in 1.7354371547698975s
Received healthy response to inference request in 1.6668446063995361s
Received healthy response to inference request in 1.6695609092712402s
Received healthy response to inference request in 1.8819935321807861s
2026-03-24T22:20:18.081030+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v14
Received healthy response to inference request in 1.9306230545043945s
chaiml-ca18-c13f-linea-65808-v15-uploader: Using quantization_mode: fp8
Received healthy response to inference request in 1.6504061222076416s
Received healthy response to inference request in 1.7413370609283447s
Received healthy response to inference request in 1.7758409976959229s
Received healthy response to inference request in 1.699789047241211s
2026-03-24T22:20:27.483857+00:00 monitor updated for chaiml-ca18-c13f-linea_65808_v15
Received healthy response to inference request in 1.8164491653442383s
chaiml-ca18-c13f-linea-65808-v15-uploader: Repo ChaiML/ca18-c13f-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-ca18-c13f-linea-65808-v15-uploader: Checking if ChaiML/ca18-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-ca18-c13f-linea-65808-v15-uploader: Model already exists. Downloading to /dev/shm/model_output...
Received healthy response to inference request in 1.8130121231079102s
chaiml-ca18-c13f-linea-65808-v15-uploader: Downloading snapshot of ChaiML/ca18-c13f-linear-w01-FP8...
chaiml-ca18-c13f-linea-65808-v15-uploader: Downloaded in 6.403s
chaiml-ca18-c13f-linea-65808-v15-uploader: Processed model ChaiML/ca18-c13f-linear-w01-FP8 in 9.835s
chaiml-ca18-c13f-linea-65808-v15-uploader: creating bucket guanaco-vllm-models
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v15-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-ca18-c13f-linea-65808-v15-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v15-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
Received healthy response to inference request in 1.7033820152282715s
chaiml-ca18-c13f-linea-65808-v15-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v15-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linea-65808-v15-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-ca18-c13f-linea-65808-v15-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-ca18-c13f-linea-65808-v15-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-ca18-c13f-linea-65808-v15-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-ca18-c13f-linea-65808-v15-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-ca18-c13f-linea-65808-v15-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default
Received healthy response to inference request in 1.6764934062957764s
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/README.md
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/recipe.yaml
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/tokenizer_config.json
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/generation_config.json
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/model.safetensors.index.json
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/special_tokens_map.json
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/chat_template.jinja
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/config.json
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/.gitattributes
chaiml-ca18-c13f-linea-65808-v15-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linea-65808-v15/default/tokenizer.json
Received healthy response to inference request in 1.898702621459961s
Received healthy response to inference request in 1.6687097549438477s
Received healthy response to inference request in 1.6910984516143799s
Received healthy response to inference request in 1.6694676876068115s
Received healthy response to inference request in 1.7335939407348633s
30 requests
0 failed requests
5th percentile: 1.6673490762710572
10th percentile: 1.6686353445053101
20th percentile: 1.6695422649383544
30th percentile: 1.6824613094329834
40th percentile: 1.7019448280334473
50th percentile: 1.7376621961593628
60th percentile: 1.7907094478607177
70th percentile: 1.8870062589645384
80th percentile: 2.030894470214844
90th percentile: 6.69928240776062
95th percentile: 6.809925055503845
Job chaiml-ca18-c13f-linea-65808-v15-uploader completed after 77.12s with status: succeeded
99th percentile: 6.862871248722077
Stopping job with name chaiml-ca18-c13f-linea-65808-v15-uploader
mean time: 2.594744165738424
Pipeline stage VLLMUploader completed in 78.49s
Pipeline stage StressChecker completed in 85.38s
run pipeline stage %s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage VLLMTemplater completed in 0.55s
run pipeline stage %s
Running pipeline stage VLLMDeployer
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.26s
Creating inference service chaiml-ca18-c13f-linea-65808-v15
Shutdown handler de-registered
chaiml-ca18-c13f-linea_65808_v13 status is now deployed due to DeploymentManager action
Waiting for inference service chaiml-ca18-c13f-linea-65808-v15 to be ready
chaiml-ca18-c13f-linea_65808_v13 status is now inactive due to auto deactivation removed underperforming models
chaiml-ca18-c13f-linea_65808_v13 status is now torndown due to DeploymentManager action