Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-top2-9386-v8-uploader
Waiting for job on chaiml-pony-d3b-mv1-top2-9386-v8-uploader to finish
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Checking if ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8...
2026-03-28T17:15:16.856349+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Downloaded in 35.960s
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Processed model ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8 in 38.741s
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/config.json
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/generation_config.json
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/recipe.yaml
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/.gitattributes
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/chat_template.jinja
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/tokenizer_config.json
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/tokenizer.json
2026-03-28T17:16:16.962665+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
chaiml-pony-d3b-mv1-top2-9386-v8-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v8/default/model.safetensors
Job chaiml-pony-d3b-mv1-top2-9386-v8-uploader completed after 153.45s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-top2-9386-v8-uploader
Pipeline stage VLLMUploader completed in 153.92s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.68s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-top2-9386-v8
Waiting for inference service chaiml-pony-d3b-mv1-top2-9386-v8 to be ready
2026-03-28T17:17:17.054808+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
2026-03-28T17:18:17.155121+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
2026-03-28T17:19:17.250756+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
2026-03-28T17:20:17.347362+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
Inference service chaiml-pony-d3b-mv1-top2-9386-v8 ready after 230.49020338058472s
Pipeline stage VLLMDeployer completed in 230.91s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T17:21:17.442413+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.24032187461853s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T17:22:17.560119+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 6.006056547164917s
Received healthy response to inference request in 5.87287163734436s
Received healthy response to inference request in 2.599802017211914s
Received healthy response to inference request in 2.5553090572357178s
Received healthy response to inference request in 2.5738143920898438s
Received healthy response to inference request in 2.3044910430908203s
Received healthy response to inference request in 2.1409213542938232s
2026-03-28T17:23:17.711323+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
Received healthy response to inference request in 9.467904806137085s
Received healthy response to inference request in 2.090278387069702s
Received healthy response to inference request in 2.1141371726989746s
Received healthy response to inference request in 2.773071527481079s
Received healthy response to inference request in 2.1808671951293945s
Received healthy response to inference request in 2.131704807281494s
Received healthy response to inference request in 2.5496323108673096s
Received healthy response to inference request in 2.149916648864746s
Received healthy response to inference request in 2.105285882949829s
Received healthy response to inference request in 2.4698567390441895s
Received healthy response to inference request in 2.125053644180298s
Received healthy response to inference request in 2.5307979583740234s
Received healthy response to inference request in 2.0943238735198975s
Received healthy response to inference request in 2.1928207874298096s
Received healthy response to inference request in 2.122260332107544s
Received healthy response to inference request in 2.1265063285827637s
30 requests
6 failed requests
5th percentile: 2.0992567777633666
10th percentile: 2.11325204372406
20th percentile: 2.1262157917022706
30th percentile: 2.1472180604934694
40th percentile: 2.259822940826416
50th percentile: 2.5402151346206665
60th percentile: 2.584209442138672
70th percentile: 5.430086803436278
80th percentile: 11.59584960937503
90th percentile: 20.149348974227905
95th percentile: 20.169587278366087
99th percentile: 20.366204607486726
mean time: 6.4561194260915125
%s, retrying in %s seconds...
Received healthy response to inference request in 2.032684326171875s
Received healthy response to inference request in 2.137258529663086s
Received healthy response to inference request in 2.0809261798858643s
Received healthy response to inference request in 2.0726170539855957s
Received healthy response to inference request in 2.0915491580963135s
Received healthy response to inference request in 2.1291654109954834s
Received healthy response to inference request in 2.132983684539795s
Received healthy response to inference request in 2.1099166870117188s
2026-03-28T17:24:18.016466+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v8
Received healthy response to inference request in 2.0887792110443115s
Received healthy response to inference request in 2.1089932918548584s
Received healthy response to inference request in 2.083789587020874s
Received healthy response to inference request in 2.1263041496276855s
Received healthy response to inference request in 2.1018800735473633s
Received healthy response to inference request in 2.1497316360473633s
Received healthy response to inference request in 2.1310067176818848s
Received healthy response to inference request in 2.100039005279541s
Received healthy response to inference request in 2.082974672317505s
Received healthy response to inference request in 2.0684547424316406s
Received healthy response to inference request in 2.3509740829467773s
Received healthy response to inference request in 2.2428812980651855s
Received healthy response to inference request in 2.112427234649658s
Received healthy response to inference request in 2.067904472351074s
Received healthy response to inference request in 2.086527109146118s
Received healthy response to inference request in 2.5785670280456543s
Received healthy response to inference request in 2.214662551879883s
Received healthy response to inference request in 2.1059505939483643s
Received healthy response to inference request in 2.125525951385498s
Received healthy response to inference request in 2.3487281799316406s
Received healthy response to inference request in 2.2600252628326416s
Received healthy response to inference request in 2.1330673694610596s
30 requests
0 failed requests
5th percentile: 2.068152093887329
10th percentile: 2.0722008228302
20th percentile: 2.0836266040802003
30th percentile: 2.090718173980713
40th percentile: 2.104322385787964
50th percentile: 2.1111719608306885
60th percentile: 2.1274486541748048
70th percentile: 2.1330087900161745
80th percentile: 2.162717819213867
90th percentile: 2.2688955545425418
95th percentile: 2.349963426589966
99th percentile: 2.51256507396698
mean time: 2.1485431750615436
Pipeline stage StressChecker completed in 263.61s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.32s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-top2_9386_v8 status is now deployed due to DeploymentManager action