Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-22882-v8-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-22882-v8-uploader to finish
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8...
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Downloaded in 37.019s
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8 in 39.973s
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/.gitattributes
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/config.json
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/generation_config.json
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/tokenizer.json
2026-03-31T05:00:27.150628+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
chaiml-pony-v3-q27b-lr5-22882-v8-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v8/default/model.safetensors
2026-03-31T05:01:27.411041+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
Job chaiml-pony-v3-q27b-lr5-22882-v8-uploader completed after 125.55s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-22882-v8-uploader
Pipeline stage VLLMUploader completed in 126.50s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.36s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-22882-v8
Waiting for inference service chaiml-pony-v3-q27b-lr5-22882-v8 to be ready
2026-03-31T05:02:27.571758+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
2026-03-31T05:03:27.743971+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
2026-03-31T05:04:27.955893+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
Inference service chaiml-pony-v3-q27b-lr5-22882-v8 ready after 181.8572907447815s
Pipeline stage VLLMDeployer completed in 182.84s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:05:28.106068+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.556572914123535s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.8409228324890137s
2026-03-31T05:06:30.398524+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 6.079694509506226s
Received healthy response to inference request in 3.7688252925872803s
Received healthy response to inference request in 1.8765232563018799s
Received healthy response to inference request in 2.2780568599700928s
Received healthy response to inference request in 2.0829482078552246s
Received healthy response to inference request in 1.9303343296051025s
Received healthy response to inference request in 1.9394314289093018s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.465165376663208s
Received healthy response to inference request in 2.071739912033081s
Received healthy response to inference request in 1.9120779037475586s
Received healthy response to inference request in 2.044374942779541s
Received healthy response to inference request in 1.981801986694336s
2026-03-31T05:07:30.514964+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.157146453857422s
Received healthy response to inference request in 18.722506284713745s
Received healthy response to inference request in 2.298553705215454s
Received healthy response to inference request in 2.0074939727783203s
Received healthy response to inference request in 2.0654947757720947s
Received healthy response to inference request in 2.0354740619659424s
Received healthy response to inference request in 1.997694730758667s
Received healthy response to inference request in 2.212282419204712s
Received healthy response to inference request in 2.113781452178955s
30 requests
7 failed requests
5th percentile: 1.8925228476524354
10th percentile: 1.9285086870193482
20th percentile: 1.9945161819458008
30th percentile: 2.0417046785354613
40th percentile: 2.078464889526367
50th percentile: 2.184714436531067
60th percentile: 2.365198373794555
70th percentile: 5.013509392738338
80th percentile: 20.125200510025024
90th percentile: 20.23435173034668
95th percentile: 20.390446841716766
99th percentile: 27.92771390438081
mean time: 7.494718837738037
%s, retrying in %s seconds...
Received healthy response to inference request in 1.7464401721954346s
Received healthy response to inference request in 2.1616568565368652s
2026-03-31T05:08:30.731426+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
Received healthy response to inference request in 1.850111961364746s
Received healthy response to inference request in 1.779249906539917s
Received healthy response to inference request in 1.7905418872833252s
Received healthy response to inference request in 1.8808224201202393s
Received healthy response to inference request in 2.2826099395751953s
Received healthy response to inference request in 1.8765316009521484s
Received healthy response to inference request in 2.6007063388824463s
Received healthy response to inference request in 2.4064838886260986s
Received healthy response to inference request in 1.8355042934417725s
Received healthy response to inference request in 2.121675491333008s
Received healthy response to inference request in 1.9122178554534912s
Received healthy response to inference request in 1.8621816635131836s
Received healthy response to inference request in 2.0184214115142822s
Received healthy response to inference request in 2.0293312072753906s
Received healthy response to inference request in 1.9296555519104004s
Received healthy response to inference request in 1.8238167762756348s
Received healthy response to inference request in 2.015235185623169s
Received healthy response to inference request in 1.9522852897644043s
Received healthy response to inference request in 2.1491949558258057s
Received healthy response to inference request in 2.223510503768921s
Received healthy response to inference request in 2.0601515769958496s
Received healthy response to inference request in 1.8867945671081543s
Received healthy response to inference request in 2.1923770904541016s
Received healthy response to inference request in 2.0033981800079346s
Received healthy response to inference request in 1.9900197982788086s
Received healthy response to inference request in 2.0010764598846436s
Received healthy response to inference request in 2.23833966255188s
Received healthy response to inference request in 2.203970193862915s
30 requests
0 failed requests
5th percentile: 1.7843312978744508
10th percentile: 1.8204892873764038
20th percentile: 1.8597677230834961
30th percentile: 1.8850029230117797
40th percentile: 1.9432333946228029
50th percentile: 2.002237319946289
60th percentile: 2.0227853298187255
70th percentile: 2.129931330680847
80th percentile: 2.1946957111358643
90th percentile: 2.2427666902542116
95th percentile: 2.3507406115531917
99th percentile: 2.5443818283081057
mean time: 2.0274770895640057
Pipeline stage StressChecker completed in 291.62s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
2026-03-31T05:09:30.846818+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v8
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.79s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_22882_v8 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_22882_v8 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_22882_v8 status is now torndown due to DeploymentManager action