Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v2-q27b-lr1-74562-v9-uploader
Waiting for job on chaiml-pony-v2-q27b-lr1-74562-v9-uploader to finish
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: Using quantization_mode: none
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: Downloading snapshot of ChaiML/pony-v2-q27b-lr1e4ep1r64g8...
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: Downloaded in 36.887s
2026-03-18T16:09:25.291156+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: Processed model ChaiML/pony-v2-q27b-lr1e4ep1r64g8 in 57.318s
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/training_args.bin s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/training_args.bin
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/config.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/special_tokens_map.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/.gitattributes
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/README.md
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/added_tokens.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/args.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/vocab.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/merges.txt
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/chat_template.jinja
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/preprocessor_config.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/generation_config.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/tokenizer_config.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/trainer_state.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/trainer_state.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/model.safetensors.index.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/processor_config.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/tokenizer.json
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/model-00002-of-00002.safetensors
2026-03-18T16:10:25.380924+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:11:25.487358+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:12:25.644360+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:13:25.740439+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:14:25.846227+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
chaiml-pony-v2-q27b-lr1-74562-v9-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q27b-lr1-74562-v9/default/model-00001-of-00002.safetensors
Job chaiml-pony-v2-q27b-lr1-74562-v9-uploader completed after 418.87s with status: succeeded
Stopping job with name chaiml-pony-v2-q27b-lr1-74562-v9-uploader
Pipeline stage VLLMUploader completed in 419.42s
run pipeline stage %s
Running pipeline stage VLLMTemplater
2026-03-18T16:15:25.972929+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
Pipeline stage VLLMTemplater completed in 2.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v2-q27b-lr1-74562-v9
Waiting for inference service chaiml-pony-v2-q27b-lr1-74562-v9 to be ready
2026-03-18T16:16:26.075437+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:17:26.173920+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
2026-03-18T16:18:26.269223+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
Inference service chaiml-pony-v2-q27b-lr1-74562-v9 ready after 200.6710274219513s
Pipeline stage VLLMDeployer completed in 201.18s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-18T16:19:26.367165+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.361525058746338s
Received healthy response to inference request in 2.4122281074523926s
Received healthy response to inference request in 8.994903326034546s
2026-03-18T16:20:26.470038+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
Received healthy response to inference request in 2.711092710494995s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.3831419944763184s
Received healthy response to inference request in 2.1689789295196533s
Received healthy response to inference request in 8.273313999176025s
Received healthy response to inference request in 2.664292573928833s
Received healthy response to inference request in 2.3125503063201904s
Received healthy response to inference request in 2.3893027305603027s
Received healthy response to inference request in 2.2278404235839844s
Received healthy response to inference request in 2.260765314102173s
Received healthy response to inference request in 2.4140751361846924s
Received healthy response to inference request in 2.3364787101745605s
Received healthy response to inference request in 2.2824575901031494s
Received healthy response to inference request in 4.432740211486816s
2026-03-18T16:21:26.577600+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
Received healthy response to inference request in 15.077169179916382s
Received healthy response to inference request in 2.4543166160583496s
Received healthy response to inference request in 2.6361215114593506s
Received healthy response to inference request in 2.336805820465088s
Received healthy response to inference request in 2.5228893756866455s
Received healthy response to inference request in 2.4647700786590576s
Received healthy response to inference request in 2.3465800285339355s
Received healthy response to inference request in 2.3768694400787354s
Received healthy response to inference request in 2.536310911178589s
30 requests
5 failed requests
5th percentile: 2.2426566243171693
10th percentile: 2.2802883625030517
20th percentile: 2.3367403984069823
30th percentile: 2.3812602281570436
40th percentile: 2.4133363246917723
50th percentile: 2.4938297271728516
60th percentile: 2.6473899364471434
70th percentile: 4.382889604568481
80th percentile: 10.21135649681093
90th percentile: 20.124230217933654
95th percentile: 20.13211408853531
99th percentile: 20.358243100643158
mean time: 6.344247682889303
%s, retrying in %s seconds...
Received healthy response to inference request in 2.160097360610962s
Received healthy response to inference request in 2.251086711883545s
Received healthy response to inference request in 2.1758460998535156s
Received healthy response to inference request in 2.2518999576568604s
Received healthy response to inference request in 2.2759177684783936s
Received healthy response to inference request in 2.1884357929229736s
Received healthy response to inference request in 2.6089260578155518s
Received healthy response to inference request in 2.190896511077881s
Received healthy response to inference request in 2.2382349967956543s
Received healthy response to inference request in 2.221909284591675s
Received healthy response to inference request in 2.364959716796875s
2026-03-18T16:22:27.113392+00:00 monitor updated for chaiml-pony-v2-q27b-lr1_74562_v9
Received healthy response to inference request in 2.2545278072357178s
Received healthy response to inference request in 2.267869234085083s
Received healthy response to inference request in 2.3169984817504883s
Received healthy response to inference request in 2.2535760402679443s
Received healthy response to inference request in 2.2456696033477783s
Received healthy response to inference request in 2.23112416267395s
Received healthy response to inference request in 2.401076078414917s
Received healthy response to inference request in 2.335937976837158s
Received healthy response to inference request in 2.2435450553894043s
Received healthy response to inference request in 2.2841196060180664s
Received healthy response to inference request in 2.4201090335845947s
Received healthy response to inference request in 2.358952283859253s
Received healthy response to inference request in 2.326260805130005s
Received healthy response to inference request in 2.6889586448669434s
Received healthy response to inference request in 2.6050992012023926s
Received healthy response to inference request in 2.31626033782959s
Received healthy response to inference request in 2.408428192138672s
Received healthy response to inference request in 2.3282458782196045s
Received healthy response to inference request in 2.337534189224243s
30 requests
0 failed requests
5th percentile: 2.1815114617347717
10th percentile: 2.1906504392623902
20th percentile: 2.2368128299713135
30th percentile: 2.2494615793228148
40th percentile: 2.254147100448608
50th percentile: 2.28001868724823
60th percentile: 2.320703411102295
70th percentile: 2.3364168405532837
80th percentile: 2.3721829891204833
90th percentile: 2.438608050346375
95th percentile: 2.6072039723396303
99th percentile: 2.6657491946220397
mean time: 2.3184167623519896
Pipeline stage StressChecker completed in 265.57s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.84s
Shutdown handler de-registered
chaiml-pony-v2-q27b-lr1_74562_v9 status is now deployed due to DeploymentManager action