Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-98p-2ff-chaiml-m-25525-v2-uploader
Waiting for job on chaiml-98p-2ff-chaiml-m-25525-v2-uploader to finish
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Using quantization_mode: fp8
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Checking if ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged-FP8 already exists in ChaiML
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Downloading snapshot of ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged...
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Downloaded in 40.602s
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Loading /tmp/model_input...
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Applying quantization...
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:52:45.445317-0800 | reset | INFO - Compression lifecycle reset
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:52:45.446258-0800 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:52:45.536035-0800 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:52:45.536309-0800 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:53:28.306789-0800 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:53:30.557341-0800 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Saving to /dev/shm/model_output...
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: 2026-02-19T17:53:30.584830-0800 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Pushing to ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged-FP8
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Checking if ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged-FP8 already exists in ChaiML
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Creating repo ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged-FP8 and uploading /dev/shm/model_output to it
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: ---------- 2026-02-19 17:54:24 (0:00:00) ----------
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Files: hashed 7/13 (17.4M/27.6G) | pre-uploaded: 0/0 (0.0/27.6G) (+13 unsure) | committed: 0/13 (0.0/27.6G) | ignored: 0
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Workers: hashing: 7 | get upload mode: 4 | pre-uploading: 0 | committing: 0 | waiting: 115
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: ---------------------------------------------------
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: ---------- 2026-02-19 17:55:24 (0:01:00) ----------
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Files: hashed 13/13 (27.6G/27.6G) | pre-uploaded: 7/7 (27.6G/27.6G) | committed: 0/13 (0.0/27.6G) | ignored: 0
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: ---------------------------------------------------
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Processed model ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1872_v2_merged in 221.041s
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: creating bucket guanaco-vllm-models
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model.safetensors.index.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/special_tokens_map.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/generation_config.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/recipe.yaml
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/tokenizer_config.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/config.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/tokenizer.json
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model-00005-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model-00001-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model-00004-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model-00002-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-25525-v2-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-25525-v2/default/model-00003-of-00006.safetensors
Job chaiml-98p-2ff-chaiml-m-25525-v2-uploader completed after 278.14s with status: succeeded
Stopping job with name chaiml-98p-2ff-chaiml-m-25525-v2-uploader
Pipeline stage VLLMUploader completed in 278.71s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-98p-2ff-chaiml-m-25525-v2
Waiting for inference service chaiml-98p-2ff-chaiml-m-25525-v2 to be ready
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-98p-2ff-chaiml-m-25525-v2 ready after 342.50729155540466s
Pipeline stage VLLMDeployer completed in 343.22s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.2240967750549316s
Received healthy response to inference request in 1.15047025680542s
Received healthy response to inference request in 1.187574863433838s
Received healthy response to inference request in 1.2564597129821777s
Received healthy response to inference request in 2.263227701187134s
Received healthy response to inference request in 1.1817569732666016s
Received healthy response to inference request in 1.4277153015136719s
Received healthy response to inference request in 1.5272538661956787s
Received healthy response to inference request in 1.2842235565185547s
Received healthy response to inference request in 1.3777563571929932s
Received healthy response to inference request in 1.1642162799835205s
Received healthy response to inference request in 1.1504290103912354s
Received healthy response to inference request in 1.223132610321045s
Received healthy response to inference request in 1.3106296062469482s
Received healthy response to inference request in 1.456758737564087s
Received healthy response to inference request in 1.2920737266540527s
Received healthy response to inference request in 1.408151388168335s
Received healthy response to inference request in 1.1712257862091064s
Received healthy response to inference request in 1.2115023136138916s
Received healthy response to inference request in 1.1704745292663574s
Received healthy response to inference request in 1.1747725009918213s
Received healthy response to inference request in 1.3412792682647705s
Received healthy response to inference request in 1.159982442855835s
Received healthy response to inference request in 1.3086366653442383s
Received healthy response to inference request in 1.196669101715088s
Received healthy response to inference request in 1.277364730834961s
Received healthy response to inference request in 1.34309720993042s
Received healthy response to inference request in 1.1472420692443848s
Received healthy response to inference request in 1.208381175994873s
Received healthy response to inference request in 1.1538722515106201s
30 requests
0 failed requests
5th percentile: 1.1504475712776183
10th percentile: 1.1535320520401
20th percentile: 1.16922287940979
30th percentile: 1.1796616315841675
40th percentile: 1.203696346282959
50th percentile: 1.2236146926879883
60th percentile: 1.2801082611083985
70th percentile: 1.3092345476150513
80th percentile: 1.3500290393829346
90th percentile: 1.4306196451187134
95th percentile: 1.4955310583114623
99th percentile: 2.0497952890396123
mean time: 1.291680892308553
Pipeline stage StressChecker completed in 44.18s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.71s
Shutdown handler de-registered
chaiml-98p-2ff-chaiml-m_25525_v2 status is now deployed due to DeploymentManager action
chaiml-98p-2ff-chaiml-m_25525_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-98p-2ff-chaiml-m_25525_v2 status is now torndown due to DeploymentManager action