Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-dragon-hunter-ar-26059-v1-uploader
Waiting for job on chaiml-dragon-hunter-ar-26059-v1-uploader to finish
chaiml-dragon-hunter-ar-26059-v1-uploader: Using quantization_mode: fp8
chaiml-dragon-hunter-ar-26059-v1-uploader: Downloaded in 83.539s
chaiml-dragon-hunter-ar-26059-v1-uploader: Loading /tmp/model_input...
chaiml-dragon-hunter-ar-26059-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-dragon-hunter-ar-26059-v1-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-dragon-hunter-ar-26059-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-dragon-hunter-ar-26059-v1-uploader: Applying quantization...
chaiml-dragon-hunter-ar-26059-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:20.576743-0700 | reset | INFO - Compression lifecycle reset
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:20.577815-0700 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:20.651796-0700 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:20.652077-0700 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-dragon-hunter-ar-26059-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:49.850172-0700 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:51.927620-0700 | post_process | WARNING - Optimized model is not saved. To save, please provide `output_dir` as input arg. Ex. `oneshot(..., output_dir=...)`
chaiml-dragon-hunter-ar-26059-v1-uploader: Saving to /dev/shm/model_output...
chaiml-dragon-hunter-ar-26059-v1-uploader: 2026-03-09T18:22:51.954049-0700 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
chaiml-dragon-hunter-ar-26059-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-dragon-hunter-ar-26059-v1-uploader: Pushing to ChaiML/Dragon-Hunter_Arthur-Dragon-hunter-BL260310010641_sft-FP8
chaiml-dragon-hunter-ar-26059-v1-uploader: Checking if ChaiML/Dragon-Hunter_Arthur-Dragon-hunter-BL260310010641_sft-FP8 already exists in ChaiML
chaiml-dragon-hunter-ar-26059-v1-uploader: Creating repo ChaiML/Dragon-Hunter_Arthur-Dragon-hunter-BL260310010641_sft-FP8 and uploading /dev/shm/model_output to it
chaiml-dragon-hunter-ar-26059-v1-uploader: ---------- 2026-03-09 18:23:38 (0:00:00) ----------
chaiml-dragon-hunter-ar-26059-v1-uploader: Files: hashed 6/13 (276.1K/24.9G) | pre-uploaded: 0/0 (0.0/24.9G) (+13 unsure) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-dragon-hunter-ar-26059-v1-uploader: Workers: hashing: 7 | get upload mode: 4 | pre-uploading: 0 | committing: 0 | waiting: 115
chaiml-dragon-hunter-ar-26059-v1-uploader: ---------------------------------------------------
chaiml-dragon-hunter-ar-26059-v1-uploader:
chaiml-dragon-hunter-ar-26059-v1-uploader: ---------- 2026-03-09 18:24:38 (0:01:00) ----------
chaiml-dragon-hunter-ar-26059-v1-uploader: Files: hashed 13/13 (24.9G/24.9G) | pre-uploaded: 7/7 (24.9G/24.9G) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-dragon-hunter-ar-26059-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-dragon-hunter-ar-26059-v1-uploader: ---------------------------------------------------
chaiml-dragon-hunter-ar-26059-v1-uploader: Processed model ChaiML/Dragon-Hunter_Arthur-Dragon-hunter-BL260310010641_sft in 245.785s
chaiml-dragon-hunter-ar-26059-v1-uploader: creating bucket guanaco-vllm-models
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-dragon-hunter-ar-26059-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-dragon-hunter-ar-26059-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-dragon-hunter-ar-26059-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-dragon-hunter-ar-26059-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-dragon-hunter-ar-26059-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-dragon-hunter-ar-26059-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-dragon-hunter-ar-26059-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-dragon-hunter-ar-26059-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-dragon-hunter-ar-26059-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-dragon-hunter-ar-26059-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-dragon-hunter-ar-26059-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/model.safetensors.index.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/recipe.yaml
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/config.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/tokenizer_config.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/generation_config.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/special_tokens_map.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/tokenizer.json
chaiml-dragon-hunter-ar-26059-v1-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-dragon-hunter-ar-26059-v1/default/model-00006-of-00006.safetensors
Job chaiml-dragon-hunter-ar-26059-v1-uploader completed after 349.37s with status: succeeded
Stopping job with name chaiml-dragon-hunter-ar-26059-v1-uploader
Pipeline stage VLLMUploader completed in 356.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-dragon-hunter-ar-26059-v1
Waiting for inference service chaiml-dragon-hunter-ar-26059-v1 to be ready
Inference service chaiml-dragon-hunter-ar-26059-v1 ready after 161.28379487991333s
Pipeline stage VLLMDeployer completed in 167.27s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 7.718207597732544s
Received healthy response to inference request in 3.2323172092437744s
Received healthy response to inference request in 7.796493291854858s
Received healthy response to inference request in 2.8128597736358643s
Received healthy response to inference request in 2.7205724716186523s
Received healthy response to inference request in 2.943082809448242s
Received healthy response to inference request in 2.790616750717163s
Received healthy response to inference request in 3.1850903034210205s
Received healthy response to inference request in 2.6878433227539062s
Received healthy response to inference request in 2.818591833114624s
Received healthy response to inference request in 2.7700934410095215s
Received healthy response to inference request in 3.3809969425201416s
Received healthy response to inference request in 2.804023265838623s
Received healthy response to inference request in 2.965604543685913s
Received healthy response to inference request in 3.7133936882019043s
Received healthy response to inference request in 2.800021171569824s
Received healthy response to inference request in 8.546771049499512s
Received healthy response to inference request in 2.735821485519409s
Received healthy response to inference request in 2.8451812267303467s
Received healthy response to inference request in 2.715272903442383s
Received healthy response to inference request in 2.6775238513946533s
Received healthy response to inference request in 2.709237813949585s
Received healthy response to inference request in 2.680840253829956s
Received healthy response to inference request in 2.9421374797821045s
Received healthy response to inference request in 2.7732136249542236s
Received healthy response to inference request in 2.7544915676116943s
Received healthy response to inference request in 2.745610237121582s
Received healthy response to inference request in 3.4029293060302734s
Received healthy response to inference request in 2.803107261657715s
Received healthy response to inference request in 2.793905735015869s
30 requests
0 failed requests
5th percentile: 2.6839916348457336
10th percentile: 2.707098364830017
20th percentile: 2.7327716827392576
30th percentile: 2.7654128789901735
40th percentile: 2.7925901412963867
50th percentile: 2.803565263748169
60th percentile: 2.829227590560913
70th percentile: 2.9498393297195435
80th percentile: 3.262053155899048
90th percentile: 4.113875079154974
95th percentile: 7.761264729499817
99th percentile: 8.329190499782563
mean time: 3.408861740430196
Pipeline stage StressChecker completed in 169.15s
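Note on the StressChecker summary above: the logged percentiles appear consistent with linear interpolation between sorted samples (the default behaviour of NumPy's percentile function), although the log does not say which method StressChecker actually uses. The following is a minimal sketch under that assumption; the names `latencies`, `percentile`, and `mean_latency` are hypothetical and not taken from the pipeline code.

```python
import math

# Latencies in seconds, copied from the 30 "Received healthy response"
# log lines of the StressChecker stage.
latencies = [
    7.718207597732544, 3.2323172092437744, 7.796493291854858,
    2.8128597736358643, 2.7205724716186523, 2.943082809448242,
    2.790616750717163, 3.1850903034210205, 2.6878433227539062,
    2.818591833114624, 2.7700934410095215, 3.3809969425201416,
    2.804023265838623, 2.965604543685913, 3.7133936882019043,
    2.800021171569824, 8.546771049499512, 2.735821485519409,
    2.8451812267303467, 2.715272903442383, 2.6775238513946533,
    2.709237813949585, 2.680840253829956, 2.9421374797821045,
    2.7732136249542236, 2.7544915676116943, 2.745610237121582,
    3.4029293060302734, 2.803107261657715, 2.793905735015869,
]


def percentile(samples, q):
    """Return the q-th percentile (0..100) using linear interpolation.

    Assumption: this mirrors the interpolation the stress checker uses;
    the log itself does not confirm the method.
    """
    ordered = sorted(samples)
    # Fractional position of the percentile within the sorted samples.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    # Interpolate between the two neighbouring samples.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def mean_latency(samples):
    """Arithmetic mean of the samples."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_latency(latencies)}")
```

Read this way, the median of about 2.80s reflects typical latency, while the 95th and 99th percentiles are dominated by the three requests that took between 7.7s and 8.5s.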
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 6.77s
Shutdown handler de-registered
chaiml-dragon-hunter-ar_26059_v1 status is now deployed due to DeploymentManager action
chaiml-dragon-hunter-ar_26059_v1 status is now inactive due to auto deactivation removed underperforming models