Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-kasey-mean-older-8744-v1-uploader
Waiting for job on chaiml-kasey-mean-older-8744-v1-uploader to finish
chaiml-kasey-mean-older-8744-v1-uploader: Using quantization_mode: fp8
chaiml-kasey-mean-older-8744-v1-uploader: Checking if ChaiML/Kasey-Mean-older-cousin_Hannah-3_Your-260312041305_sft-FP8 already exists in ChaiML
chaiml-kasey-mean-older-8744-v1-uploader: Downloaded in 136.902s
chaiml-kasey-mean-older-8744-v1-uploader: Loading /tmp/model_input...
chaiml-kasey-mean-older-8744-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-kasey-mean-older-8744-v1-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-kasey-mean-older-8744-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-kasey-mean-older-8744-v1-uploader: Applying quantization...
chaiml-kasey-mean-older-8744-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:01.351005-0700 | reset | INFO - Compression lifecycle reset
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:01.351946-0700 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:01.431145-0700 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:01.431414-0700 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-kasey-mean-older-8744-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:31.050585-0700 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:33.142456-0700 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-kasey-mean-older-8744-v1-uploader: Saving to /dev/shm/model_output...
chaiml-kasey-mean-older-8744-v1-uploader: 2026-03-11T21:30:33.168806-0700 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
chaiml-kasey-mean-older-8744-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-kasey-mean-older-8744-v1-uploader: Pushing to ChaiML/Kasey-Mean-older-cousin_Hannah-3_Your-260312041305_sft-FP8
chaiml-kasey-mean-older-8744-v1-uploader: Checking if ChaiML/Kasey-Mean-older-cousin_Hannah-3_Your-260312041305_sft-FP8 already exists in ChaiML
chaiml-kasey-mean-older-8744-v1-uploader: Creating repo ChaiML/Kasey-Mean-older-cousin_Hannah-3_Your-260312041305_sft-FP8 and uploading /dev/shm/model_output to it
chaiml-kasey-mean-older-8744-v1-uploader: ---------- 2026-03-11 21:31:20 (0:00:00) ----------
chaiml-kasey-mean-older-8744-v1-uploader: Files: hashed 4/13 (274.2K/24.9G) | pre-uploaded: 0/0 (0.0/24.9G) (+13 unsure) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-kasey-mean-older-8744-v1-uploader: Workers: hashing: 13 | get upload mode: 0 | pre-uploading: 0 | committing: 0 | waiting: 113
chaiml-kasey-mean-older-8744-v1-uploader: ---------------------------------------------------
chaiml-kasey-mean-older-8744-v1-uploader:
chaiml-kasey-mean-older-8744-v1-uploader: ---------- 2026-03-11 21:32:20 (0:01:00) ----------
chaiml-kasey-mean-older-8744-v1-uploader: Files: hashed 13/13 (24.9G/24.9G) | pre-uploaded: 7/7 (24.9G/24.9G) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-kasey-mean-older-8744-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-kasey-mean-older-8744-v1-uploader: ---------------------------------------------------
chaiml-kasey-mean-older-8744-v1-uploader: Processed model ChaiML/Kasey-Mean-older-cousin_Hannah-3_Your-260312041305_sft in 297.275s
chaiml-kasey-mean-older-8744-v1-uploader: creating bucket guanaco-vllm-models
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-kasey-mean-older-8744-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-kasey-mean-older-8744-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-kasey-mean-older-8744-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-kasey-mean-older-8744-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-kasey-mean-older-8744-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-kasey-mean-older-8744-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-kasey-mean-older-8744-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-kasey-mean-older-8744-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-kasey-mean-older-8744-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-kasey-mean-older-8744-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-kasey-mean-older-8744-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/config.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/generation_config.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/tokenizer_config.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/model.safetensors.index.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/special_tokens_map.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/recipe.yaml
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/tokenizer.json
chaiml-kasey-mean-older-8744-v1-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-kasey-mean-older-8744-v1/default/model-00006-of-00006.safetensors
Job chaiml-kasey-mean-older-8744-v1-uploader completed after 367.08s with status: succeeded
Stopping job with name chaiml-kasey-mean-older-8744-v1-uploader
Pipeline stage VLLMUploader completed in 372.66s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.72s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-kasey-mean-older-8744-v1
Waiting for inference service chaiml-kasey-mean-older-8744-v1 to be ready
Inference service chaiml-kasey-mean-older-8744-v1 ready after 150.60827732086182s
Pipeline stage VLLMDeployer completed in 152.92s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.413349390029907s
Received healthy response to inference request in 2.8922834396362305s
Received healthy response to inference request in 2.9584710597991943s
Received healthy response to inference request in 3.1465165615081787s
Received healthy response to inference request in 2.9144253730773926s
Received healthy response to inference request in 3.0363147258758545s
Received healthy response to inference request in 3.1104390621185303s
Received healthy response to inference request in 2.8548343181610107s
Received healthy response to inference request in 3.3341259956359863s
Received healthy response to inference request in 3.2650811672210693s
Received healthy response to inference request in 2.749911069869995s
Received healthy response to inference request in 3.728928327560425s
Received healthy response to inference request in 2.697676420211792s
Received healthy response to inference request in 2.687761068344116s
Received healthy response to inference request in 2.727928638458252s
Received healthy response to inference request in 2.695791006088257s
Received healthy response to inference request in 3.371997833251953s
Received healthy response to inference request in 3.0017356872558594s
Received healthy response to inference request in 2.7023894786834717s
Received healthy response to inference request in 2.6737117767333984s
Received healthy response to inference request in 3.1247925758361816s
Received healthy response to inference request in 2.7862439155578613s
Received healthy response to inference request in 3.1343374252319336s
Received healthy response to inference request in 3.059593915939331s
Received healthy response to inference request in 3.2670674324035645s
Received healthy response to inference request in 3.0371222496032715s
Received healthy response to inference request in 2.9969704151153564s
Received healthy response to inference request in 2.978567600250244s
Received healthy response to inference request in 2.7711727619171143s
Received healthy response to inference request in 3.3164021968841553s
30 requests
0 failed requests
5th percentile: 2.6913745403289795
10th percentile: 2.6974878787994383
20th percentile: 2.7455145835876467
30th percentile: 2.8342571973800657
40th percentile: 2.9408527851104735
50th percentile: 2.999353051185608
60th percentile: 3.046110916137695
70th percentile: 3.127656030654907
80th percentile: 3.265478420257568
90th percentile: 3.337913179397583
95th percentile: 3.5683096051216117
99th percentile: 4.214867281913758
mean time: 3.0478647629419964
Pipeline stage StressChecker completed in 117.66s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.63s
Shutdown handler de-registered
chaiml-kasey-mean-older-_8744_v1 status is now deployed due to DeploymentManager action
chaiml-kasey-mean-older-_8744_v1 status is now inactive due to auto deactivation removed underperforming models