Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage VLLMUploader
Starting job with name mistralai-mistral-smal-88026-v70-uploader
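The truncated job name above ("mistralai-mistral-smal") suggests the base model id is sanitized and clipped before the submission/version/stage suffixes are appended. A minimal sketch of that naming step, assuming a 22-character budget for the base slug (the budget and the helper name are guesses inferred from the observed truncation, not the pipeline's actual code):

```python
# Hypothetical sketch of the job-name derivation seen in this log.
# BASE_BUDGET is an assumption inferred from the observed truncation
# ("mistralai/Mistral-Small-24B-Base-2501" -> "mistralai-mistral-smal").
BASE_BUDGET = 22  # assumed character budget for the sanitized model slug

def job_name(model_id: str, submission_id: int, version: int, stage: str) -> str:
    # Lower-case, turn path separators/underscores into hyphens, then clip.
    slug = model_id.lower().replace("/", "-").replace("_", "-")[:BASE_BUDGET]
    return f"{slug}-{submission_id}-v{version}-{stage}"
```

With the ids from this log, the helper reproduces the job name used throughout the run.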
Waiting for job on mistralai-mistral-smal-88026-v70-uploader to finish
mistralai-mistral-smal-88026-v70-uploader: Using quantization_mode: fp8
mistralai-mistral-smal-88026-v70-uploader: Checking if ChaiML/Mistral-Small-24B-Base-2501-FP8 already exists in ChaiML
mistralai-mistral-smal-88026-v70-uploader: Downloading snapshot of mistralai/Mistral-Small-24B-Base-2501...
2026-04-13T16:46:05.820949+00:00 monitor updated for mistralai-mistral-smal_88026_v70
mistralai-mistral-smal-88026-v70-uploader: Downloaded in 48.257s
mistralai-mistral-smal-88026-v70-uploader: Loading /tmp/model_input...
mistralai-mistral-smal-88026-v70-uploader: The tokenizer you are loading from '/tmp/model_input' has an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. Set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
mistralai-mistral-smal-88026-v70-uploader: Applying quantization...
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:05.4085 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:06.2597 | reset | INFO - Compression lifecycle reset
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:06.2635 | from_modifiers | INFO - Creating recipe from modifiers
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:06.2925 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:06.2927 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:06.3005 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:13.1696 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
mistralai-mistral-smal-88026-v70-uploader: 2026-04-13T16:46:13.1697 | post_process | WARNING - Optimized model is not saved. To save, please provide `output_dir` as an input arg. Ex. `oneshot(..., output_dir=...)`
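The `QuantizationModifier` / `DataFreePipeline` lines above, together with the `recipe.yaml` uploaded later in this run, match llm-compressor's recipe format. A hedged sketch of what that recipe likely contains for an FP8 data-free pass (the field values are assumptions, not read from the actual artifact):

```yaml
# Hypothetical recipe.yaml for an FP8 data-free quantization pass;
# targets/ignore/scheme are assumed, not taken from the real file.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]
      ignore: ["lm_head"]
      scheme: FP8_DYNAMIC
```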
mistralai-mistral-smal-88026-v70-uploader: Saving to /tmp/model_output...
mistralai-mistral-smal-88026-v70-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
mistralai-mistral-smal-88026-v70-uploader: warnings.warn(
mistralai-mistral-smal-88026-v70-uploader: Updating config in /tmp/model_output
mistralai-mistral-smal-88026-v70-uploader: Pushing to ChaiML/Mistral-Small-24B-Base-2501-FP8
mistralai-mistral-smal-88026-v70-uploader: Checking if ChaiML/Mistral-Small-24B-Base-2501-FP8 already exists in ChaiML
mistralai-mistral-smal-88026-v70-uploader: Creating repo ChaiML/Mistral-Small-24B-Base-2501-FP8 and uploading /tmp/model_output to it
mistralai-mistral-smal-88026-v70-uploader: Found 1 files larger than 20GB (recommended limit):
mistralai-mistral-smal-88026-v70-uploader: - model.safetensors: 24.9GB
mistralai-mistral-smal-88026-v70-uploader: Large files may slow down loading and processing.
mistralai-mistral-smal-88026-v70-uploader: ---------- 2026-04-13 16:46:49 (0:00:00) ----------
mistralai-mistral-smal-88026-v70-uploader: Files: hashed 4/6 (2.7K/24.9G) | pre-uploaded: 0/0 (0.0/24.9G) (+6 unsure) | committed: 0/6 (0.0/24.9G) | ignored: 0
mistralai-mistral-smal-88026-v70-uploader: Workers: hashing: 2 | get upload mode: 4 | pre-uploading: 0 | committing: 0 | waiting: 58
mistralai-mistral-smal-88026-v70-uploader: ---------------------------------------------------
2026-04-13T16:47:05.919471+00:00 monitor updated for mistralai-mistral-smal_88026_v70
mistralai-mistral-smal-88026-v70-uploader: ---------- 2026-04-13 16:47:49 (0:01:00) ----------
mistralai-mistral-smal-88026-v70-uploader: Files: hashed 6/6 (24.9G/24.9G) | pre-uploaded: 1/2 (17.1M/24.9G) | committed: 0/6 (0.0/24.9G) | ignored: 0
mistralai-mistral-smal-88026-v70-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 1 | committing: 0 | waiting: 63
mistralai-mistral-smal-88026-v70-uploader: ---------------------------------------------------
2026-04-13T16:48:06.023377+00:00 monitor updated for mistralai-mistral-smal_88026_v70
mistralai-mistral-smal-88026-v70-uploader: Processed model mistralai/Mistral-Small-24B-Base-2501 in 186.497s
Retrying (%r) after connection broken by '%r': %s
mistralai-mistral-smal-88026-v70-uploader: creating bucket guanaco-vllm-models
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-smal-88026-v70-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
mistralai-mistral-smal-88026-v70-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-smal-88026-v70-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-smal-88026-v70-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-smal-88026-v70-uploader: if re.search("-\.", bucket, re.UNICODE):
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-smal-88026-v70-uploader: if re.search("\.\.", bucket, re.UNICODE):
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
mistralai-mistral-smal-88026-v70-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
mistralai-mistral-smal-88026-v70-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
mistralai-mistral-smal-88026-v70-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
mistralai-mistral-smal-88026-v70-uploader: Bucket 's3://guanaco-vllm-models/' created
mistralai-mistral-smal-88026-v70-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/config.json
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/tokenizer_config.json
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/generation_config.json
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/recipe.yaml s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/recipe.yaml
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/tokenizer.json
2026-04-13T16:49:06.111019+00:00 monitor updated for mistralai-mistral-smal_88026_v70
mistralai-mistral-smal-88026-v70-uploader: cp /tmp/model_output/model.safetensors s3://guanaco-vllm-models/mistralai-mistral-smal-88026-v70/default/model.safetensors
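The per-file `cp` lines above can be reproduced by a small helper; a sketch under the assumption that the uploader simply walks the output directory and issues one copy per file (the helper name is hypothetical; bucket and prefix values are taken from this log):

```python
# Hypothetical sketch of how the per-file upload commands in this log
# could be generated; not the pipeline's real uploader code.
def s3_copy_commands(files, src_dir, bucket, prefix):
    # One "cp <local> <s3 uri>" command per file, preserving the filename.
    return [f"cp {src_dir}/{f} s3://{bucket}/{prefix}/{f}" for f in files]

cmds = s3_copy_commands(
    ["config.json", "model.safetensors"],
    "/tmp/model_output",
    "guanaco-vllm-models",
    "mistralai-mistral-smal-88026-v70/default",
)
```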
Job mistralai-mistral-smal-88026-v70-uploader completed after 246.36s with status: succeeded
Stopping job with name mistralai-mistral-smal-88026-v70-uploader
Pipeline stage VLLMUploader completed in 246.94s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.55s
Running pipeline stage VLLMDeployer
Creating inference service mistralai-mistral-smal-88026-v70
Waiting for inference service mistralai-mistral-smal-88026-v70 to be ready
2026-04-13T16:50:06.211069+00:00 monitor updated for mistralai-mistral-smal_88026_v70
2026-04-13T16:51:06.307083+00:00 monitor updated for mistralai-mistral-smal_88026_v70
Inference service mistralai-mistral-smal-88026-v70 ready after 156.6083128452301s
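The wait/ready pair above is a standard polling loop; a minimal sketch, assuming a boolean `is_ready` probe (the real deployer's client, timeout, and interval are unknown):

```python
import time

# Minimal readiness-polling sketch; `is_ready` is an assumed callback,
# not the pipeline's actual inference-service client.
def wait_for_ready(is_ready, timeout=600.0, interval=5.0):
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if is_ready():
            return time.monotonic() - start  # seconds waited, as logged above
        time.sleep(interval)
    raise TimeoutError("inference service not ready within timeout")
```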
Pipeline stage VLLMDeployer completed in 157.21s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.839813470840454s
Received healthy response to inference request in 4.606433391571045s
2026-04-13T16:52:06.420778+00:00 monitor updated for mistralai-mistral-smal_88026_v70
Received healthy response to inference request in 4.457385778427124s
Received healthy response to inference request in 3.214766263961792s
Received healthy response to inference request in 4.900352478027344s
Received healthy response to inference request in 2.67380428314209s
Received healthy response to inference request in 4.3335020542144775s
Received healthy response to inference request in 2.6699206829071045s
Received healthy response to inference request in 2.1154401302337646s
Received healthy response to inference request in 1.6563832759857178s
Received healthy response to inference request in 2.6326372623443604s
Received healthy response to inference request in 2.640610456466675s
Received healthy response to inference request in 2.884495258331299s
Received healthy response to inference request in 3.5676698684692383s
Received healthy response to inference request in 1.2159919738769531s
Received healthy response to inference request in 2.6691441535949707s
Received healthy response to inference request in 3.1254560947418213s
Received healthy response to inference request in 1.7356443405151367s
Received healthy response to inference request in 2.2541537284851074s
Received healthy response to inference request in 1.6378977298736572s
Received healthy response to inference request in 1.9321386814117432s
Received healthy response to inference request in 1.1585564613342285s
Received healthy response to inference request in 2.9518051147460938s
Received healthy response to inference request in 2.6892213821411133s
Received healthy response to inference request in 2.715233325958252s
2026-04-13T16:53:06.533178+00:00 monitor updated for mistralai-mistral-smal_88026_v70
Received healthy response to inference request in 2.9692628383636475s
Received healthy response to inference request in 1.8394296169281006s
Received healthy response to inference request in 1.0446102619171143s
Received healthy response to inference request in 2.5624172687530518s
Received healthy response to inference request in 2.701914072036743s
30 requests
0 failed requests
5th percentile: 1.1844024419784547
10th percentile: 1.595707154273987
20th percentile: 1.818672561645508
30th percentile: 2.2125396490097047
40th percentile: 2.637421178817749
50th percentile: 2.671862483024597
60th percentile: 2.7072417736053467
70th percentile: 2.95704243183136
80th percentile: 3.285346984863282
90th percentile: 4.4722905397415165
95th percentile: 4.734792435169219
99th percentile: 4.882796165943146
mean time: 2.746536389986674
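The percentile values above are consistent with linear interpolation between closest ranks (numpy's default percentile method). A self-contained sketch of the summary computation, assuming that method (the StressChecker's actual implementation is not shown in this log):

```python
# Sketch of the latency summary printed above; linear interpolation
# between closest ranks is an assumption about the real checker.
def summarize(latencies, percentiles=(5, 50, 95, 99)):
    xs = sorted(latencies)
    n = len(xs)
    out = {}
    for p in percentiles:
        k = (n - 1) * p / 100.0          # fractional rank
        lo = int(k)
        hi = min(lo + 1, n - 1)
        out[p] = xs[lo] + (xs[hi] - xs[lo]) * (k - lo)
    out["mean"] = sum(xs) / n
    return out
```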
Pipeline stage StressChecker completed in 85.61s
Shutdown handler de-registered
mistralai-mistral-smal_88026_v70 status is now deployed due to DeploymentManager action
mistralai-mistral-smal_88026_v70 status is now inactive due to system request
mistralai-mistral-smal_88026_v70 status is now torndown due to DeploymentManager action