Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name mistralai-mistral-nem-93303-v614-uploader
Waiting for job on mistralai-mistral-nem-93303-v614-uploader to finish
mistralai-mistral-nem-93303-v614-uploader: Using quantization_mode: fp8
mistralai-mistral-nem-93303-v614-uploader: Checking if ChaiML/Mistral-Nemo-Instruct-2407-FP8 already exists in ChaiML
mistralai-mistral-nem-93303-v614-uploader: Downloading snapshot of mistralai/Mistral-Nemo-Instruct-2407...
mistralai-mistral-nem-93303-v614-uploader: Downloaded in 20.931s
mistralai-mistral-nem-93303-v614-uploader: Loading /tmp/model_input...
mistralai-mistral-nem-93303-v614-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
mistralai-mistral-nem-93303-v614-uploader: Applying quantization...
mistralai-mistral-nem-93303-v614-uploader: 2026-03-28T17:50:09.634134+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
2026-03-28T17:50:16.516308+00:00 monitor updated for mistralai-mistral-nem_93303_v614
mistralai-mistral-nem-93303-v614-uploader: 2026-03-28T17:50:14.571645+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
mistralai-mistral-nem-93303-v614-uploader: 2026-03-28T17:50:14.571787+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
mistralai-mistral-nem-93303-v614-uploader: Saving to /dev/shm/model_output...
mistralai-mistral-nem-93303-v614-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
mistralai-mistral-nem-93303-v614-uploader: warnings.warn(
mistralai-mistral-nem-93303-v614-uploader: Updating config in /dev/shm/model_output
mistralai-mistral-nem-93303-v614-uploader: Pushing to ChaiML/Mistral-Nemo-Instruct-2407-FP8
mistralai-mistral-nem-93303-v614-uploader: Checking if ChaiML/Mistral-Nemo-Instruct-2407-FP8 already exists in ChaiML
mistralai-mistral-nem-93303-v614-uploader: Creating repo ChaiML/Mistral-Nemo-Instruct-2407-FP8 and uploading /dev/shm/model_output to it
mistralai-mistral-nem-93303-v614-uploader: Processed model mistralai/Mistral-Nemo-Instruct-2407 in 87.934s
2026-03-28T17:51:17.342584+00:00 monitor updated for mistralai-mistral-nem_93303_v614
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-nem-93303-v614-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
mistralai-mistral-nem-93303-v614-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-nem-93303-v614-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-nem-93303-v614-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-nem-93303-v614-uploader: if re.search("-\.", bucket, re.UNICODE):
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
mistralai-mistral-nem-93303-v614-uploader: if re.search("\.\.", bucket, re.UNICODE):
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
mistralai-mistral-nem-93303-v614-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
mistralai-mistral-nem-93303-v614-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
mistralai-mistral-nem-93303-v614-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
mistralai-mistral-nem-93303-v614-uploader: Bucket 's3://guanaco-vllm-models/' created
mistralai-mistral-nem-93303-v614-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/chat_template.jinja
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/recipe.yaml
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/generation_config.json
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/config.json
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/tokenizer_config.json
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/tokenizer.json
mistralai-mistral-nem-93303-v614-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/mistralai-mistral-nem-93303-v614/default/model.safetensors
Job mistralai-mistral-nem-93303-v614-uploader completed after 170.34s with status: succeeded
Stopping job with name mistralai-mistral-nem-93303-v614-uploader
Pipeline stage VLLMUploader completed in 171.50s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.18s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.73s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service mistralai-mistral-nem-93303-v614
Waiting for inference service mistralai-mistral-nem-93303-v614 to be ready
2026-03-28T17:52:17.551920+00:00 monitor updated for mistralai-mistral-nem_93303_v614
2026-03-28T17:53:17.760352+00:00 monitor updated for mistralai-mistral-nem_93303_v614
2026-03-28T17:54:18.266398+00:00 monitor updated for mistralai-mistral-nem_93303_v614
Inference service mistralai-mistral-nem-93303-v614 ready after 172.22270941734314s
Pipeline stage VLLMDeployer completed in 173.46s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.3399291038513184s
Received healthy response to inference request in 3.4186809062957764s
Received healthy response to inference request in 3.2463648319244385s
2026-03-28T17:55:18.503747+00:00 monitor updated for mistralai-mistral-nem_93303_v614
Received healthy response to inference request in 3.340683937072754s
Received healthy response to inference request in 1.4764902591705322s
Received healthy response to inference request in 1.3766005039215088s
Received healthy response to inference request in 1.5869996547698975s
Received healthy response to inference request in 1.6269986629486084s
Received healthy response to inference request in 1.6269843578338623s
Received healthy response to inference request in 3.212114095687866s
Received healthy response to inference request in 1.8060543537139893s
Received healthy response to inference request in 1.752185583114624s
Received healthy response to inference request in 1.432237148284912s
Received healthy response to inference request in 1.6858458518981934s
Received healthy response to inference request in 1.638927698135376s
Received healthy response to inference request in 1.7236762046813965s
Received healthy response to inference request in 1.6118507385253906s
Received healthy response to inference request in 1.648949384689331s
Received healthy response to inference request in 1.6252093315124512s
Received healthy response to inference request in 1.6814136505126953s
Received healthy response to inference request in 1.7311770915985107s
Received healthy response to inference request in 1.7137441635131836s
Received healthy response to inference request in 1.209320068359375s
Received healthy response to inference request in 1.7666311264038086s
Received healthy response to inference request in 1.9079535007476807s
Received healthy response to inference request in 1.7906692028045654s
Received healthy response to inference request in 1.626020908355713s
Received healthy response to inference request in 1.2275850772857666s
Received healthy response to inference request in 1.6845526695251465s
Received healthy response to inference request in 2.1209185123443604s
30 requests
0 failed requests
5th percentile: 1.2946420192718506
10th percentile: 1.4266734838485717
20th percentile: 1.606880521774292
30th percentile: 1.6266953229904175
40th percentile: 1.6449407100677491
50th percentile: 1.68519926071167
60th percentile: 1.7266765594482423
70th percentile: 1.7738425493240355
80th percentile: 1.9505465030670173
90th percentile: 3.2557212591171267
95th percentile: 3.340344262123108
99th percentile: 3.3960617852211
mean time: 1.921225619316101
Pipeline stage StressChecker completed in 78.38s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.36s
Shutdown handler de-registered
mistralai-mistral-nem_93303_v614 status is now deployed due to DeploymentManager action
mistralai-mistral-nem_93303_v614 status is now inactive due to system request