Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-transfer-student-95693-v1-uploader
Waiting for job on chaiml-transfer-student-95693-v1-uploader to finish
chaiml-transfer-student-95693-v1-uploader: Using quantization_mode: fp8
chaiml-transfer-student-95693-v1-uploader: Checking if ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft-FP8 already exists in ChaiML
chaiml-transfer-student-95693-v1-uploader: Downloading snapshot of ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft...
chaiml-transfer-student-95693-v1-uploader: Downloaded in 87.595s
chaiml-transfer-student-95693-v1-uploader: Loading /tmp/model_input...
chaiml-transfer-student-95693-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-transfer-student-95693-v1-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-transfer-student-95693-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-transfer-student-95693-v1-uploader: Applying quantization...
chaiml-transfer-student-95693-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:08:30.093496-0700 | reset | INFO - Compression lifecycle reset
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:08:30.094388-0700 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:08:30.160785-0700 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:08:30.161046-0700 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-transfer-student-95693-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:08:59.655273-0700 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:09:01.749136-0700 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-transfer-student-95693-v1-uploader: Saving to /dev/shm/model_output...
chaiml-transfer-student-95693-v1-uploader: 2026-03-21T03:09:01.776066-0700 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
chaiml-transfer-student-95693-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-transfer-student-95693-v1-uploader: Pushing to ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft-FP8
chaiml-transfer-student-95693-v1-uploader: Checking if ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft-FP8 already exists in ChaiML
chaiml-transfer-student-95693-v1-uploader: Creating repo ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft-FP8 and uploading /dev/shm/model_output to it
chaiml-transfer-student-95693-v1-uploader:
chaiml-transfer-student-95693-v1-uploader: ---------- 2026-03-21 03:10:49 (0:01:00) ----------
chaiml-transfer-student-95693-v1-uploader: Files: hashed 13/13 (24.9G/24.9G) | pre-uploaded: 7/7 (24.9G/24.9G) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-transfer-student-95693-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-transfer-student-95693-v1-uploader: ---------------------------------------------------
chaiml-transfer-student-95693-v1-uploader: Processed model ChaiML/Transfer-student-rpg_Hannah-3_Pelin-Ba260321095207_sft in 250.138s
chaiml-transfer-student-95693-v1-uploader: creating bucket guanaco-vllm-models
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-transfer-student-95693-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-transfer-student-95693-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-transfer-student-95693-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-transfer-student-95693-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-transfer-student-95693-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-transfer-student-95693-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-transfer-student-95693-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-transfer-student-95693-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-transfer-student-95693-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-transfer-student-95693-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-transfer-student-95693-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/recipe.yaml
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/config.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model.safetensors.index.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/generation_config.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/tokenizer_config.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/special_tokens_map.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/tokenizer.json
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00006-of-00006.safetensors
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00005-of-00006.safetensors
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00004-of-00006.safetensors
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00001-of-00006.safetensors
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00002-of-00006.safetensors
chaiml-transfer-student-95693-v1-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-transfer-student-95693-v1/default/model-00003-of-00006.safetensors
Job chaiml-transfer-student-95693-v1-uploader completed after 310.47s with status: succeeded
Stopping job with name chaiml-transfer-student-95693-v1-uploader
Pipeline stage VLLMUploader completed in 313.79s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-transfer-student-95693-v1
Waiting for inference service chaiml-transfer-student-95693-v1 to be ready
Inference service chaiml-transfer-student-95693-v1 ready after 161.2275252342224s
Pipeline stage VLLMDeployer completed in 161.66s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0511016845703125s
Received healthy response to inference request in 3.0773584842681885s
Received healthy response to inference request in 3.586632013320923s
Received healthy response to inference request in 3.0274083614349365s
Received healthy response to inference request in 2.9736602306365967s
Received healthy response to inference request in 3.0096492767333984s
Received healthy response to inference request in 2.6867451667785645s
Received healthy response to inference request in 2.6591570377349854s
Received healthy response to inference request in 2.7286434173583984s
Received healthy response to inference request in 3.137413501739502s
Received healthy response to inference request in 2.7802178859710693s
Received healthy response to inference request in 3.713225841522217s
Received healthy response to inference request in 2.7905266284942627s
Received healthy response to inference request in 2.904681444168091s
Received healthy response to inference request in 3.3814635276794434s
Received healthy response to inference request in 2.816016674041748s
Received healthy response to inference request in 2.8098912239074707s
Received healthy response to inference request in 3.398853302001953s
Received healthy response to inference request in 2.8206725120544434s
Received healthy response to inference request in 3.6298296451568604s
Received healthy response to inference request in 3.131575107574463s
Received healthy response to inference request in 2.889493465423584s
Received healthy response to inference request in 2.705512285232544s
Received healthy response to inference request in 2.73710298538208s
Received healthy response to inference request in 3.169067859649658s
Received healthy response to inference request in 3.0818116664886475s
Received healthy response to inference request in 2.8598148822784424s
Received healthy response to inference request in 2.7528300285339355s
Received healthy response to inference request in 2.8755717277526855s
30 requests
1 failed requests
5th percentile: 2.6951903700828552
10th percentile: 2.726330304145813
20th percentile: 2.7747403144836427
30th percentile: 2.814179039001465
40th percentile: 2.8692689895629884
50th percentile: 2.9391708374023438
60th percentile: 3.036885690689087
70th percentile: 3.0967406988143917
80th percentile: 3.2115469932556158
90th percentile: 3.5909517765045167
95th percentile: 3.675697553157806
99th percentile: 15.428899226188673
mean time: 3.5800034046173095
%s, retrying in %s seconds...
Received healthy response to inference request in 2.764397382736206s
Received healthy response to inference request in 2.876922607421875s
Received healthy response to inference request in 2.757033586502075s
Received healthy response to inference request in 3.0249814987182617s
Received healthy response to inference request in 2.798828601837158s
Received healthy response to inference request in 3.5919809341430664s
Received healthy response to inference request in 2.8802506923675537s
Received healthy response to inference request in 3.083493232727051s
Received healthy response to inference request in 3.006319761276245s
Received healthy response to inference request in 3.6103334426879883s
Received healthy response to inference request in 3.0323712825775146s
Received healthy response to inference request in 2.7903053760528564s
Received healthy response to inference request in 3.706120729446411s
Received healthy response to inference request in 2.901287078857422s
Received healthy response to inference request in 3.9984278678894043s
Received healthy response to inference request in 2.903550863265991s
Received healthy response to inference request in 2.8376851081848145s
Received healthy response to inference request in 2.715460777282715s
Received healthy response to inference request in 3.441645860671997s
Received healthy response to inference request in 3.275933027267456s
Received healthy response to inference request in 2.8579485416412354s
Received healthy response to inference request in 2.7168819904327393s
Received healthy response to inference request in 3.2151410579681396s
Received healthy response to inference request in 2.721165657043457s
Received healthy response to inference request in 2.917938232421875s
Received healthy response to inference request in 2.681053638458252s
Received healthy response to inference request in 3.273789405822754s
Received healthy response to inference request in 2.7297825813293457s
Received healthy response to inference request in 2.980543375015259s
Received healthy response to inference request in 2.7162742614746094s
30 requests
0 failed requests
5th percentile: 2.7158268451690675
10th percentile: 2.716821217536926
20th percentile: 2.751583385467529
30th percentile: 2.796271634101868
40th percentile: 2.8693329811096193
50th percentile: 2.9024189710617065
60th percentile: 2.9908539295196532
70th percentile: 3.047707867622375
80th percentile: 3.274218130111694
90th percentile: 3.5938161849975585
95th percentile: 3.6630164504051206
99th percentile: 3.9136587977409367
mean time: 3.0269282817840577
Pipeline stage StressChecker completed in 219.56s
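Note on the StressChecker summary above: the logged percentiles are consistent with linear interpolation between order statistics, which is the default behaviour of numpy.percentile. Whether the pipeline actually uses NumPy is an assumption; the log does not say. The sketch below recomputes a few figures of the second run (30 requests, 0 failed) from the latencies printed in the log, using only the standard library. The names `percentile` and `latencies` are illustrative and do not come from the pipeline.

```python
import math


def percentile(samples, q):
    """Percentile q (0..100) using linear interpolation between order statistics."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


# Latencies in seconds, copied from the 30 healthy responses of the second run.
latencies = [
    2.764397382736206, 2.876922607421875, 2.757033586502075,
    3.0249814987182617, 2.798828601837158, 3.5919809341430664,
    2.8802506923675537, 3.083493232727051, 3.006319761276245,
    3.6103334426879883, 3.0323712825775146, 2.7903053760528564,
    3.706120729446411, 2.901287078857422, 3.9984278678894043,
    2.903550863265991, 2.8376851081848145, 2.715460777282715,
    3.441645860671997, 3.275933027267456, 2.8579485416412354,
    2.7168819904327393, 3.2151410579681396, 2.721165657043457,
    2.917938232421875, 2.681053638458252, 3.273789405822754,
    2.7297825813293457, 2.980543375015259, 2.7162742614746094,
]

# Recompute a subset of the summary lines; each agrees with the logged
# value up to floating-point rounding.
summary = {q: percentile(latencies, q) for q in (5, 50, 90, 99)}
mean_time = sum(latencies) / len(latencies)
```

The same interpolation would explain the first run's 99th percentile of about 15.4 s: with one request hitting the 20 s read timeout, the value interpolated between the slowest healthy response and the timed-out one lands well above every healthy latency, assuming the failed request's duration was counted as a sample.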
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.80s
Shutdown handler de-registered
chaiml-transfer-student_95693_v1 status is now deployed due to DeploymentManager action
chaiml-transfer-student_95693_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-transfer-student_95693_v1 status is now torndown due to DeploymentManager action