Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-98p-2ff-chaiml-m-60819-v4-uploader
Waiting for job on chaiml-98p-2ff-chaiml-m-60819-v4-uploader to finish
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Using quantization_mode: fp8
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Checking if ChaiML/98p_2ff_chaiml_mistral_24b_2048_54327_v1_cp1250_merged-FP8 already exists in ChaiML
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Downloading snapshot of ChaiML/98p_2ff_chaiml_mistral_24b_2048_54327_v1_cp1250_merged...
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Downloaded in 57.390s
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Loading /tmp/model_input...
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Applying quantization...
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:27:37.775088-0800 | reset | INFO - Compression lifecycle reset
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:27:37.776034-0800 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:27:37.867717-0800 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:27:37.868009-0800 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
HTTP Request: %s %s "%s %d %s"
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:28:19.970182-0800 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:28:22.185919-0800 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Saving to /dev/shm/model_output...
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: 2026-02-19T17:28:22.213049-0800 | get_model_compressor | INFO - skip_sparsity_compression_stats set to True. Skipping sparsity compression statistic calculations. No sparsity compressor will be applied.
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Pushing to ChaiML/98p_2ff_chaiml_mistral_24b_2048_54327_v1_cp1250_merged-FP8
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Checking if ChaiML/98p_2ff_chaiml_mistral_24b_2048_54327_v1_cp1250_merged-FP8 already exists in ChaiML
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Creating repo ChaiML/98p_2ff_chaiml_mistral_24b_2048_54327_v1_cp1250_merged-FP8 and uploading /dev/shm/model_output to it
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: ---------- 2026-02-19 17:29:14 (0:00:00) ----------
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Files: hashed 6/13 (276.0K/27.6G) | pre-uploaded: 0/0 (0.0/27.6G) (+13 unsure) | committed: 0/13 (0.0/27.6G) | ignored: 0
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Workers: hashing: 7 | get upload mode: 4 | pre-uploading: 0 | committing: 0 | waiting: 115
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: ---------------------------------------------------
chaiml-98p-2ff-chaiml-m-60819-v4-uploader:
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: ---------- 2026-02-19 17:30:14 (0:01:00) ----------
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Files: hashed 13/13 (27.6G/27.6G) | pre-uploaded: 7/7 (27.6G/27.6G) | committed: 0/13 (0.0/27.6G) | ignored: 0
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: ---------------------------------------------------
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: creating bucket guanaco-vllm-models
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/generation_config.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/config.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/tokenizer_config.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/special_tokens_map.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model.safetensors.index.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/recipe.yaml
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/tokenizer.json
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00006-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00002-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00003-of-00006.safetensors
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00001-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00004-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-60819-v4-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-60819-v4/default/model-00005-of-00006.safetensors
Job chaiml-98p-2ff-chaiml-m-60819-v4-uploader completed after 362.63s with status: succeeded
Stopping job with name chaiml-98p-2ff-chaiml-m-60819-v4-uploader
Pipeline stage VLLMUploader completed in 363.21s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-98p-2ff-chaiml-m-60819-v4
Waiting for inference service chaiml-98p-2ff-chaiml-m-60819-v4 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-98p-2ff-chaiml-m-60819-v4 ready after 634.5472609996796s
Pipeline stage VLLMDeployer completed in 635.16s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.5062496662139893s
Received healthy response to inference request in 1.605628490447998s
Received healthy response to inference request in 1.6273374557495117s
Received healthy response to inference request in 1.4628846645355225s
Received healthy response to inference request in 1.8255867958068848s
Received healthy response to inference request in 1.535205602645874s
Received healthy response to inference request in 1.3703548908233643s
Received healthy response to inference request in 1.3709204196929932s
Received healthy response to inference request in 1.578054666519165s
Received healthy response to inference request in 1.5645203590393066s
Received healthy response to inference request in 1.4039640426635742s
Received healthy response to inference request in 1.411752462387085s
Received healthy response to inference request in 1.357792615890503s
Received healthy response to inference request in 1.4943318367004395s
Received healthy response to inference request in 1.3779850006103516s
Received healthy response to inference request in 1.4195661544799805s
Received healthy response to inference request in 1.4297270774841309s
Received healthy response to inference request in 1.5029792785644531s
Received healthy response to inference request in 1.5589840412139893s
Received healthy response to inference request in 1.4787342548370361s
Received healthy response to inference request in 1.4638001918792725s
Received healthy response to inference request in 1.4601032733917236s
Received healthy response to inference request in 1.5150718688964844s
Received healthy response to inference request in 1.4687402248382568s
Received healthy response to inference request in 1.4323155879974365s
Received healthy response to inference request in 1.5980846881866455s
Received healthy response to inference request in 1.3466098308563232s
Received healthy response to inference request in 1.458524465560913s
Received healthy response to inference request in 1.4594159126281738s
Received healthy response to inference request in 1.3599987030029297s
30 requests
0 failed requests
5th percentile: 1.3587853550910949
10th percentile: 1.3693192720413208
20th percentile: 1.3987682342529297
30th percentile: 1.4266788005828857
40th percentile: 1.4590593338012696
50th percentile: 1.4633424282073975
60th percentile: 1.4849732875823975
70th percentile: 1.5088963270187377
80th percentile: 1.5600913047790528
90th percentile: 1.5988390684127807
95th percentile: 1.6175684213638306
99th percentile: 1.7680944871902466
mean time: 1.4815074841181437
Pipeline stage StressChecker completed in 48.82s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.59s
Shutdown handler de-registered
chaiml-98p-2ff-chaiml-m_60819_v4 status is now deployed due to DeploymentManager action
chaiml-98p-2ff-chaiml-m_60819_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-98p-2ff-chaiml-m_60819_v4 status is now torndown due to DeploymentManager action