Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-hayden-drummer-b-17241-v1-uploader
Waiting for job on chaiml-hayden-drummer-b-17241-v1-uploader to finish
chaiml-hayden-drummer-b-17241-v1-uploader: Using quantization_mode: fp8
chaiml-hayden-drummer-b-17241-v1-uploader: Checking if ChaiML/Hayden-Drummer-BF_Enzo-schoolband-drum260314022435_sft-FP8 already exists in ChaiML
chaiml-hayden-drummer-b-17241-v1-uploader: Downloading snapshot of ChaiML/Hayden-Drummer-BF_Enzo-schoolband-drum260314022435_sft...
chaiml-hayden-drummer-b-17241-v1-uploader: Downloaded in 136.711s
chaiml-hayden-drummer-b-17241-v1-uploader: Loading /tmp/model_input...
chaiml-hayden-drummer-b-17241-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-hayden-drummer-b-17241-v1-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-hayden-drummer-b-17241-v1-uploader: Some parameters are on the meta device because they were offloaded to the cpu.
chaiml-hayden-drummer-b-17241-v1-uploader: Applying quantization...
chaiml-hayden-drummer-b-17241-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-hayden-drummer-b-17241-v1-uploader: 2026-03-13T19:42:39.640019-0700 | reset | INFO - Compression lifecycle reset
chaiml-hayden-drummer-b-17241-v1-uploader: 2026-03-13T19:42:39.640957-0700 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-hayden-drummer-b-17241-v1-uploader: 2026-03-13T19:42:39.707726-0700 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-hayden-drummer-b-17241-v1-uploader: 2026-03-13T19:42:39.707993-0700 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-hayden-drummer-b-17241-v1-uploader: 2026-03-13T19:43:09.424487-0700 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-hayden-drummer-b-17241-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-hayden-drummer-b-17241-v1-uploader: Pushing to ChaiML/Hayden-Drummer-BF_Enzo-schoolband-drum260314022435_sft-FP8
chaiml-hayden-drummer-b-17241-v1-uploader: Checking if ChaiML/Hayden-Drummer-BF_Enzo-schoolband-drum260314022435_sft-FP8 already exists in ChaiML
chaiml-hayden-drummer-b-17241-v1-uploader: Creating repo ChaiML/Hayden-Drummer-BF_Enzo-schoolband-drum260314022435_sft-FP8 and uploading /dev/shm/model_output to it
chaiml-hayden-drummer-b-17241-v1-uploader: ---------- 2026-03-13 19:44:00 (0:00:00) ----------
chaiml-hayden-drummer-b-17241-v1-uploader: Files: hashed 4/13 (274.2K/24.9G) | pre-uploaded: 0/0 (0.0/24.9G) (+13 unsure) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-hayden-drummer-b-17241-v1-uploader: Workers: hashing: 12 | get upload mode: 1 | pre-uploading: 0 | committing: 0 | waiting: 113
chaiml-hayden-drummer-b-17241-v1-uploader: ---------------------------------------------------
chaiml-hayden-drummer-b-17241-v1-uploader: ---------- 2026-03-13 19:45:00 (0:01:00) ----------
chaiml-hayden-drummer-b-17241-v1-uploader: Files: hashed 13/13 (24.9G/24.9G) | pre-uploaded: 7/7 (24.9G/24.9G) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-hayden-drummer-b-17241-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-hayden-drummer-b-17241-v1-uploader: ---------------------------------------------------
chaiml-hayden-drummer-b-17241-v1-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-hayden-drummer-b-17241-v1/default/model-00006-of-00006.safetensors
chaiml-hayden-drummer-b-17241-v1-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-hayden-drummer-b-17241-v1/default/model-00001-of-00006.safetensors
chaiml-hayden-drummer-b-17241-v1-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-hayden-drummer-b-17241-v1/default/model-00005-of-00006.safetensors
Job chaiml-hayden-drummer-b-17241-v1-uploader completed after 376.45s with status: succeeded
Stopping job with name chaiml-hayden-drummer-b-17241-v1-uploader
Pipeline stage VLLMUploader completed in 383.64s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.04s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-hayden-drummer-b-17241-v1
Waiting for inference service chaiml-hayden-drummer-b-17241-v1 to be ready
Inference service chaiml-hayden-drummer-b-17241-v1 ready after 152.2389621734619s
Pipeline stage VLLMDeployer completed in 158.53s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.899433135986328s
Received healthy response to inference request in 3.983703374862671s
Received healthy response to inference request in 3.2878289222717285s
Received healthy response to inference request in 2.8531179428100586s
Received healthy response to inference request in 2.8405191898345947s
Received healthy response to inference request in 3.1011202335357666s
Received healthy response to inference request in 2.761976718902588s
Received healthy response to inference request in 2.891879081726074s
Received healthy response to inference request in 3.023132562637329s
Received healthy response to inference request in 3.1105332374572754s
Received healthy response to inference request in 2.6769473552703857s
Received healthy response to inference request in 3.0207931995391846s
Received healthy response to inference request in 3.5969557762145996s
Received healthy response to inference request in 2.895739793777466s
Received healthy response to inference request in 3.039435625076294s
Received healthy response to inference request in 2.828809976577759s
Received healthy response to inference request in 2.9298276901245117s
Received healthy response to inference request in 2.7429914474487305s
Received healthy response to inference request in 3.4552853107452393s
Received healthy response to inference request in 3.2366795539855957s
Received healthy response to inference request in 3.1357972621917725s
Received healthy response to inference request in 2.8493289947509766s
Received healthy response to inference request in 2.6849629878997803s
Received healthy response to inference request in 2.7001054286956787s
Received healthy response to inference request in 3.2058916091918945s
Received healthy response to inference request in 3.3581655025482178s
Received healthy response to inference request in 2.8156912326812744s
Received healthy response to inference request in 3.3341448307037354s
Received healthy response to inference request in 2.69541335105896s
30 requests
1 failed requests
5th percentile: 2.689665651321411
10th percentile: 2.699636220932007
20th percentile: 2.804948329925537
30th percentile: 2.846686053276062
40th percentile: 2.8941955089569094
50th percentile: 3.021962881088257
60th percentile: 3.10488543510437
70th percentile: 3.2151279926300047
80th percentile: 3.338948965072632
90th percentile: 3.6272035121917727
95th percentile: 3.945781767368316
99th percentile: 18.44053496122362
mean time: 3.77672164440155
%s, retrying in %s seconds...
Received healthy response to inference request in 2.785757541656494s
Received healthy response to inference request in 3.064948558807373s
Received healthy response to inference request in 3.9121079444885254s
Received healthy response to inference request in 3.1596508026123047s
Received healthy response to inference request in 2.848933458328247s
Received healthy response to inference request in 2.852691888809204s
Received healthy response to inference request in 2.681175708770752s
Received healthy response to inference request in 3.409163236618042s
Received healthy response to inference request in 3.30972957611084s
Received healthy response to inference request in 3.302250385284424s
Received healthy response to inference request in 2.6863632202148438s
Received healthy response to inference request in 2.691523313522339s
Received healthy response to inference request in 3.420646905899048s
Received healthy response to inference request in 2.723426580429077s
Received healthy response to inference request in 2.9559972286224365s
Received healthy response to inference request in 3.9755842685699463s
Received healthy response to inference request in 3.3111658096313477s
Received healthy response to inference request in 3.1981003284454346s
Received healthy response to inference request in 3.4472713470458984s
Received healthy response to inference request in 3.362182855606079s
Received healthy response to inference request in 2.827792167663574s
Received healthy response to inference request in 3.357754707336426s
Received healthy response to inference request in 3.2173032760620117s
Received healthy response to inference request in 2.7904434204101562s
Received healthy response to inference request in 3.5639121532440186s
Received healthy response to inference request in 3.783008575439453s
Received healthy response to inference request in 3.069612741470337s
Received healthy response to inference request in 3.59128999710083s
Received healthy response to inference request in 2.8235199451446533s
Received healthy response to inference request in 2.8722469806671143s
30 requests
0 failed requests
5th percentile: 2.6886852622032165
10th percentile: 2.7202362537384035
20th percentile: 2.816904640197754
30th percentile: 2.851564359664917
40th percentile: 3.0213680267333984
50th percentile: 3.1788755655288696
60th percentile: 3.3052420616149902
70th percentile: 3.359083151817322
80th percentile: 3.4259717941284182
90th percentile: 3.6104618549346927
95th percentile: 3.8540132284164423
99th percentile: 3.957176134586334
mean time: 3.166518497467041
Pipeline stage StressChecker completed in 367.14s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 6.32s
Shutdown handler de-registered
chaiml-hayden-drummer-b_17241_v1 status is now deployed due to DeploymentManager action
chaiml-hayden-drummer-b_17241_v1 status is now inactive due to auto deactivation removed underperforming models