Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-q235b-opus-v1-wi-41992-v1-uploader
Waiting for job on chaiml-q235b-opus-v1-wi-41992-v1-uploader to finish
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Using quantization_mode: w4a16
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Checking if ChaiML/q235b_opus_V1_with_rep_fix_again-step444-merged-W4A16 already exists in ChaiML
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Downloading snapshot of ChaiML/q235b_opus_V1_with_rep_fix_again-step444-merged...
2026-04-03T07:01:32.445249+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
2026-04-03T07:02:32.955835+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
2026-04-03T07:03:33.065818+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Downloaded in 163.055s
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Applying quantization...
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:03:57 INFO __init__.py L202: Patched transformers.models.qwen3_moe.modeling_qwen3_moe.Qwen3MoeSparseMoeBlock -> auto_round.modeling.unfused_moe.qwen3_moe.LinearQwen3MoeSparseMoeBlock
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:16 INFO base.py L448: `enable_opt_rtn` is turned on, set `--disable_opt_rtn` for higher speed at the cost of accuracy.
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:16 INFO base.py L486: using torch.bfloat16 for quantization tuning
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:16 INFO base.py L1573: Using predefined ignore_layers: ['mlp.gate']
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:19 INFO base.py L1081: start to compute imatrix
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
2026-04-03T07:04:33.155652+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:51 WARNING base.py L1201: MoE layer detected: optimized RTN is disabled for efficiency. Use `--enable_opt_rtn` to force-enable it for MoE layers.
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:04:53 INFO device.py L1468: 'peak_ram': 19.12GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:05:05 INFO device.py L1468: 'peak_ram': 20.44GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:05:16 INFO device.py L1468: 'peak_ram': 21.75GB, 'peak_vram': 11.38GB
2026-04-03T07:05:33.247410+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:05:33 INFO device.py L1468: 'peak_ram': 27.15GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:05:45 INFO device.py L1468: 'peak_ram': 27.15GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:05:58 INFO device.py L1468: 'peak_ram': 27.15GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:06:11 INFO device.py L1468: 'peak_ram': 27.15GB, 'peak_vram': 11.38GB
2026-04-03T07:06:33.349804+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:06:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:06:40 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:06:50 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:07:01 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:07:15 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:07:26 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:07:33.438278+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:07:37 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:07:48 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:01 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:08 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:14 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:21 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:08:33.614325+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:31 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:39 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:47 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:08:54 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:03 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:10 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:16 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:23 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:09:33.697503+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:32 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:39 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:45 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:09:52 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:01 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:08 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:14 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:21 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:30 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:10:33.784514+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:37 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:43 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:50 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:10:59 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:05 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:12 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:21 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:11:33.876687+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:34 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:40 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:50 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:11:56 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:03 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:09 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:19 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:12:34.032678+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:25 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:32 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:38 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:47 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:12:54 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:00 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:06 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:16 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:22 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:13:34.123820+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:35 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:44 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:51 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:13:57 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:03 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:13 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:19 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:25 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:14:34.216786+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:32 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:41 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:47 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:14:54 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:00 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:09 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:16 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:22 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:15:34.316771+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:38 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:44 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:50 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:15:59 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:06 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:12 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:18 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
2026-04-03T07:16:34.419272+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:34 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:41 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:47 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:16:56 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:03 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:09 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:15 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:25 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:28 INFO shard_writer.py L208: model has been saved to /dev/shm/model_output/
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:28 WARNING export.py L336: /dev/shm/model_output already exists, this may cause model conflict
chaiml-q235b-opus-v1-wi-41992-v1-uploader: 2026-04-03 07:17:28 INFO device.py L1468: 'peak_ram': 28.02GB, 'peak_vram': 11.38GB
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Checking if ChaiML/q235b_opus_V1_with_rep_fix_again-step444-merged-W4A16 already exists in ChaiML
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Creating repo ChaiML/q235b_opus_V1_with_rep_fix_again-step444-merged-W4A16 and uploading /dev/shm/model_output to it
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------- 2026-04-03 07:17:29 (0:00:00) ----------
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Files: hashed 7/32 (21.5M/131.9G) | pre-uploaded: 0/1 (0.0/131.9G) (+29 unsure) | committed: 0/32 (0.0/131.9G) | ignored: 0
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Workers: hashing: 25 | get upload mode: 2 | pre-uploading: 1 | committing: 0 | waiting: 36
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------------------------------------------------
2026-04-03T07:17:34.529954+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------- 2026-04-03 07:18:29 (0:01:00) ----------
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Files: hashed 32/32 (131.9G/131.9G) | pre-uploaded: 9/26 (40.6G/131.9G) | committed: 0/32 (0.0/131.9G) | ignored: 0
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 17 | committing: 0 | waiting: 47
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------------------------------------------------
2026-04-03T07:18:34.629956+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------- 2026-04-03 07:19:29 (0:02:00) ----------
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Files: hashed 32/32 (131.9G/131.9G) | pre-uploaded: 26/26 (131.9G/131.9G) | committed: 0/32 (0.0/131.9G) | ignored: 0
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 63
chaiml-q235b-opus-v1-wi-41992-v1-uploader: ---------------------------------------------------
2026-04-03T07:19:34.740948+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Processed model ChaiML/q235b_opus_V1_with_rep_fix_again-step444-merged in 1114.931s
chaiml-q235b-opus-v1-wi-41992-v1-uploader: creating bucket guanaco-vllm-models
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-q235b-opus-v1-wi-41992-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-q235b-opus-v1-wi-41992-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-q235b-opus-v1-wi-41992-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-q235b-opus-v1-wi-41992-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/tokenizer_config.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/config.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/chat_template.jinja
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/quantization_config.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/generation_config.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model.safetensors.index.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/tokenizer.json
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00025-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00025-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00004-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00004-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00001-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00001-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00002-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00002-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00022-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00022-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00017-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00017-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00013-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00013-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00020-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00020-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00011-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00011-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00018-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00018-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00024-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00024-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00009-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00009-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00005-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00005-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00021-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00021-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00012-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00012-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00007-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00007-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00003-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00003-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00016-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00016-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00023-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00023-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00008-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00008-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00010-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00010-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00014-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00014-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00019-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00019-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00006-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00006-of-00025.safetensors
chaiml-q235b-opus-v1-wi-41992-v1-uploader: cp /dev/shm/model_output/model-00015-of-00025.safetensors s3://guanaco-vllm-models/chaiml-q235b-opus-v1-wi-41992-v1/default/model-00015-of-00025.safetensors
Job chaiml-q235b-opus-v1-wi-41992-v1-uploader completed after 1195.28s with status: succeeded
Stopping job with name chaiml-q235b-opus-v1-wi-41992-v1-uploader
Pipeline stage VLLMUploader completed in 1199.66s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.13s
run pipeline stage %s
2026-04-03T07:20:34.846111+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.77s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-q235b-opus-v1-wi-41992-v1
Waiting for inference service chaiml-q235b-opus-v1-wi-41992-v1 to be ready
Retrying (%r) after connection broken by '%r': %s
Failed to get request counts for guanaco-submitter. Falling back to default
2026-04-03T07:21:35.136450+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
2026-04-03T07:22:35.293487+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
2026-04-03T07:23:35.444416+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
2026-04-03T07:24:35.563480+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
Failed to get response for submission chaiml-qwen-bobo-dpo-ju_56781_v7: ('http://chaiml-qwen-bobo-dpo-ju-56781-v7-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'request timeout')
Inference service chaiml-q235b-opus-v1-wi-41992-v1 ready after 280.97233629226685s
Pipeline stage VLLMDeployer completed in 281.54s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.874018669128418s
Received healthy response to inference request in 1.4709217548370361s
Received healthy response to inference request in 1.9864368438720703s
Received healthy response to inference request in 1.6063299179077148s
Received healthy response to inference request in 1.739429235458374s
Received healthy response to inference request in 1.8835923671722412s
Received healthy response to inference request in 1.462559461593628s
Received healthy response to inference request in 1.818150520324707s
Retrying (%r) after connection broken by '%r': %s
2026-04-03T07:25:35.688835+00:00 monitor updated for chaiml-q235b-opus-v1-wi_41992_v1
Received healthy response to inference request in 1.5148470401763916s
Received healthy response to inference request in 1.5325605869293213s
Received healthy response to inference request in 1.4386951923370361s
Received healthy response to inference request in 2.1147143840789795s
Received healthy response to inference request in 1.5536446571350098s
Received healthy response to inference request in 1.6120891571044922s
Received healthy response to inference request in 1.592937707901001s
Received healthy response to inference request in 1.5382516384124756s
Received healthy response to inference request in 1.5569472312927246s
Received healthy response to inference request in 1.4514663219451904s
Received healthy response to inference request in 1.464444637298584s
Received healthy response to inference request in 1.536564588546753s
Received healthy response to inference request in 1.6565005779266357s
Received healthy response to inference request in 1.4913837909698486s
Received healthy response to inference request in 1.5776174068450928s
Received healthy response to inference request in 1.4807476997375488s
Received healthy response to inference request in 1.591184139251709s
Received healthy response to inference request in 1.4901280403137207s
Received healthy response to inference request in 1.7237379550933838s
Received healthy response to inference request in 1.5098133087158203s
Received healthy response to inference request in 1.7875020503997803s
Received healthy response to inference request in 1.7172839641571045s
30 requests
0 failed requests
5th percentile: 1.4564582347869872
10th percentile: 1.4642561197280883
20th percentile: 1.4882519721984864
30th percentile: 1.5133369207382201
40th percentile: 1.5375768184661864
50th percentile: 1.5672823190689087
60th percentile: 1.5982945919036866
70th percentile: 1.6747355937957762
80th percentile: 1.7490437984466554
90th percentile: 1.8938768148422243
95th percentile: 2.05698949098587
99th percentile: 3.3638204264640823
mean time: 1.6924833615620931
Pipeline stage StressChecker completed in 55.23s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.43s
Shutdown handler de-registered
chaiml-q235b-opus-v1-wi_41992_v1 status is now deployed due to DeploymentManager action
chaiml-q235b-opus-v1-wi_41992_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-q235b-opus-v1-wi_41992_v1 status is now torndown due to DeploymentManager action