developer_uid: zonemercy
submission_id: chaiml-pony-v3a-g47-lr1_18022_v1
model_name: chaiml-pony-v3a-g47-lr1_18022_v1
model_group: ChaiML/pony-v3a-g47-lr1e
status: torndown
timestamp: 2026-04-09T17:21:34+00:00
num_battles: 11265
num_wins: 5748
celo_rating: 1313.08
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-v3a-g47-lr1e5ep1r64b32
model_architecture: Glm4MoeForCausalLM
model_num_parameters: 24110003200.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-v3a-g47-lr1_18022_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-v3a-g47-lr1e5ep1r64b32
model_size: 24B
ranking_group: single
us_pacific_date: 2026-04-06
win_ratio: 0.5102529960053263
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['</s>', '<|user|>', '<|assistant|>', '<|im_end|>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "[gMASK]<sop><|system|>\n{bot_name}'s persona: {memory}", 'prompt_template': '', 'bot_template': '<|assistant|>{bot_name}: {message}', 'user_template': '<|user|>{message}', 'response_template': '<|assistant|></think>{bot_name}:', 'truncate_by_message': True}
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3a-g47-lr1-18022-v1-uploader
Waiting for job on chaiml-pony-v3a-g47-lr1-18022-v1-uploader to finish
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Using quantization_mode: w4a16
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Checking if ChaiML/pony-v3a-g47-lr1e5ep1r64b32-W4A16 already exists in ChaiML
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Downloading snapshot of ChaiML/pony-v3a-g47-lr1e5ep1r64b32...
2026-04-06T03:57:07.800300+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T03:58:07.889624+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T03:59:08.377363+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:00:08.456867+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Downloaded in 218.653s
2026-04-06T04:01:08.543665+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:02:08.631688+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:03:08.719296+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:04:08.799138+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:05:08.888352+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Applying quantization...
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:05:42 INFO __init__.py L202: Patched transformers.models.glm4_moe.modeling_glm4_moe.Glm4MoeMoE -> auto_round.modeling.unfused_moe.glm_moe.LinearGlm4MoeMoE
2026-04-06T04:06:08.973929+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:06:19 INFO base.py L486: using torch.bfloat16 for quantization tuning
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:06:38 INFO device.py L1468: 'peak_ram': 16.57GB, 'peak_vram': 1.44GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:06:49 INFO device.py L1468: 'peak_ram': 21.02GB, 'peak_vram': 1.59GB
2026-04-06T04:07:09.071370+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:07:09 INFO device.py L1468: 'peak_ram': 21.65GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:07:23 INFO device.py L1468: 'peak_ram': 24.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:07:40 INFO device.py L1468: 'peak_ram': 24.46GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:07:52 INFO device.py L1468: 'peak_ram': 26.88GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:03 INFO device.py L1468: 'peak_ram': 26.88GB, 'peak_vram': 1.59GB
2026-04-06T04:08:09.151814+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:17 INFO device.py L1468: 'peak_ram': 27.42GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:28 INFO device.py L1468: 'peak_ram': 27.42GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:43 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:52 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:08:59 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
2026-04-06T04:09:09.236122+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:08 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:14 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:19 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:27 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:33 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:41 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:47 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:09:52 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:01 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:07 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
2026-04-06T04:10:09.329548+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:16 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:21 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:28 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:37 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:45 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:10:53 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:11:08 INFO device.py L1468: 'peak_ram': 28.07GB, 'peak_vram': 1.59GB
2026-04-06T04:11:09.423570+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:11:19 INFO device.py L1468: 'peak_ram': 28.85GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:11:34 INFO device.py L1468: 'peak_ram': 32.16GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:11:45 INFO device.py L1468: 'peak_ram': 32.16GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:11:55 INFO device.py L1468: 'peak_ram': 32.52GB, 'peak_vram': 1.59GB
2026-04-06T04:12:09.530293+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:12:09 INFO device.py L1468: 'peak_ram': 32.52GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:12:20 INFO device.py L1468: 'peak_ram': 33.34GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:12:32 INFO device.py L1468: 'peak_ram': 33.34GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:12:45 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:12:56 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
2026-04-06T04:13:09.621413+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:13:11 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:13:23 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:13:34 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:13:48 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:14:00 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
2026-04-06T04:14:09.708901+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:14:15 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:14:32 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:14:47 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
Failed to get request counts for guanaco-submitter. Falling back to default
2026-04-06T04:15:09.793078+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:15:06 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:15:21 INFO device.py L1468: 'peak_ram': 34.22GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:15:35 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:15:53 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
2026-04-06T04:16:09.879444+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:16:08 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:16:29 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:16:44 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:16:57 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
2026-04-06T04:17:09.965021+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:17:12 INFO device.py L1468: 'peak_ram': 34.54GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:17:23 INFO device.py L1468: 'peak_ram': 34.81GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:17:39 INFO device.py L1468: 'peak_ram': 34.81GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:17:51 INFO device.py L1468: 'peak_ram': 34.81GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:18:02 INFO device.py L1468: 'peak_ram': 34.81GB, 'peak_vram': 1.59GB
2026-04-06T04:18:10.056599+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:18:17 INFO device.py L1468: 'peak_ram': 41.21GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:18:29 INFO device.py L1468: 'peak_ram': 41.21GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:18:40 INFO device.py L1468: 'peak_ram': 41.21GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:18:56 INFO device.py L1468: 'peak_ram': 41.21GB, 'peak_vram': 1.59GB
2026-04-06T04:19:10.145153+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:19:08 INFO device.py L1468: 'peak_ram': 42.57GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:19:23 INFO device.py L1468: 'peak_ram': 42.57GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:19:34 INFO device.py L1468: 'peak_ram': 42.57GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:19:44 INFO device.py L1468: 'peak_ram': 42.57GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:19:59 INFO device.py L1468: 'peak_ram': 42.57GB, 'peak_vram': 1.59GB
2026-04-06T04:20:10.240505+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:20:11 INFO device.py L1468: 'peak_ram': 42.66GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:20:29 INFO device.py L1468: 'peak_ram': 42.66GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:20:41 INFO device.py L1468: 'peak_ram': 42.66GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:20:52 INFO device.py L1468: 'peak_ram': 42.66GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:21:08 INFO device.py L1468: 'peak_ram': 42.66GB, 'peak_vram': 1.59GB
2026-04-06T04:21:10.338142+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:21:19 INFO device.py L1468: 'peak_ram': 42.92GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:21:35 INFO device.py L1468: 'peak_ram': 42.92GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:21:54 INFO device.py L1468: 'peak_ram': 44.11GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:04 INFO device.py L1468: 'peak_ram': 44.11GB, 'peak_vram': 1.59GB
2026-04-06T04:22:10.432054+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:12 INFO device.py L1468: 'peak_ram': 44.11GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:17 INFO device.py L1468: 'peak_ram': 44.11GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:21 INFO device.py L1468: 'peak_ram': 44.11GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:30 INFO device.py L1468: 'peak_ram': 44.2GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:36 INFO device.py L1468: 'peak_ram': 44.2GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:43 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:22:53 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:23:02 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
2026-04-06T04:23:10.537236+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:23:18 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:23:29 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:23:41 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:23:58 INFO device.py L1468: 'peak_ram': 45.44GB, 'peak_vram': 1.59GB
2026-04-06T04:24:10.630868+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:10 INFO device.py L1468: 'peak_ram': 50.06GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:18 INFO device.py L1468: 'peak_ram': 50.06GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:23 INFO device.py L1468: 'peak_ram': 51.5GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:32 INFO shard_writer.py L208: model has been saved to /tmp/model_output/
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:33 WARNING export.py L336: /tmp/model_output already exists, this may cause model conflict
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: 2026-04-06 04:24:33 INFO device.py L1468: 'peak_ram': 51.5GB, 'peak_vram': 1.59GB
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Checking if ChaiML/pony-v3a-g47-lr1e5ep1r64b32-W4A16 already exists in ChaiML
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Creating repo ChaiML/pony-v3a-g47-lr1e5ep1r64b32-W4A16 and uploading /tmp/model_output to it
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------- 2026-04-06 04:24:34 (0:00:00) ----------
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Files: hashed 6/45 (12.0M/197.4G) | pre-uploaded: 0/0 (0.0/197.4G) (+45 unsure) | committed: 0/45 (0.0/197.4G) | ignored: 0
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Workers: hashing: 39 | get upload mode: 4 | pre-uploading: 0 | committing: 0 | waiting: 21
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------------------------------------------------
2026-04-06T04:25:10.723993+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------- 2026-04-06 04:25:34 (0:01:00) ----------
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Files: hashed 45/45 (197.4G/197.4G) | pre-uploaded: 15/40 (63.3G/197.4G) | committed: 0/45 (0.0/197.4G) | ignored: 0
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 25 | committing: 0 | waiting: 39
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------------------------------------------------
2026-04-06T04:26:10.823747+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------- 2026-04-06 04:26:34 (0:02:00) ----------
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Files: hashed 45/45 (197.4G/197.4G) | pre-uploaded: 40/40 (197.4G/197.4G) | committed: 0/45 (0.0/197.4G) | ignored: 0
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 63
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: ---------------------------------------------------
2026-04-06T04:27:10.944342+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Processed model ChaiML/pony-v3a-g47-lr1e5ep1r64b32 in 1811.338s
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/quantization_config.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/generation_config.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/chat_template.jinja
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/tokenizer_config.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/config.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model.safetensors.index.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/tokenizer.json
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00038-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00038-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00037-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00037-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00036-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00036-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00018-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00018-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00003-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00003-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00035-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00035-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00023-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00023-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00024-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00024-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00008-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00008-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00030-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00030-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00004-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00004-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00011-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00011-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00005-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00005-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00012-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00012-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00033-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00033-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00034-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00034-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00006-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00006-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00010-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00010-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00007-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00007-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00016-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00016-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00021-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00021-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00015-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00015-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00032-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00032-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00029-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00029-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00001-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00001-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00022-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00022-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00014-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00014-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00002-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00002-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00028-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00028-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00020-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00020-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00019-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00019-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00013-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00013-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00026-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00026-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00027-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00027-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00031-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00031-of-00038.safetensors
chaiml-pony-v3a-g47-lr1-18022-v1-uploader: cp /tmp/model_output/model-00025-of-00038.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00025-of-00038.safetensors
Job chaiml-pony-v3a-g47-lr1-18022-v1-uploader completed after 1921.56s with status: succeeded
Stopping job with name chaiml-pony-v3a-g47-lr1-18022-v1-uploader
Pipeline stage VLLMUploader completed in 1922.03s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMTemplater
2026-04-06T04:28:11.042024+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
Pipeline stage VLLMTemplater completed in 3.83s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3a-g47-lr1-18022-v1
Waiting for inference service chaiml-pony-v3a-g47-lr1-18022-v1 to be ready
2026-04-06T04:29:11.139422+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:30:11.246087+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:31:11.348168+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:32:11.449854+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:33:11.554005+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:34:12.254895+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:35:12.362706+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:36:12.469433+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:37:12.592482+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:38:12.703732+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:39:12.848003+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:40:12.958909+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:41:13.067499+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:42:13.181887+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:43:13.293553+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:44:13.399169+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:45:13.508383+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:46:13.624352+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:47:13.738908+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:48:13.849980+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:49:13.955307+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:50:14.119604+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:51:14.226251+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:52:14.331831+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:53:14.438810+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:54:14.539289+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:55:14.645509+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
Failed to get request counts for guanaco-submitter. Falling back to default
2026-04-06T04:56:14.794356+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:57:14.907021+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:58:15.014192+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T04:59:15.124100+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:00:15.271884+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:01:15.382035+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:02:15.527113+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:03:15.670086+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:04:15.786466+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:05:15.892484+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:06:16.043495+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:07:16.191000+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
2026-04-06T05:08:16.711757+00:00 monitor updated for chaiml-pony-v3a-g47-lr1_18022_v1
Tearing down inference service chaiml-pony-v3a-g47-lr1-18022-v1
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-pony-v3a-g47-lr1-18022-v1. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'5\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\', \'serving.knative.dev/progress-deadline\': \'40m\'}, \'creationTimestamp\': \'2026-04-06T04:28:14Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'istio.io/rev\': \'prod-canary\', \'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': 
{}, \'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}, \'f:serving.knative.dev/progress-deadline\': {}}, \'f:labels\': {\'.\': {}, \'f:istio.io/rev\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:tion\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}, \'f:podAffinity\': {\'.\': {}, \'f:tion\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2026-04-06T04:28:14Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2026-04-06T04:28:14Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, 
\'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2026-04-06T05:05:25Z\'}], \'name\': \'chaiml-pony-v3a-g47-lr1-18022-v1\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'1325598176\', \'uid\': \'a653e0cb-74d5-4e79-969c-0fb3a0f2cec9\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'tion\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}]}}, \'podAffinity\': {\'tion\': [{\'podAffinityTerm\': {\'labelSelector\': {\'matchLabels\': {\'serving.kserve.io/inferenceservice\': \'chaiml-pony-v3a-g47-lr1-18022-v1\'}}, \'topologyKey\': \'kubernetes.io/hostname\'}, \'weight\': 100}]}}, \'containerConcurrency\': 0, \'containers\': [{\'args\': [\'serve\', \'s3://guanaco-vllm-models/chaiml-pony-v3a-g47-lr1-18022-v1/default\', \'--port\', \'8080\', \'--tensor-parallel-size\', \'8\', \'--gpu-memory-utilization\', \'0.9\', \'--max-model-len\', \'8192\', \'--max-num-batched-tokens\', \'8192\', \'--max-num-seqs\', \'64\', \'--trust-remote-code\', \'--load-format\', \'runai_streamer\', \'--served-model-name\', \'ChaiML/pony-v3a-g47-lr1e5ep1r64b32\', \'--model-loader-extra-config\', \'{"distributed": true, "concurrency": 2}\'], \'env\': [{\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'8\'}, {\'name\': \'VLLM_ASSETS_CACHE\', \'value\': \'/code/vllm_assets_cache\'}, {\'name\': \'RUNAI_STREAMER_S3_USE_VIRTUAL_ADDRESSING\', \'value\': \'1\'}, {\'name\': \'RUNAI_STREAMER_CONCURRENCY\', \'value\': \'1\'}, {\'name\': 
\'AWS_EC2_METADATA_DISABLED\', \'value\': \'true\'}, {\'name\': \'AWS_ACCESS_KEY_ID\', \'value\': \'CWZAGMHZXKZRFGJK\'}, {\'name\': \'AWS_SECRET_ACCESS_KEY\', \'value\': \'cwoAeWzp46q4O0sTNXOEuZ1MvZzKEFlS9DtEhnTldKp\'}, {\'name\': \'AWS_ENDPOINT_URL\', \'value\': \'https://cwobject.com\'}, {\'name\': \'HF_TOKEN\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'token\', \'name\': \'hf-token\'}}}, {\'name\': \'RUNAI_STREAMER_CONCURRENCY\', \'value\': \'1\'}], \'image\': \'gcr.io/chai-959f8/vllm:v0.17.1.transformers-5.3.0-dsa_patch\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'failureThreshold\': 1, \'httpGet\': {\'path\': \'/v1/models\', \'port\': 8080}, \'initialDelaySeconds\': 60, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'16\', \'memory\': \'229Gi\', \'nvidia.com/gpu\': \'8\'}, \'requests\': {\'cpu\': \'16\', \'memory\': \'229Gi\', \'nvidia.com/gpu\': \'8\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 5, \'minReplicas\': 0, \'priorityClassName\': \'chaiverse\', \'timeout\': 20, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\', \'sizeLimit\': \'229Gi\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-pony-v3a-g47-lr1-18022-v1-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2026-04-06T04:28:15Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2026-04-06T05:05:25Z\', \'message\': \'Revision "chaiml-pony-v3a-g47-lr1-18022-v1-predictor-00001" failed with message: Container failed with: m.py", line 154, in __init__\\n(APIServer pid=1) self.engine_core = EngineCoreClient.make_async_mp_client(\\n(APIServer pid=1) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 127, in make_async_mp_client\\n(APIServer pid=1) return AsyncMPClient(*client_args)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 911, in __init__\\n(APIServer pid=1) super().__init__(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 569, in __init__\\n(APIServer pid=1) with launch_core_engines(\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__\\n(APIServer pid=1) next(self.gen)\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 951, in launch_core_engines\\n(APIServer pid=1) wait_for_engine_startup(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1010, in wait_for_engine_startup\\n(APIServer pid=1) raise RuntimeError(\\n(APIServer pid=1) RuntimeError: Engine core initialization failed. See root cause above. 
Failed core proc(s): {}\\n/usr/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 9 leaked shared_memory objects to clean up at shutdown\\n warnings.warn(\\\'resource_tracker: There appear to be %d \\\'\\n.\', \'reason\': \'RevisionFailed\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2026-04-06T04:28:16Z\', \'message\': \'Configuration "chaiml-pony-v3a-g47-lr1-18022-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2026-04-06T04:28:16Z\', \'message\': \'Configuration "chaiml-pony-v3a-g47-lr1-18022-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2026-04-06T04:28:16Z\', \'message\': \'Configuration "chaiml-pony-v3a-g47-lr1-18022-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2026-04-06T04:28:16Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'states\': {\'activeModelState\': \'\', \'targetModelState\': \'Pending\'}, \'transitionStatus\': \'InProgress\'}, \'observedGeneration\': 1}}')
Running pipeline stage VLLMDeleter
Checking if service chaiml-pony-v3a-g47-lr1-18022-v1 is running
Skipping teardown as no inference service was found
Pipeline stage VLLMDeleter completed in 0.46s
Running pipeline stage VLLMModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/chat_template.jinja from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/config.json from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/generation_config.json from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00001-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00002-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00003-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00004-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00005-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00006-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00007-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00008-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00009-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00010-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00011-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00012-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00013-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00014-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00015-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00016-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00017-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00018-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00019-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00020-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00021-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00022-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00023-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00024-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00025-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00026-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00027-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00028-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00029-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00030-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00031-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00032-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00033-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00034-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00035-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00036-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00037-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model-00038-of-00038.safetensors from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/model.safetensors.index.json from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/quantization_config.json from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/tokenizer.json from bucket guanaco-vllm-models
Deleting key chaiml-pony-v3a-g47-lr1-18022-v1/default/tokenizer_config.json from bucket guanaco-vllm-models
Pipeline stage VLLMModelDeleter completed in 20.32s
Shutdown handler de-registered
chaiml-pony-v3a-g47-lr1_18022_v1 status is now failed due to DeploymentManager action
chaiml-pony-v3a-g47-lr1_18022_v1 status is now torndown due to DeploymentManager action
chaiml-pony-v3a-g47-lr1_18022_v1 status is now deployed due to DeploymentManager action
chaiml-pony-v3a-g47-lr1_18022_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3a-g47-lr1_18022_v1 status is now torndown due to DeploymentManager action