Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-98p-2ff-chaiml-m-56461-v3-uploader
Waiting for job on chaiml-98p-2ff-chaiml-m-56461-v3-uploader to finish
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Using quantization_mode: fp8
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Checking if ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1248_v2_merged-FP8 already exists in ChaiML
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Downloading snapshot of ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1248_v2_merged-FP8...
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Downloaded in 12.691s
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Processed model ChaiML/98p_2ff_chaiml_mistral_24b_2048_90555_v2_cp1248_v2_merged in 16.133s
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: creating bucket guanaco-vllm-models
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/generation_config.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/.gitattributes
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/tokenizer_config.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/recipe.yaml
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/special_tokens_map.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/tokenizer.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/config.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model.safetensors.index.json
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00006-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00001-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00005-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00002-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00004-of-00006.safetensors
chaiml-98p-2ff-chaiml-m-56461-v3-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-98p-2ff-chaiml-m-56461-v3/default/model-00003-of-00006.safetensors
Job chaiml-98p-2ff-chaiml-m-56461-v3-uploader completed after 73.4s with status: succeeded
Stopping job with name chaiml-98p-2ff-chaiml-m-56461-v3-uploader
Pipeline stage VLLMUploader completed in 73.91s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-98p-2ff-chaiml-m-56461-v3
Waiting for inference service chaiml-98p-2ff-chaiml-m-56461-v3 to be ready
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-98p-2ff-chaiml-m-56461-v3 ready after 654.3805885314941s
Pipeline stage VLLMDeployer completed in 654.96s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.414701223373413s
Received healthy response to inference request in 1.1750342845916748s
Received healthy response to inference request in 1.2546987533569336s
Received healthy response to inference request in 1.3232131004333496s
Received healthy response to inference request in 1.1470487117767334s
Received healthy response to inference request in 1.255333662033081s
Received healthy response to inference request in 1.2701516151428223s
Received healthy response to inference request in 1.7184114456176758s
Received healthy response to inference request in 1.2220351696014404s
Received healthy response to inference request in 1.2414898872375488s
Received healthy response to inference request in 1.2046515941619873s
Received healthy response to inference request in 1.235447883605957s
Received healthy response to inference request in 1.1751537322998047s
Received healthy response to inference request in 1.1771185398101807s
Received healthy response to inference request in 1.1400976181030273s
Received healthy response to inference request in 1.1908109188079834s
Received healthy response to inference request in 1.2188546657562256s
Received healthy response to inference request in 1.2913241386413574s
Received healthy response to inference request in 1.4371254444122314s
Received healthy response to inference request in 1.1590473651885986s
Received healthy response to inference request in 1.1560122966766357s
Received healthy response to inference request in 1.2303900718688965s
Received healthy response to inference request in 1.286949634552002s
Received healthy response to inference request in 1.2770345211029053s
Received healthy response to inference request in 1.5132029056549072s
Received healthy response to inference request in 1.220824956893921s
Received healthy response to inference request in 1.3228800296783447s
Received healthy response to inference request in 1.3196892738342285s
Received healthy response to inference request in 1.2438888549804688s
Received healthy response to inference request in 1.2219464778900146s
30 requests
0 failed requests
5th percentile: 1.1510823249816895
10th percentile: 1.1587438583374023
20th percentile: 1.1767255783081054
30th percentile: 1.2145937442779542
40th percentile: 1.2219996929168702
50th percentile: 1.238468885421753
60th percentile: 1.2549527168273926
70th percentile: 1.2800090551376342
80th percentile: 1.3203274250030517
90th percentile: 1.416943645477295
95th percentile: 1.4789680480957028
99th percentile: 1.658900969028473
mean time: 1.2681522925694784
Pipeline stage StressChecker completed in 41.98s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
chaiml-98p-2ff-chaiml-m_56461_v3 status is now deployed due to DeploymentManager action
chaiml-98p-2ff-chaiml-m_56461_v3 status is now inactive due to auto deactivation removed underperforming models