Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-reward-dpo-52f9-47866-v1-uploader
Waiting for job on chaiml-reward-dpo-52f9-47866-v1-uploader to finish
chaiml-reward-dpo-52f9-47866-v1-uploader: Using quantization_mode: w4a16
chaiml-reward-dpo-52f9-47866-v1-uploader: Repo ChaiML/reward-dpo-52f9-chaiml-235b-sft-prod-rm_38783_v1-W4A16 already ends in W4A16. Skipping...
chaiml-reward-dpo-52f9-47866-v1-uploader: Checking if ChaiML/reward-dpo-52f9-chaiml-235b-sft-prod-rm_38783_v1-W4A16 already exists in ChaiML
chaiml-reward-dpo-52f9-47866-v1-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-reward-dpo-52f9-47866-v1-uploader: Downloading snapshot of ChaiML/reward-dpo-52f9-chaiml-235b-sft-prod-rm_38783_v1-W4A16...
chaiml-reward-dpo-52f9-47866-v1-uploader: Downloaded in 54.295s
chaiml-reward-dpo-52f9-47866-v1-uploader: Processed model ChaiML/reward-dpo-52f9-chaiml-235b-sft-prod-rm_38783_v1-W4A16 in 54.836s
chaiml-reward-dpo-52f9-47866-v1-uploader: creating bucket guanaco-vllm-models
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-52f9-47866-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-reward-dpo-52f9-47866-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-52f9-47866-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-52f9-47866-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-52f9-47866-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-reward-dpo-52f9-47866-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-reward-dpo-52f9-47866-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-reward-dpo-52f9-47866-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-reward-dpo-52f9-47866-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-reward-dpo-52f9-47866-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-reward-dpo-52f9-47866-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/added_tokens.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/.gitattributes
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/chat_template.jinja
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/tokenizer_config.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/special_tokens_map.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/generation_config.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/config.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/quantization_config.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/merges.txt
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model.safetensors.index.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/vocab.json
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/tokenizer.json
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00027-of-00027.safetensors
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00002-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00003-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00006-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00005-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00019-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00022-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00013-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00025-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00020-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00004-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00012-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00014-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00015-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00017-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00001-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00026-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00010-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00016-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00008-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00007-of-00027.safetensors
HTTP Request: %s %s "%s %d %s"
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00018-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00023-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00011-of-00027.safetensors
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00009-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00024-of-00027.safetensors
chaiml-reward-dpo-52f9-47866-v1-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-reward-dpo-52f9-47866-v1/default/model-00021-of-00027.safetensors
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Job chaiml-reward-dpo-52f9-47866-v1-uploader completed after 836.84s with status: succeeded
Stopping job with name chaiml-reward-dpo-52f9-47866-v1-uploader
Pipeline stage VLLMUploader completed in 837.38s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-reward-dpo-52f9-47866-v1
Waiting for inference service chaiml-reward-dpo-52f9-47866-v1 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service chaiml-reward-dpo-52f9-47866-v1 ready after 1006.0530643463135s
Pipeline stage VLLMDeployer completed in 1006.61s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2286691665649414s
Received healthy response to inference request in 2.3566558361053467s
Received healthy response to inference request in 2.165090322494507s
Received healthy response to inference request in 2.0705795288085938s
Received healthy response to inference request in 2.090191125869751s
Received healthy response to inference request in 2.119544267654419s
Received healthy response to inference request in 2.071739435195923s
Received healthy response to inference request in 2.002694845199585s
Received healthy response to inference request in 2.3539133071899414s
Received healthy response to inference request in 2.10102915763855s
Received healthy response to inference request in 2.2087578773498535s
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Received healthy response to inference request in 2.31046199798584s
Received healthy response to inference request in 2.1295480728149414s
Received healthy response to inference request in 2.234154462814331s
Received healthy response to inference request in 1.9476335048675537s
Received healthy response to inference request in 1.998495101928711s
Received healthy response to inference request in 1.9578778743743896s
Received healthy response to inference request in 2.062425374984741s
Received healthy response to inference request in 2.084888219833374s
Received healthy response to inference request in 2.0176260471343994s
Received healthy response to inference request in 2.197904586791992s
Received healthy response to inference request in 1.9958040714263916s
Received healthy response to inference request in 2.1472015380859375s
Received healthy response to inference request in 2.3802413940429688s
Received healthy response to inference request in 2.043301582336426s
Received healthy response to inference request in 1.9758689403533936s
Received healthy response to inference request in 2.278724431991577s
Received healthy response to inference request in 2.233283758163452s
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Received healthy response to inference request in 2.0408406257629395s
Received healthy response to inference request in 2.0738818645477295s
30 requests
0 failed requests
5th percentile: 1.9659738540649414
10th percentile: 1.9938105583190917
20th percentile: 2.0146398067474367
30th percentile: 2.0566882371902464
40th percentile: 2.0730248928070067
50th percentile: 2.0956101417541504
60th percentile: 2.1366094589233398
70th percentile: 2.2011605739593505
80th percentile: 2.233457899093628
90th percentile: 2.31480712890625
95th percentile: 2.3554216980934144
99th percentile: 2.3734015822410583
mean time: 2.1293009440104167
Pipeline stage StressChecker completed in 67.66s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
Shutdown handler de-registered
chaiml-reward-dpo-52f9-_47866_v1 status is now deployed due to DeploymentManager action
chaiml-reward-dpo-52f9-_47866_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-reward-dpo-52f9-_47866_v1 status is now torndown due to DeploymentManager action