Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v3b-kakit-69132-v3-uploader
Waiting for job on chaiml-muster-v3b-kakit-69132-v3-uploader to finish
chaiml-muster-v3b-kakit-69132-v3-uploader: Using quantization_mode: w4a16
chaiml-muster-v3b-kakit-69132-v3-uploader: Checking if ChaiML/muster-v3b-kakit-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-muster-v3b-kakit-69132-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v3b-kakit-69132-v3-uploader: Downloading snapshot of ChaiML/muster-v3b-kakit-q235b-lr1e4ep1r64g4-W4A16...
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-muster-v3b-kakit-69132-v3-uploader: Downloaded in 51.386s
chaiml-muster-v3b-kakit-69132-v3-uploader: Processed model ChaiML/muster-v3b-kakit-q235b-lr1e4ep1r64g4 in 52.041s
chaiml-muster-v3b-kakit-69132-v3-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v3b-kakit-69132-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v3b-kakit-69132-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v3b-kakit-69132-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v3b-kakit-69132-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v3b-kakit-69132-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v3b-kakit-69132-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v3b-kakit-69132-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v3b-kakit-69132-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v3b-kakit-69132-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/added_tokens.json
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/config.json
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/quantization_config.json
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/merges.txt
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/vocab.json
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model.safetensors.index.json
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/tokenizer.json
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Connection pool is full, discarding connection: %s. Connection pool size: %s
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00027-of-00027.safetensors
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00001-of-00027.safetensors
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00002-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00016-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00020-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00006-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00017-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00022-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00008-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00011-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00015-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00025-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00007-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00014-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00019-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00024-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00013-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00004-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00009-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00005-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00010-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00023-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00021-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00018-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00012-of-00027.safetensors
chaiml-muster-v3b-kakit-69132-v3-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v3b-kakit-69132-v3/default/model-00003-of-00027.safetensors
Job chaiml-muster-v3b-kakit-69132-v3-uploader completed after 659.17s with status: succeeded
Stopping job with name chaiml-muster-v3b-kakit-69132-v3-uploader
Pipeline stage VLLMUploader completed in 659.57s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.50s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v3b-kakit-69132-v3
Waiting for inference service chaiml-muster-v3b-kakit-69132-v3 to be ready
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service chaiml-muster-v3b-kakit-69132-v3 ready after 900.5479514598846s
Pipeline stage VLLMDeployer completed in 900.95s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.6697657108306885s
Received healthy response to inference request in 2.0982651710510254s
Received healthy response to inference request in 2.366145372390747s
Received healthy response to inference request in 1.944347620010376s
Received healthy response to inference request in 2.324821710586548s
Received healthy response to inference request in 1.8902256488800049s
Received healthy response to inference request in 2.148775100708008s
Received healthy response to inference request in 1.895397424697876s
Received healthy response to inference request in 1.857076644897461s
Received healthy response to inference request in 2.4229767322540283s
Received healthy response to inference request in 2.258497953414917s
Received healthy response to inference request in 1.9911370277404785s
Received healthy response to inference request in 1.90169358253479s
Received healthy response to inference request in 2.1056110858917236s
Received healthy response to inference request in 1.9984087944030762s
Received healthy response to inference request in 2.2929189205169678s
Received healthy response to inference request in 2.1964309215545654s
Received healthy response to inference request in 2.1050071716308594s
Received healthy response to inference request in 2.0931637287139893s
Received healthy response to inference request in 2.4895451068878174s
Received healthy response to inference request in 1.9827518463134766s
Received healthy response to inference request in 1.9900844097137451s
Received healthy response to inference request in 2.5831899642944336s
Received healthy response to inference request in 2.169158935546875s
Received healthy response to inference request in 1.969480276107788s
Received healthy response to inference request in 2.2068140506744385s
Received healthy response to inference request in 2.0375425815582275s
Received healthy response to inference request in 2.503772020339966s
Received healthy response to inference request in 2.589144468307495s
Received healthy response to inference request in 1.8492250442504883s
30 requests
0 failed requests
5th percentile: 1.8719936966896058
10th percentile: 1.8948802471160888
20th percentile: 1.9644537448883057
30th percentile: 1.9908212423324585
40th percentile: 2.070915269851685
50th percentile: 2.1053091287612915
60th percentile: 2.1800677299499513
70th percentile: 2.268824243545532
80th percentile: 2.3775116443634037
90th percentile: 2.5117138147354128
95th percentile: 2.5864649415016174
99th percentile: 2.6463855504989624
mean time: 2.1643791675567625
Pipeline stage StressChecker completed in 68.87s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.68s
Shutdown handler de-registered
chaiml-muster-v3b-kakit_69132_v3 status is now deployed due to DeploymentManager action
chaiml-muster-v3b-kakit_69132_v3 status is now inactive due to auto deactivation removed underperforming models