Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name evelyn777-chai-sft-3b-v3-v1-uploader
Waiting for job on evelyn777-chai-sft-3b-v3-v1-uploader to finish
evelyn777-chai-sft-3b-v3-v1-uploader: Using quantization_mode: none
evelyn777-chai-sft-3b-v3-v1-uploader: Downloading snapshot of evelyn777/chai-sft-3b-v3...
evelyn777-chai-sft-3b-v3-v1-uploader: Fetching 13 files: 100%|██████████| 13/13 [00:04<00:00, 2.69it/s]
evelyn777-chai-sft-3b-v3-v1-uploader: Downloaded in 4.967s
evelyn777-chai-sft-3b-v3-v1-uploader: Processed model evelyn777/chai-sft-3b-v3 in 7.482s
evelyn777-chai-sft-3b-v3-v1-uploader: creating bucket guanaco-vllm-models
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v3-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
evelyn777-chai-sft-3b-v3-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v3-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v3-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v3-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
evelyn777-chai-sft-3b-v3-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
evelyn777-chai-sft-3b-v3-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
evelyn777-chai-sft-3b-v3-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
evelyn777-chai-sft-3b-v3-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
evelyn777-chai-sft-3b-v3-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
evelyn777-chai-sft-3b-v3-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/.gitattributes
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/config.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/generation_config.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/model.safetensors.index.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/merges.txt
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/vocab.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/added_tokens.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/chat_template.jinja
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/special_tokens_map.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/tokenizer_config.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/tokenizer.json
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/model-00002-of-00002.safetensors
evelyn777-chai-sft-3b-v3-v1-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/evelyn777-chai-sft-3b-v3-v1/model-00001-of-00002.safetensors
Job evelyn777-chai-sft-3b-v3-v1-uploader completed after 83.39s with status: succeeded
Stopping job with name evelyn777-chai-sft-3b-v3-v1-uploader
Pipeline stage VLLMUploader completed in 84.07s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service evelyn777-chai-sft-3b-v3-v1
Waiting for inference service evelyn777-chai-sft-3b-v3-v1 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service evelyn777-chai-sft-3b-v3-v1 ready after 170.72489404678345s
Pipeline stage VLLMDeployer completed in 171.24s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 0.742628812789917s
Received healthy response to inference request in 0.6312921047210693s
Received healthy response to inference request in 0.9274861812591553s
Received healthy response to inference request in 0.9166789054870605s
Received healthy response to inference request in 0.7227349281311035s
Received healthy response to inference request in 0.880274772644043s
Received healthy response to inference request in 0.6782958507537842s
Received healthy response to inference request in 1.0423674583435059s
Received healthy response to inference request in 0.680474042892456s
Received healthy response to inference request in 0.9343841075897217s
Received healthy response to inference request in 0.8595371246337891s
Received healthy response to inference request in 0.7741615772247314s
Received healthy response to inference request in 0.7248513698577881s
Received healthy response to inference request in 0.7774312496185303s
Received healthy response to inference request in 0.974301815032959s
Received healthy response to inference request in 0.7250728607177734s
Received healthy response to inference request in 0.8126671314239502s
Received healthy response to inference request in 0.5421137809753418s
Received healthy response to inference request in 0.7672264575958252s
Received healthy response to inference request in 0.6419064998626709s
Received healthy response to inference request in 0.8613858222961426s
Received healthy response to inference request in 0.621509313583374s
Received healthy response to inference request in 0.8125853538513184s
Received healthy response to inference request in 0.6998491287231445s
Received healthy response to inference request in 0.7660861015319824s
Received healthy response to inference request in 0.715524435043335s
Received healthy response to inference request in 0.692631721496582s
Received healthy response to inference request in 0.8340363502502441s
Received healthy response to inference request in 0.789459228515625s
Received healthy response to inference request in 0.7808940410614014s
30 requests
0 failed requests
5th percentile: 0.6259115695953369
10th percentile: 0.6408450603485107
20th percentile: 0.6902001857757568
30th percentile: 0.720571780204773
40th percentile: 0.7356064319610596
50th percentile: 0.7706940174102783
60th percentile: 0.7843201160430908
70th percentile: 0.8190778970718383
80th percentile: 0.8651636123657227
90th percentile: 0.928175973892212
95th percentile: 0.9563388466835021
99th percentile: 1.0226284217834474
mean time: 0.7776616175969442
Pipeline stage StressChecker completed in 26.50s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.75s
Shutdown handler de-registered
evelyn777-chai-sft-3b-v3_v1 status is now deployed due to DeploymentManager action
evelyn777-chai-sft-3b-v3_v1 status is now inactive due to auto deactivation removed underperforming models