Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-prm-kimi-v1-300-92220-v11-uploader
Waiting for job on chaiml-prm-kimi-v1-300-92220-v11-uploader to finish
chaiml-prm-kimi-v1-300-92220-v11-uploader: Using quantization_mode: none
chaiml-prm-kimi-v1-300-92220-v11-uploader: Downloading snapshot of ChaiML/prm_kimi_v1_300k_default8b-cosine-lr1e6g32...
chaiml-prm-kimi-v1-300-92220-v11-uploader: Downloaded in 8.645s
chaiml-prm-kimi-v1-300-92220-v11-uploader: Processed model ChaiML/prm_kimi_v1_300k_default8b-cosine-lr1e6g32 in 14.179s
chaiml-prm-kimi-v1-300-92220-v11-uploader: creating bucket guanaco-vllm-models
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-prm-kimi-v1-300-92220-v11-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-prm-kimi-v1-300-92220-v11-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-prm-kimi-v1-300-92220-v11-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-prm-kimi-v1-300-92220-v11-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-prm-kimi-v1-300-92220-v11-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-prm-kimi-v1-300-92220-v11-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-prm-kimi-v1-300-92220-v11-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-prm-kimi-v1-300-92220-v11-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-prm-kimi-v1-300-92220-v11-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-prm-kimi-v1-300-92220-v11-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-prm-kimi-v1-300-92220-v11-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/README.md
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/.gitattributes
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/model.safetensors.index.json
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/config.json
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/tokenizer_config.json
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/special_tokens_map.json
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/tokenizer.json
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/model-00004-of-00004.safetensors s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/model-00004-of-00004.safetensors
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/model-00001-of-00004.safetensors s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/model-00001-of-00004.safetensors
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/model-00003-of-00004.safetensors s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/model-00003-of-00004.safetensors
chaiml-prm-kimi-v1-300-92220-v11-uploader: cp /dev/shm/model_output/model-00002-of-00004.safetensors s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default/model-00002-of-00004.safetensors
Job chaiml-prm-kimi-v1-300-92220-v11-uploader completed after 41.96s with status: succeeded
Stopping job with name chaiml-prm-kimi-v1-300-92220-v11-uploader
Pipeline stage VLLMUploader completed in 42.42s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.70s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-prm-kimi-v1-300-92220-v11
Waiting for inference service chaiml-prm-kimi-v1-300-92220-v11 to be ready
2026-03-25T18:53:29.296786+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:54:29.383705+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:55:29.469967+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:56:29.555684+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:57:29.643781+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:58:29.729986+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T18:59:29.818812+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:00:29.906094+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:01:29.993467+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:02:30.082663+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:03:30.199968+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:04:30.288810+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:05:30.374712+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:06:30.469461+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:07:30.556042+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:08:30.648801+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:09:30.737655+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
Failed to get response for submission qwen-qwen3-5-35b-a3b-fp8_v7: ('http://qwen-qwen3-5-35b-a3b-fp8-v7-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/completions', 'request timeout')
2026-03-25T19:10:30.827318+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:11:30.913998+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:12:31.002196+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:13:31.094308+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:14:31.182249+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:15:31.272610+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:16:31.363338+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:17:31.458431+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
Failed to get request counts for guanaco-submitter. Falling back to default
2026-03-25T19:18:31.551228+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
2026-03-25T19:19:31.650187+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:20:31.756578+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:21:31.858855+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:22:31.967656+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:23:32.066578+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:24:32.170100+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:25:32.274238+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:26:32.383271+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:27:32.480797+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:28:32.583511+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:29:32.682545+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:30:32.781965+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:31:32.888472+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
2026-03-25T19:32:32.997729+00:00 monitor updated for chaiml-prm-kimi-v1-300_92220_v11
Tearing down inference service chaiml-prm-kimi-v1-300-92220-v11
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-prm-kimi-v1-300-92220-v11. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'5\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'300\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\', \'serving.knative.dev/progress-deadline\': \'40m\'}, \'creationTimestamp\': \'2026-03-25T18:53:14Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, \'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}, \'f:serving.knative.dev/progress-deadline\': {}}, \'f:labels\': {\'.\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:tion\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}, \'f:podAffinity\': {\'.\': {}, \'f:tion\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2026-03-25T18:53:14Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2026-03-25T18:53:14Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:lastFailureInfo\': {\'.\': {}, \'f:exitCode\': {}, \'f:message\': {}, \'f:reason\': {}}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2026-03-25T19:33:15Z\'}], \'name\': \'chaiml-prm-kimi-v1-300-92220-v11\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'548791943\', \'uid\': \'2db93cb8-ec5b-452d-aae3-dae1ed527178\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'tion\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'L40S\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'L40S\']}]}]}}, \'podAffinity\': {\'tion\': [{\'podAffinityTerm\': {\'labelSelector\': {\'matchLabels\': {\'serving.kserve.io/inferenceservice\': \'chaiml-prm-kimi-v1-300-92220-v11\'}}, \'topologyKey\': \'kubernetes.io/hostname\'}, \'weight\': 100}]}}, \'containerConcurrency\': 0, \'containers\': [{\'args\': [\'serve\', \'s3://guanaco-vllm-models/chaiml-prm-kimi-v1-300-92220-v11/default\', \'--port\', \'8080\', \'--tensor-parallel-size\', \'1\', \'--max-model-len\', \'10240\', \'--max-num-batched-tokens\', \'10240\', \'--max-num-seqs\', \'64\', \'--pooler-config\', \'{"use_activation": false}\', \'--load-format\', \'runai_streamer\', \'--served-model-name\', \'ChaiML/prm_kimi_v1_300k_default8b-cosine-lr1e6g32\', \'--model-loader-extra-config\', \'{"distributed": true, "concurrency": 2}\'], \'env\': [{\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'1\'}, {\'name\': \'VLLM_ASSETS_CACHE\', \'value\': \'/code/vllm_assets_cache\'}, {\'name\': \'RUNAI_STREAMER_S3_USE_VIRTUAL_ADDRESSING\', \'value\': \'1\'}, {\'name\': \'RUNAI_STREAMER_CONCURRENCY\', \'value\': \'1\'}, {\'name\': \'AWS_EC2_METADATA_DISABLED\', \'value\': \'true\'}, {\'name\': \'AWS_ACCESS_KEY_ID\', \'value\': \'CWZAGMHZXKZRFGJK\'}, {\'name\': \'AWS_SECRET_ACCESS_KEY\', \'value\': \'cwoAeWzp46q4O0sTNXOEuZ1MvZzKEFlS9DtEhnTldKp\'}, {\'name\': \'AWS_ENDPOINT_URL\', \'value\': \'https://cwobject.com\'}, {\'name\': \'HF_TOKEN\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'token\', \'name\': \'hf-token\'}}}], \'image\': \'gcr.io/chai-959f8/vllm:v0.17.1\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'failureThreshold\': 1, \'httpGet\': {\'path\': \'/v1/models\', \'port\': 8080}, \'initialDelaySeconds\': 60, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'2\', \'memory\': \'64Gi\', \'nvidia.com/gpu\': \'1\'}, \'requests\': {\'cpu\': \'2\', \'memory\': \'64Gi\', \'nvidia.com/gpu\': \'1\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 40, \'minReplicas\': 0, \'priorityClassName\': \'chaiverse\', \'timeout\': 20, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\', \'sizeLimit\': \'64Gi\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-prm-kimi-v1-300-92220-v11-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2026-03-25T18:55:52Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2026-03-25T19:33:15Z\', \'message\': \'Revision "chaiml-prm-kimi-v1-300-92220-v11-predictor-00001" failed with message: Container failed with: ontextlib.py", line 210, in __aenter__\\n(APIServer pid=1) return await anext(self.gen)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 122, in build_async_engine_client_from_engine_args\\n(APIServer pid=1) vllm_config = engine_args.create_engine_config(usage_context=usage_context)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1477, in create_engine_config\\n(APIServer pid=1) model_config = self.create_model_config()\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1329, in create_model_config\\n(APIServer pid=1) return ModelConfig(\\n(APIServer pid=1) ^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__\\n(APIServer pid=1) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)\\n(APIServer pid=1) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig\\n(APIServer pid=1) Value error, Invalid repository ID or local directory specified: \\\'/code/vllm_assets_cache/model_streamer/f7dc75a2\\\'.\\n(APIServer pid=1) Please verify the following requirements:\\n(APIServer pid=1) 1. Provide a valid Hugging Face repository ID.\\n(APIServer pid=1) 2. Specify a local directory that contains a recognized configuration file.\\n(APIServer pid=1) - For Hugging Face models: ensure the presence of a \\\'config.json\\\'.\\n(APIServer pid=1) - For Mistral models: ensure the presence of a \\\'params.json\\\'.\\n(APIServer pid=1) [type=value_error, input_value=ArgsKwargs((), {\\\'model\\\': ...rocessor_plugin\\\': None}), input_type=ArgsKwargs]\\n(APIServer pid=1) For further information visit https://errors.pydantic.dev/2.12/v/value_error\\n.\', \'reason\': \'RevisionFailed\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2026-03-25T18:55:52Z\', \'message\': \'Configuration "chaiml-prm-kimi-v1-300-92220-v11-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2026-03-25T18:55:52Z\', \'message\': \'Configuration "chaiml-prm-kimi-v1-300-92220-v11-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2026-03-25T18:55:52Z\', \'message\': \'Configuration "chaiml-prm-kimi-v1-300-92220-v11-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2026-03-25T18:55:52Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'lastFailureInfo\': {\'exitCode\': 1, \'message\': \'ontextlib.py", line 210, in __aenter__\\n(APIServer pid=1) return await anext(self.gen)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 122, in build_async_engine_client_from_engine_args\\n(APIServer pid=1) vllm_config = engine_args.create_engine_config(usage_context=usage_context)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1477, in create_engine_config\\n(APIServer pid=1) model_config = self.create_model_config()\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1329, in create_model_config\\n(APIServer pid=1) return ModelConfig(\\n(APIServer pid=1) ^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__\\n(APIServer pid=1) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)\\n(APIServer pid=1) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig\\n(APIServer pid=1) Value error, Invalid repository ID or local directory specified: \\\'/code/vllm_assets_cache/model_streamer/f7dc75a2\\\'.\\n(APIServer pid=1) Please verify the following requirements:\\n(APIServer pid=1) 1. Provide a valid Hugging Face repository ID.\\n(APIServer pid=1) 2. Specify a local directory that contains a recognized configuration file.\\n(APIServer pid=1) - For Hugging Face models: ensure the presence of a \\\'config.json\\\'.\\n(APIServer pid=1) - For Mistral models: ensure the presence of a \\\'params.json\\\'.\\n(APIServer pid=1) [type=value_error, input_value=ArgsKwargs((), {\\\'model\\\': ...rocessor_plugin\\\': None}), input_type=ArgsKwargs]\\n(APIServer pid=1) For further information visit https://errors.pydantic.dev/2.12/v/value_error\\n\', \'reason\': \'ModelLoadFailed\'}, \'states\': {\'activeModelState\': \'\', \'targetModelState\': \'FailedToLoad\'}, \'transitionStatus\': \'BlockedByFailedLoad\'}, \'observedGeneration\': 1}}')
run pipeline stage %s
Running pipeline stage VLLMDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage VLLMDeleter completed in 0.21s
run pipeline stage %s
Running pipeline stage VLLMModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Pipeline stage VLLMModelDeleter completed in 0.34s
Shutdown handler de-registered
chaiml-prm-kimi-v1-300_92220_v11 status is now failed due to DeploymentManager action
chaiml-prm-kimi-v1-300_92220_v11 status is now torndown due to DeploymentManager action