-
Notifications
You must be signed in to change notification settings - Fork 20
Description
Execution of inference_stack_deploy.sh menu option "Update Deployed Inference Cluster --> Manage LLM Models" fails when attempting to deploy or undeploy models with the error "Kubernetes server is not reachable. Aborting Inference Stack Deployment", despite a healthy, running cluster prior to execution.
Our user has sudo privileges, but it requires the user's password. We are able to fix the above error by caching the user's sudo authorization by running any sudo command prior to execution of inference_stack_deploy.sh, such as "sudo echo sudoing".
EI version is 1.3.1
Here is the output of the failure:
user@master1:~/Enterprise-Inference/core$ ./inference-stack-deploy.sh
| Intel AI for Enterprise Inference |
|---|
| 1) Provision Enterprise Inference Cluster |
| 2) Decommission Existing Cluster |
| 3) Update Deployed Inference Cluster |
| 4) Brownfield Deployment of Enterprise Inference |
| --------------------------------------------------------- |
| Please choose an option (1, 2, 3 or 4): |
3
| Update Existing Cluster |
|---|
| 1) Manage Worker Nodes |
| 2) Manage LLM Models |
| ------------------------------------------------ |
| Please choose an option (1 or 2): |
2
| Manage LLM Models |
|---|
| 1) Deploy Model |
| 2) Undeploy Model |
| 3) List Installed Models |
| 4) Deploy Model from Hugging Face |
| 5) Remove Model using deployment name |
| ------------------------------------------------ |
| Please choose an option (1, 2, 3, or 4): |
1
Configuration file found, setting vars!
Metadata configuration file found, setting vars!
....
TASK [kubernetes-precheck : Check if kubectl is Available] ***********************************************************************************************************************
ok: [master1]
Monday 08 December 2025 09:59:07 -0600 (0:00:00.434) 0:00:00.829 *******
TASK [kubernetes-precheck : Fail if kubectl is not Available] ********************************************************************************************************************
fatal: [master1]: FAILED! => {"changed": false, "msg": "Kubernetes server is not reachable. Aborting Inference Stack Deployment"}