|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "ef9cbd78", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# Prepare the environment for the notebook" |
| 9 | + ] |
| 10 | + }, |
| 11 | + { |
| 12 | + "cell_type": "code", |
| 13 | + "execution_count": null, |
| 14 | + "id": "b3fca639-7bf9-4199-b296-bfdccabe0b96", |
| 15 | + "metadata": {}, |
| 16 | + "outputs": [], |
| 17 | + "source": [ |
| 18 | + "# sklearn, joblib, s3fs already come with the notebook image\n", |
| 19 | + "# %pip install sklearn joblib s3fs" |
| 20 | + ] |
| 21 | + }, |
| 22 | + { |
| 23 | + "cell_type": "markdown", |
| 24 | + "id": "81293c0f", |
| 25 | + "metadata": {}, |
| 26 | + "source": [ |
| 27 | + "# Create a small model to be deployed as InferenceService" |
| 28 | + ] |
| 29 | + }, |
| 30 | + { |
| 31 | + "cell_type": "code", |
| 32 | + "execution_count": null, |
| 33 | + "id": "4d69463a-cc14-4e7e-81a1-95f7d29d60ea", |
| 34 | + "metadata": {}, |
| 35 | + "outputs": [], |
| 36 | + "source": [ |
| 37 | + "from sklearn import svm, datasets\n", |
| 38 | + "from joblib import dump" |
| 39 | + ] |
| 40 | + }, |
| 41 | + { |
| 42 | + "cell_type": "code", |
| 43 | + "execution_count": null, |
| 44 | + "id": "696e48a2-c974-4b43-9fd7-bc06f254502d", |
| 45 | + "metadata": {}, |
| 46 | + "outputs": [], |
| 47 | + "source": [ |
| 48 | + "# Create a small model with iris dataset\n", |
| 49 | + "iris = datasets.load_iris()\n", |
| 50 | + "clf = svm.SVC(gamma='scale')\n", |
| 51 | + "clf.fit(iris.data, iris.target)\n", |
| 52 | + "dump(clf, 'model.joblib')\n", |
| 53 | + "print(\"Iris model file model.joblib created!\")" |
| 54 | + ] |
| 55 | + }, |
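| 56 | + {
| 57 | + "cell_type": "code",
| 58 | + "execution_count": null,
| 59 | + "id": "f3a9c2d1-5b7e-4c8a-9d21-6e0b4a8c7f55",
| 60 | + "metadata": {},
| 61 | + "outputs": [],
| 62 | + "source": [
| 63 | + "# Optional sanity check: reload the dumped model and run a local prediction\n",
| 64 | + "# before uploading, so a broken artifact is caught before it reaches the cluster.\n",
| 65 | + "# The two sample instances are the same ones used in the request at the end of this notebook.\n",
| 66 | + "from joblib import load\n",
| 67 | + "local_clf = load('model.joblib')\n",
| 68 | + "print(local_clf.predict([[6.8, 2.8, 4.8, 1.4], [5.1, 3.5, 1.4, 0.2]]))"
| 69 | + ]
| 70 | + },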
| 56 | + { |
| 57 | + "cell_type": "markdown", |
| 58 | + "id": "e382f93b", |
| 59 | + "metadata": {}, |
| 60 | + "source": [ |
| 61 | + "# Push the created model to s3 storage (MinIO)" |
| 62 | + ] |
| 63 | + }, |
| 64 | + { |
| 65 | + "cell_type": "code", |
| 66 | + "execution_count": null, |
| 67 | + "id": "e7924d92-9812-4865-a99e-82f595e33dae", |
| 68 | + "metadata": {}, |
| 69 | + "outputs": [], |
| 70 | + "source": [ |
| 71 | + "import s3fs # for uploading the created model to minio" |
| 72 | + ] |
| 73 | + }, |
| 74 | + { |
| 75 | + "cell_type": "code", |
| 76 | + "execution_count": null, |
| 77 | + "id": "f557ff09-cbf8-49c2-b6ad-26bb648cd458", |
| 78 | + "metadata": {}, |
| 79 | + "outputs": [], |
| 80 | + "source": [ |
| 81 | + "# The notebook is already setup with minio credentials for the bucket that start with <namespace>-data\n", |
| 82 | + "s3_bucket = \"\" # Enter the name of the bucket that you want to use for uploading the model, such as the default bucket <namespace>-data\n", |
| 83 | + "if s3_bucket == \"\":\n", |
| 84 | + " raise RuntimeError(\"Please provide the name of the bucket that you want to use for uploading the model, such as the default bucket <namespace>-data\")\n", |
| 85 | + "s3_model_path = f\"{s3_bucket}/minimal-kserve-example\"\n", |
| 86 | + "print(f\"The created model will be uploaded to s3://{s3_model_path}\")" |
| 87 | + ] |
| 88 | + }, |
| 89 | + { |
| 90 | + "cell_type": "code", |
| 91 | + "execution_count": null, |
| 92 | + "id": "9dad0c7e-9718-4d76-ac0a-9ba7a843ca00", |
| 93 | + "metadata": {}, |
| 94 | + "outputs": [], |
| 95 | + "source": [ |
| 96 | + "# Upload the model to MinIO\n", |
| 97 | + "## s3fs.S3FileSystem() reads the s3 credentials and endpoint from the environment variables.\n", |
| 98 | + "## If you want to use a different s3 instance or different bucket than the default <namespace>-data, make sure to set s3fs.S3FileSystem(endpoint_url=<endpoint_url>, key=<key>, secret=<secret>)\n", |
| 99 | + "s3 = s3fs.S3FileSystem()\n", |
| 100 | + "s3.put(\"model.joblib\", f\"{s3_model_path}/model.joblib\")\n", |
| 101 | + "# List the bucket content to see if upload was successful\n", |
| 102 | + "s3.ls(s3_model_path)" |
| 103 | + ] |
| 104 | + }, |
| 105 | + { |
| 106 | + "cell_type": "markdown", |
| 107 | + "id": "0b965d76", |
| 108 | + "metadata": {}, |
| 109 | + "source": [ |
| 110 | + "# Create the InferenceService manifest that will use the uploaded model and deploy it to the cluster\n", |
| 111 | + "\n", |
| 112 | + "The model has been created and uploaded to s3 storage in the previous steps. Now we can create the InferenceService manifest that will use the uploaded model and deploy it to the cluster. \n", |
| 113 | + "You can find the InferenceService manifest template called [inferenceservice-template.yaml](inferenceservice-template.yaml) in the same folder. \n", |
| 114 | + "The manifest template contains placeholders for the model name, namespace and the s3 path to the model. \n", |
| 115 | + "Please replace the placeholders with the actual model name, namespace and s3 path to the model." |
| 116 | + ] |
| 117 | + }, |
| 118 | + { |
| 119 | + "cell_type": "markdown", |
| 120 | + "id": "bd593707-a0f5-47e8-a65b-73e467fb2c8c", |
| 121 | + "metadata": {}, |
| 122 | + "source": [ |
| 123 | + "# Deploy the InferenceService manifest to the cluster\n", |
| 124 | + "\n", |
| 125 | + "After you have created the InferenceService manifest (by replacing the placeholders in the template), with your shell at the directory containing the manifest file, you can deploy it to the cluster using the following command: \n", |
| 126 | + "```bash\n", |
| 127 | + "kubectl apply -f inferenceservice-template.yaml\n", |
| 128 | + "```\n", |
| 129 | + "\n", |
| 130 | + "This will deploy the InferenceService to the cluster and you should see the status of the InferenceService as \"Ready\" after a few moments. You can check the status of the InferenceService using the following command: \n", |
| 131 | + "```bash\n", |
| 132 | + "kubectl get inferenceservice <model-name> -n <namespace>\n", |
| 133 | + "```\n", |
| 134 | + "\n", |
| 135 | + "You can also wait for the InferenceService to be ready using the following command: \n", |
| 136 | + "```bash\n", |
| 137 | + "kubectl wait --for=condition=Ready inferenceservice <model-name> -n <namespace> --timeout=300s # Fails if the InferenceService is not ready after 5 minutes\n", |
| 138 | + "```" |
| 139 | + ] |
| 140 | + }, |
| 141 | + { |
| 142 | + "cell_type": "markdown", |
| 143 | + "id": "b575cb6d", |
| 144 | + "metadata": {}, |
| 145 | + "source": [ |
| 146 | + "# Test the deployed InferenceService with a sample request\n", |
| 147 | + "\n", |
| 148 | + "After the InferenceService is ready, you can test it with a sample request. You can use the following command to get the URL of the InferenceService: \n", |
| 149 | + "```bash\n", |
| 150 | + "kubectl get inferenceservice <model-name> -n <namespace> -o jsonpath='{.status.url}'\n", |
| 151 | + "```\n", |
| 152 | + "\n", |
| 153 | + "Enter the URL and name of the deployed InferenceService in the code cell below to test the InferenceService with a sample request." |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "code", |
| 158 | + "execution_count": null, |
| 159 | + "id": "a5ab81fc-f9b2-4dd0-b969-58a7a2236420", |
| 160 | + "metadata": {}, |
| 161 | + "outputs": [], |
| 162 | + "source": [ |
| 163 | + "inference_service_url = \"\" # Enter the URL of the deployed InferenceService, such as https://<cluster-domain>/serving/<namespace>/<model-name>\n", |
| 164 | + "inference_service_name = \"\" # Enter the name of the deployed InferenceService, such as <model-name>\n", |
| 165 | + "if not inference_service_url or not inference_service_name:\n", |
| 166 | + " raise RuntimeError(\"Please provide the URL and name of the deployed InferenceService, such as https://<cluster-domain>/serving/<namespace>/<model-name> and <model-name>\" +\n", |
| 167 | + " \"\\nYou can get the URL of the InferenceService with the following command: \" + \n", |
| 168 | + " \"\\n kubectl get inferenceservice <model-name> -n <namespace> -o jsonpath='{.status.url}'\"\n", |
| 169 | + " )" |
| 170 | + ] |
| 171 | + }, |
| 172 | + { |
| 173 | + "cell_type": "code", |
| 174 | + "execution_count": null, |
| 175 | + "id": "ce43d966-76fa-4f7a-8a4f-e085d090656d", |
| 176 | + "metadata": {}, |
| 177 | + "outputs": [], |
| 178 | + "source": [ |
| 179 | + "# Test the deployed InferenceService with a sample POST request\n", |
| 180 | + "# The deployed service is protected by an API Key.\n", |
| 181 | + "import requests\n", |
| 182 | + "INFERENCE_SERVICE_API_KEY = \"\" # If not known, ask the cluster administrator for the API Key that is used to access the deployed InferenceServices.\n", |
| 183 | + "if not INFERENCE_SERVICE_API_KEY:\n", |
| 184 | + " raise RuntimeError(\"Please provide the API Key that will be used to test the deployed InferenceService\")\n", |
| 185 | + "request_url = f\"{inference_service_url}/v1/models/{inference_service_name}:predict\" # The request URL is the URL of the InferenceService with the path /v1/models/<model-name>:predict for prediction requests for sklearn models\n", |
| 186 | + "response = requests.post(url=request_url,headers={\"X-Api-Key\": INFERENCE_SERVICE_API_KEY},\n", |
| 187 | + " json={\"instances\": [[6.8, 2.8, 4.8, 1.4], [5.1, 3.5, 1.4, 0.2]]} # JSON post body of an iris instance is [sepal_length, sepal_width, petal_length, petal_width]\n", |
| 188 | + " )\n", |
| 189 | + "print(response.json())" |
| 190 | + ] |
| 191 | + } |
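| 195 | + {
| 196 | + "cell_type": "markdown",
| 197 | + "id": "a7d41f02-3c6b-4e9d-8f10-2b5c9e7d4a33",
| 198 | + "metadata": {},
| 199 | + "source": [
| 200 | + "If the request succeeds, the response should look like `{\"predictions\": [1, 0]}`: the v1 inference protocol returns one prediction per instance, here the class indices into `iris.target_names` (so `1` is versicolor and `0` is setosa). The exact values depend on the trained model."
| 201 | + ]
| 202 | + }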
| 192 | + ], |
| 193 | + "metadata": { |
| 194 | + "kernelspec": { |
| 195 | + "display_name": "Python 3 (ipykernel)", |
| 196 | + "language": "python", |
| 197 | + "name": "python3" |
| 198 | + }, |
| 199 | + "language_info": { |
| 200 | + "codemirror_mode": { |
| 201 | + "name": "ipython", |
| 202 | + "version": 3 |
| 203 | + }, |
| 204 | + "file_extension": ".py", |
| 205 | + "mimetype": "text/x-python", |
| 206 | + "name": "python", |
| 207 | + "nbconvert_exporter": "python", |
| 208 | + "pygments_lexer": "ipython3", |
| 209 | + "version": "3.11.10" |
| 210 | + } |
| 211 | + }, |
| 212 | + "nbformat": 4, |
| 213 | + "nbformat_minor": 5 |
| 214 | +} |