
Commit dcde385

Merge pull request #53 from sauagarwa/main
Changed default model to ibm-granite/granite-3.1-8b-instruct
2 parents 3e7fd92 + e0355d4 commit dcde385

File tree

9 files changed: +25 -24 lines changed


README.md

Lines changed: 6 additions & 6 deletions
@@ -67,9 +67,9 @@ _Figure 4. Schematic diagram for Ingestion of data for RAG._
 _Figure 5. Schematic diagram for RAG demo augmented query._

-In Figure 5, we can see RAG augmented query. Community version of [Mistral-7B-Instruct](https://huggingface.co/mistral-community/Mistral-7B-Instruct-v0.3) model is used for language processing, LangChain to
+In Figure 5, we can see the RAG augmented query. The [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) model is used for language processing, LangChain to
 integrate different tools of the LLM-based application together and to process the PDF
-files and web pages, vector database provider such as EDB Postgres for Kubernetes or Redis, is used to store vectors, and [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to serve the [Mistral-7B-Instruct](https://huggingface.co/mistral-community/Mistral-7B-Instruct-v0.3) model, Gradio is used for user interface and object storage to store language model and other datasets.
+files and web pages, a vector database such as EDB Postgres for Kubernetes or Redis is used to store vectors, [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) serves the [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) model, Gradio provides the user interface, and object storage holds the language model and other datasets.
 Solution components are deployed as microservices in the Red Hat OpenShift cluster.

@@ -84,7 +84,7 @@ _Figure 6. Proposed demo architecture with OpenShift AI_

 ### Components deployed

-- **vLLM Text Generation Inference Server:** The pattern deploys a vLLM Inference Server. The server deploys and serves `mistral-community/Mistral-7B-Instruct-v0.3` model. The server will require a GPU node.
+- **vLLM Text Generation Inference Server:** The pattern deploys a vLLM Inference Server, which serves the `ibm-granite/granite-3.1-8b-instruct` model and requires a GPU node.
 - **EDB Postgres for Kubernetes / Redis Server:** A Vector Database server is deployed to store vector embeddings created from Red Hat product documentation.
 - **Populate VectorDb Job:** The job creates the embeddings and populates the vector database.
 - **LLM Application:** This is a Chatbot application that can generate a project proposal by augmenting the LLM with the Red Hat product documentation stored in the vector db.

@@ -112,7 +112,7 @@ cd rag-llm-gitops

 ### Configuring model

-This pattern deploys community version of [Mistral-7B-Instruct](https://huggingface.co/mistral-community/Mistral-7B-Instruct-v0.3) out of box. Run the following command to configure vault with the model Id.
+This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) out of the box. Run the following command to configure the vault with the model Id.

 ```sh
 # Copy values-secret.yaml.template to ~/values-secret-rag-llm-gitops.yaml.

@@ -121,7 +121,7 @@ This pattern deploys community version of [Mistral-7B-Instruct](https://huggingf
 cp values-secret.yaml.template ~/values-secret-rag-llm-gitops.yaml
 ```

-To deploy a non-community [Mistral-7b-Instruct](https://huggingface.co/mistralai/) model, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Edit ~/values-secret-rag-llm-gitops.yaml to replace the `model Id` and the `Hugging Face` token.
+To deploy a model that requires a Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Edit ~/values-secret-rag-llm-gitops.yaml to set the `model Id` and the `Hugging Face` token.

 ```sh
 secrets:

@@ -130,7 +130,7 @@ secrets:
 - name: hftoken
 value: null
 - name: modelId
-value: "mistral-community/Mistral-7B-Instruct-v0.3"
+value: "ibm-granite/granite-3.1-8b-instruct"
 - name: minio
 fields:
 - name: MINIO_ROOT_USER

charts/all/llm-monitoring/kustomize/base/grafanadashboard/ai-llm-dashboard.yaml

Lines changed: 6 additions & 6 deletions
@@ -174,8 +174,8 @@ spec:
 "scopedVars": {
 "ModelID": {
 "selected": false,
-"text": "mistralai/Mistral-7B-Instruct-v0.1",
-"value": "mistralai/Mistral-7B-Instruct-v0.1"
+"text": "ibm-granite/granite-3.1-8b-instruct",
+"value": "ibm-granite/granite-3.1-8b-instruct"
 }
 },
 "targets": [

@@ -326,8 +326,8 @@ spec:
 "scopedVars": {
 "ModelID": {
 "selected": false,
-"text": "mistralai/Mistral-7B-Instruct-v0.1",
-"value": "mistralai/Mistral-7B-Instruct-v0.1"
+"text": "ibm-granite/granite-3.1-8b-instruct",
+"value": "ibm-granite/granite-3.1-8b-instruct"
 }
 },
 "targets": [

@@ -462,8 +462,8 @@ spec:
 "scopedVars": {
 "ModelID": {
 "selected": false,
-"text": "mistralai/Mistral-7B-Instruct-v0.1",
-"value": "mistralai/Mistral-7B-Instruct-v0.1"
+"text": "ibm-granite/granite-3.1-8b-instruct",
+"value": "ibm-granite/granite-3.1-8b-instruct"
 }
 },
 "targets": [

charts/all/llm-serving-service/templates/inference-service.yaml

Lines changed: 3 additions & 3 deletions
@@ -2,11 +2,11 @@ apiVersion: serving.kserve.io/v1beta1
 kind: InferenceService
 metadata:
 annotations:
-openshift.io/display-name: mistral-7b-instruct
+openshift.io/display-name: ibm-granite-instruct
 serving.knative.openshift.io/enablePassthrough: 'true'
 sidecar.istio.io/inject: 'true'
 sidecar.istio.io/rewriteAppHTTPProbers: 'true'
-name: mistral-7b-instruct
+name: ibm-granite-instruct
 namespace: rag-llm
 labels:
 opendatahub.io/dashboard: 'true'

@@ -27,7 +27,7 @@ spec:
 cpu: '2'
 memory: 8Gi
 nvidia.com/gpu: '1'
-runtime: mistral-7b-instruct
+runtime: ibm-granite-instruct
 restartPolicy: OnFailure
 tolerations:
 - effect: NoSchedule

charts/all/llm-serving-service/templates/serving-runtime.yaml

Lines changed: 4 additions & 4 deletions
@@ -5,8 +5,8 @@ metadata:
 opendatahub.io/accelerator-name: nvidia-gpu
 opendatahub.io/apiProtocol: REST
 opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
-openshift.io/display-name: mistral-7b-instruct
-name: mistral-7b-instruct
+openshift.io/display-name: ibm-granite-instruct
+name: ibm-granite-instruct
 namespace: rag-llm
 labels:
 opendatahub.io/dashboard: 'true'

@@ -19,7 +19,7 @@ spec:
 - '--port=8080'
 - '--model=/cache/models'
 - '--distributed-executor-backend=mp'
-- '--served-model-name=mistral-7b-instruct'
+- '--served-model-name=ibm-granite-instruct'
 - '--max-model-len=4096'
 - '--dtype=half'
 - '--gpu-memory-utilization'

@@ -44,7 +44,7 @@ spec:
 name: huggingface-secret
 - name: HF_HUB_OFFLINE
 value: '0'
-image: 'quay.io/modh/vllm@sha256:b51fde66f162f1a78e8c027320dddf214732d5345953b1599a84fe0f0168c619'
+image: 'quay.io/modh/vllm@sha256:c86ff1e89c86bc9821b75d7f2bbc170b3c13e3ccf538bf543b1110f23e056316'
 name: kserve-container
 ports:
 - containerPort: 8080
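The `--max-model-len=4096` flag above caps the combined length of the prompt and the generated completion; vLLM rejects requests that exceed it. As a minimal sketch (not part of the pattern), the remaining prompt budget under these flags works out as follows:

```python
# Token budget implied by the ServingRuntime flags above: vLLM rejects any
# request whose prompt plus requested completion exceeds --max-model-len.
MAX_MODEL_LEN = 4096   # from '--max-model-len=4096' in the ServingRuntime
MAX_NEW_TOKENS = 1024  # the pattern's default 'max_new_tokens' in config.yaml

def max_prompt_tokens(max_new_tokens: int = MAX_NEW_TOKENS,
                      max_model_len: int = MAX_MODEL_LEN) -> int:
    """Largest prompt, in tokens, that still leaves room for the completion."""
    return max_model_len - max_new_tokens

# With the defaults, 3072 tokens remain for the RAG-augmented prompt,
# which bounds how much retrieved documentation can be stuffed into it.
```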

charts/all/rag-llm/files/config.yaml

Lines changed: 2 additions & 2 deletions
@@ -2,10 +2,10 @@ llm_providers:
 - name: "OpenShift AI (vLLM)"
 enabled: True
 models:
-- name: mistral-7b-instruct
+- name: ibm-granite-instruct
 weight: 1
 enabled: True
-url: https://mistral-7b-instruct-{{ .Values.llmui.namespace }}.{{ coalesce .Values.global.localClusterDomain .Values.global.hubClusterDomain }}/v1
+url: https://ibm-granite-instruct-{{ .Values.llmui.namespace }}.{{ coalesce .Values.global.localClusterDomain .Values.global.hubClusterDomain }}/v1
 params:
 - name: max_new_tokens
 value: 1024
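The `url` above ends in `/v1` because vLLM exposes an OpenAI-compatible REST API. A minimal client sketch (not part of the pattern; the host below is a hypothetical example, since the real one is templated from the namespace and cluster domain) might build a request like this:

```python
import json

# Hypothetical route host for illustration only; the real URL is templated
# in config.yaml as https://ibm-granite-instruct-<namespace>.<domain>/v1.
BASE_URL = "https://ibm-granite-instruct-rag-llm.apps.example.com/v1"

def build_chat_request(prompt: str, max_new_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat-completions payload for the vLLM endpoint."""
    return {
        # Must match vLLM's '--served-model-name', not the Hugging Face model ID.
        "model": "ibm-granite-instruct",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_new_tokens,
    }

payload = build_chat_request("Draft a one-paragraph project proposal summary.")
body = json.dumps(payload)
# POST `body` to f"{BASE_URL}/chat/completions" with any HTTP client.
```

Note the `model` field refers to the served-model name, which after this commit differs from the Hugging Face model ID (`ibm-granite/granite-3.1-8b-instruct`).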

values-global.yaml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ global:
 type: EDB
 # Add for model ID
 model:
-modelId: mistral-community/Mistral-7B-Instruct-v0.3
+modelId: ibm-granite/granite-3.1-8b-instruct
 main:
 clusterGroupName: hub
 multiSourceConfig:

values-hub.yaml

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ clusterGroup:
 # We can use self-referential variables because the chart calls the tpl function with these variables defined
 sharedValueFiles:
 - '/overrides/values-{{ $.Values.global.clusterPlatform }}.yaml'
+- 'values-rag-llm-gitops.yaml'
 # sharedValueFiles is a flexible mechanism that will add the listed valuefiles to every app defined in the
 # applications section. We intend this to supplement and possibly even replace previous "magic" mechanisms, though
 # we do not at present have a target date for removal.

values-rag-llm-gitops.yaml

Whitespace-only changes.

values-secret.yaml.template

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@ version: "2.0"
 # Ideally you NEVER COMMIT THESE VALUES TO GIT (although if all passwords are
 # automatically generated inside the vault this should not really matter)

-# In order to use a the standard verison of mistralai/Mistral-7B-Instruct-v0.3
+# In order to use the standard version of ibm-granite/granite-3.1-8b-instruct
 # you will need to do the following:
 # provide your token as a value for hftoken
 # NOTE: you need to add value in values-global.yaml as well

@@ -16,7 +16,7 @@ secrets:
 - name: hftoken
 value: null
 - name: modelId
-value: "mistral-community/Mistral-7B-Instruct-v0.3"
+value: "ibm-granite/granite-3.1-8b-instruct"
 - name: minio
 fields:
 - name: MINIO_ROOT_USER
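The template's NOTE says the model ID must be set both here and in values-global.yaml; this commit touches both files. A hypothetical sanity-check sketch (not part of the pattern; naive regexes, not a real YAML parser) could verify the two stay in sync:

```python
import re

# Hypothetical helper: confirm values-secret and values-global agree on the
# model ID, as the template's NOTE requires. Illustrative snippets below.
values_secret = '''
secrets:
- name: modelId
  value: "ibm-granite/granite-3.1-8b-instruct"
'''

values_global = '''
model:
  modelId: ibm-granite/granite-3.1-8b-instruct
'''

def secret_model_id(text: str):
    """Extract the modelId value from a values-secret snippet (naive regex)."""
    m = re.search(r'name:\s*modelId\s*\n\s*value:\s*"([^"]+)"', text)
    return m.group(1) if m else None

def global_model_id(text: str):
    """Extract model.modelId from a values-global snippet (naive regex)."""
    m = re.search(r'modelId:\s*(\S+)', text)
    return m.group(1) if m else None

assert secret_model_id(values_secret) == global_model_id(values_global)
```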
