
Commit 119de60

Authored by Jdubrick, maysunfaisal, JslYoon, and michael-valdron
Release v0.1.3 (#27)
* Conditionally Add All Inference Providers (#12)
  * add all providers conditionally
  * add vertex ai provider
* [RHDHPAI-1170] Add Question Validation Prompt Template Sync Script (#11)
  * add python script for syncing contents
  * add CI for running validation on PRs
  * add make commands for running script
  * run the sync with upstream prompt templates
  * cleanup whitespace
* update rag items to 1.8 (#13)
* bump latest version in readme
* Update build commands readme.md, add cache info for local run in lightspeed-stack.yaml (#16)
* [RHIDP-10054] Add YAML Formatter (#17)
  * v0.0.0
  * add Prettier for formatting and update docs
  * resolve conflict and reformat
* add default for vllm envs (#18)
* Konflux onboarding prep work (#19)
  * add license
  * add license headers
  * add build step to add license file into container image
  * fix image tags and digests
  * run second build instruction as root
  * add enterprise contract labels for llama stack to image build
  * ignore existence errors from creating /license directory during image build
  * fix user switching order
  * include additional instructions about accessing host model server
  * feedback: remove extra descriptor for running locally
  * feedback: bump project version to match latest release version
* add troubleshooting section (#20)
* update prompt template (#22)
* [RHIDP-11190] Update Llama Stack To 0.3.5 (#24)
  * move to llama stack 0.3.4 and remove safety shield
  * update readme with 0.3 info
  * update lightspeed provider tag (could become redundant)
  * update llama stack to 0.3.5
  * update run.yaml to llama v0.3.x standard
  * update mount reference to use 'rag-content'
  * add llama guard
  * overhaul readme
  * update no guard run
  * use experimental 1.8 rag build
* update deployment doc (#25)
* prepare for 0.1.3 release
* update version in containerfile

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
Signed-off-by: Lucas <lyoon@redhat.com>
Signed-off-by: Michael Valdron <mvaldron@redhat.com>
Co-authored-by: Maysun Faisal <31771087+maysunfaisal@users.noreply.github.com>
Co-authored-by: Lucas Yoon <94267691+JslYoon@users.noreply.github.com>
Co-authored-by: Michael Valdron <mvaldron@redhat.com>
1 parent f3b3d4f · commit 119de60

9 files changed: 458 additions & 353 deletions


Containerfile

Lines changed: 1 addition & 1 deletion
```diff
@@ -72,5 +72,5 @@ LABEL name=rhdh-lightspeed-llama-stack
 LABEL release=1.8
 LABEL url="https://github.com/redhat-ai-dev/llama-stack"
 LABEL vendor="Red Hat, Inc."
-LABEL version=0.1.2
+LABEL version=0.1.3
 LABEL summary="Red Hat Developer Hub Lightspeed Llama Stack"
```
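The version label is what downstream tooling reads off the published image. One way to spot-check the bump after the release is pushed, assuming a local podman install and that podman exposes the image's `Labels` map to Go templates as usual, is:

```sh
# Pull the tagged release and read back the version label set in the Containerfile.
podman pull quay.io/redhat-ai-dev/llama-stack:0.1.3
podman image inspect quay.io/redhat-ai-dev/llama-stack:0.1.3 \
  --format '{{ index .Labels "version" }}'
# expected output: 0.1.3
```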

Makefile

Lines changed: 2 additions & 3 deletions
```diff
@@ -13,7 +13,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-RAG_CONTENT_IMAGE ?= quay.io/redhat-ai-dev/rag-content:release-1.8-lcs
+RAG_CONTENT_IMAGE ?= quay.io/redhat-ai-dev/rag-content:experimental-release-1.8-lcs
 VENV := $(CURDIR)/scripts/python-scripts/.venv
 PYTHON := $(VENV)/bin/python3
 PIP := $(VENV)/bin/pip3
@@ -36,9 +36,8 @@ help: ## Show this help screen
 	awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-33s\033[0m %s\n", $$1, $$2}'
 	@echo ''
 
-# TODO (Jdubrick): Replace reference to lightspeed-core/lightspeed-providers once bug is addressed.
 update-question-validation:
-	curl -o ./config/providers.d/inline/safety/lightspeed_question_validity.yaml https://raw.githubusercontent.com/Jdubrick/lightspeed-providers/refs/heads/devai/resources/external_providers/inline/safety/lightspeed_question_validity.yaml
+	curl -o ./config/providers.d/inline/safety/lightspeed_question_validity.yaml https://raw.githubusercontent.com/lightspeed-core/lightspeed-providers/refs/tags/0.1.17/resources/external_providers/inline/safety/lightspeed_question_validity.yaml
 
 $(VENV)/bin/activate: ./scripts/python-scripts/requirements.txt
 	python3 -m venv $(VENV)
```
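For context, the retargeted curl URL above is what `make update-question-validation` fetches: the question-validity shield config, now pinned to the upstream 0.1.17 tag instead of a personal fork. A typical local refresh (targets taken from this Makefile and the README) might look like:

```sh
# Re-sync the question-validity shield config from the pinned upstream tag...
make update-question-validation
# ...then pull the RAG content and embeddings referenced by RAG_CONTENT_IMAGE.
make get-rag
```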

README.md

Lines changed: 55 additions & 34 deletions
````diff
@@ -1,46 +1,48 @@
 # Redhat-AI-Dev Llama Stack
 
 [![Apache2.0 License](https://img.shields.io/badge/license-Apache2.0-brightgreen.svg)](LICENSE)
+[![Llama Stack Version](https://img.shields.io/badge/llama_stack-v0.3.5-blue)](https://llamastack.github.io/docs/v0.3.5)
+[![Python Version](https://img.shields.io/badge/python-3.12-blue)](https://www.python.org/downloads/release/python-3120/)
 
 - [Image Availability](#image-availability)
+  - [Latest Stable Release](#latest-stable-release)
+  - [Latest Developer Release](#latest-developer-release)
 - [Usage](#usage)
   - [Available Inferences](#available-inferences)
     - [vLLM](#vllm)
     - [Ollama](#ollama)
     - [OpenAI](#openai)
+    - [Vertex AI (Gemini)](#vertex-ai-gemini)
   - [Configuring RAG](#configuring-rag)
-  - [Configuring Question Validation](#configuring-question-validation)
-  - [Running Locally](#running-locally)
-  - [Running on a Cluster](#running-on-a-cluster)
+  - [Configuring Safety Guards](#configuring-safety-guards)
+- [Running Locally](#running-locally)
+- [Running on a Cluster](#running-on-a-cluster)
 - [Makefile Commands](#makefile-commands)
 - [Contributing](#contributing)
+  - [Local Development Requirements](#local-development-requirements)
+  - [Updating YAML Files](#updating-yaml-files)
 - [Troubleshooting](#troubleshooting)
 
-## Image Availability
+# Image Availability
 
-### Latest Stable Release
+## Latest Stable Release
 
 ```
-quay.io/redhat-ai-dev/llama-stack:0.1.2
+quay.io/redhat-ai-dev/llama-stack:0.1.3
 ```
 
-### Latest Developer Release
+## Latest Developer Release
 
 ```
 quay.io/redhat-ai-dev/llama-stack:latest
 ```
 
-## Usage
+# Usage
 
 > [!IMPORTANT]
 > The default Llama Stack configuration file that is baked into the built image contains tools. Ensure your provided inference server has tool calling **enabled**.
 
-**Note:** You can enable `DEBUG` logging by setting:
-```
-LLAMA_STACK_LOGGING=all=DEBUG
-```
-
-### Available Inferences
+## Available Inferences
 
 Each inference has its own set of environment variables. You can include all of these variables in a `.env` file and pass that instead to your container. See [default-values.env](./env/default-values.env) for a template. It is recommended you copy that file to `values.env` to avoid committing it to Git.
 
````
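As a sketch of what that `values.env` might hold — the provider choice and placeholder key below are illustrative, not part of this diff — and keeping in mind the README's unquoted-values warning in the next hunk:

```sh
# Illustrative values.env sketch; values must be unquoted.
# Ollama inference, with llama-stack running in a container on the same host:
OLLAMA_URL=http://host.containers.internal:11434
# Or, for the OpenAI provider (placeholder key shown):
# OPENAI_API_KEY=sk-example-placeholder
```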
````diff
@@ -51,7 +53,7 @@ Each inference has its own set of environment variables. You can include all of
 >
 > VLLM_API_KEY="token" ❌
 
-#### vLLM
+### vLLM
 
 **Required**
 ```env
@@ -65,7 +67,7 @@ VLLM_MAX_TOKENS=<defaults to 4096>
 VLLM_TLS_VERIFY=<defaults to true>
 ```
 
-#### Ollama
+### Ollama
 
 **Required**
 ```env
@@ -77,7 +79,7 @@ The value of `OLLAMA_URL` is the default `http://localhost:11434`, when you are
 
 The value of `OLLAMA_URL` is `http://host.containers.internal:11434` if you are running llama-stack inside a container, i.e. if you run llama-stack with the podman run command above, it needs to access the Ollama endpoint on your laptop, not inside the container. **If you are using Linux**, ensure your firewall allows port 11434 to your podman container's network; some Linux distributions' firewalls block all traffic by default. Alternatively you can use `OLLAMA_URL=http://localhost:11434` and set the `--network host` flag when you run your podman container.
 
-#### OpenAI
+### OpenAI
 
 **Required**
 ```env
@@ -87,7 +89,7 @@ OPENAI_API_KEY=<your-api-key>
 
 To get your API Key, go to [platform.openai.com](https://platform.openai.com/settings/organization/api-keys).
 
-#### Vertex AI (Gemini)
+### Vertex AI (Gemini)
 
 **Required**
 ```env
@@ -99,7 +101,7 @@ GOOGLE_APPLICATION_CREDENTIALS=
 
 For information about these variables see: https://llamastack.github.io/v0.2.18/providers/inference/remote_vertexai.html.
 
-### Configuring RAG
+## Configuring RAG
 
 The `run.yaml` file that is included in the container image has a RAG tool enabled. In order for this tool to have the necessary reference content, you need to run:
 
````
````diff
@@ -109,25 +111,38 @@ make get-rag
 
 This will fetch the necessary reference content and add it to your local project directory.
 
-### Configuring Question Validation
+## Configuring Safety Guards
+
+> [!IMPORTANT]
+> If you want to omit the safety guards for development purposes, you can use [run-no-guard.yaml](./run-no-guard.yaml) instead.
+
+In the main [run.yaml](./run.yaml) file, Llama Guard is enabled by default. In order to avoid issues during startup you will need to ensure you have an instance of Llama Guard running.
 
-By default this Llama Stack has a Safety Shield for question validation enabled. You will need to set the following environment variables to ensure functionality:
+You can do so by running the following to start an Ollama container with Llama Guard:
 
-- `VALIDATION_PROVIDER`: The provider you want to use for question validation. This should match the provider value you are using under `inference`, such as `vllm`, `ollama`, `openai`. Defaults to `vllm`
-- `VALIDATION_MODEL_NAME`: The name of the LLM you want to use for question validation
+```sh
+podman run -d --name ollama -p 11434:11434 docker.io/ollama/ollama:latest
+podman exec ollama ollama pull llama-guard3:8b
+```
+**Note:** Ensure the Ollama container is started and the model is ready before trying to query if deploying the containers manually.
 
-### Running Locally
+You will need to set the following environment variables to ensure functionality:
+- `SAFETY_MODEL`: The name of the Llama Guard model being used. Defaults to `llama-guard3:8b`
+- `SAFETY_URL`: The URL where the container is available. Defaults to `http://host.docker.internal:11434/v1`
+- `SAFETY_API_KEY`: The API key required for access to the safety model. Not required for local.
+
+# Running Locally
 
 ```
-podman run -it -p 8321:8321 --env-file ./env/values.env -v ./embeddings_model:/app-root/embeddings_model:Z -v ./vector_db/rhdh_product_docs:/app-root/vector_db/rhdh_product_docs:Z quay.io/redhat-ai-dev/llama-stack:latest
+podman run -it -p 8321:8321 --env-file ./env/values.env -v ./embeddings_model:/rag-content/embeddings_model:Z -v ./vector_db/rhdh_product_docs:/rag-content/vector_db/rhdh_product_docs:Z quay.io/redhat-ai-dev/llama-stack:latest
 ```
 
 Or if using the host network:
 ```
-podman run -it -p 8321:8321 --env-file ./env/values.env --network host -v ./embeddings_model:/app-root/embeddings_model:Z -v ./vector_db/rhdh_product_docs:/app-root/vector_db/rhdh_product_docs:Z quay.io/redhat-ai-dev/llama-stack:latest
+podman run -it -p 8321:8321 --env-file ./env/values.env --network host -v ./embeddings_model:/rag-content/embeddings_model:Z -v ./vector_db/rhdh_product_docs:/rag-content/vector_db/rhdh_product_docs:Z quay.io/redhat-ai-dev/llama-stack:latest
 ```
 
-Latest Lightspeed Core developer image:
+Latest Lightspeed Core Developer Image:
 ```
 quay.io/lightspeed-core/lightspeed-stack:dev-latest
 ```
````
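Tying the new safety-guard variables back to the env file, a plausible addition to `values.env` would be the following; the values are the defaults named in the hunk above, so adjust the URL for your own container setup:

```sh
# Safety guard settings for the default run.yaml (Llama Guard served via Ollama).
SAFETY_MODEL=llama-guard3:8b
SAFETY_URL=http://host.docker.internal:11434/v1
# SAFETY_API_KEY is only needed when the safety endpoint enforces auth;
# the README notes it is not required for a local setup.
```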
```diff
@@ -139,7 +154,7 @@ podman run -it -p 8080:8080 -v ./lightspeed-stack.yaml:/app-root/lightspeed-stac
 
 **Note:** If you have built your own version of Lightspeed Core you can replace the image referenced with your own build. Additionally, you can use the Llama Stack container along with the `lightspeed-stack.yaml` file to run Lightspeed Core locally with `uv` from their [repository](https://github.com/lightspeed-core/lightspeed-stack).
 
-### Running on a Cluster
+# Running on a Cluster
 
 To deploy on a cluster see [DEPLOYMENT.md](./docs/DEPLOYMENT.md).
 
```
````diff
@@ -149,17 +164,17 @@ To deploy on a cluster see [DEPLOYMENT.md](./docs/DEPLOYMENT.md).
 | ---- | ----|
 | **get-rag** | Gets the RAG data and the embeddings model from the rag-content image registry to your local project directory |
 | **update-question-validation** | Updates the question validation content in `providers.d` |
-| **validate-prompt-templates** | Validates prompt values in run.yaml. **Requires Python >= 3.11** |
-| **update-prompt-templates** | Updates the prompt values in run.yaml. **Requires Python >= 3.11** |
+| **validate-prompt-templates** | Validates prompt values in run.yaml. |
+| **update-prompt-templates** | Updates the prompt values in run.yaml. |
 
-## Contributing
+# Contributing
 
-### Local Development Requirements
+## Local Development Requirements
 
 - [Yarn](https://yarnpkg.com/)
 - [Node.js >= v22](https://nodejs.org/en/about/previous-releases)
 
-### Updating YAML Files
+## Updating YAML Files
 
 This repository implements Prettier to handle all YAML formatting.
 ```sh
````
````diff
@@ -169,7 +184,13 @@ yarn verify # Runs Prettier to check the YAML files in this repository
 
 If you wish to try new changes with Llama Stack, you can build your own image using the `Containerfile` in the root of this repository.
 
-## Troubleshooting
+# Troubleshooting
+
+> [!NOTE]
+> You can enable `DEBUG` logging by setting:
+> ```
+> LLAMA_STACK_LOGGING=all=DEBUG
+> ```
 
 If you experience an error related to permissions for the `vector_db`, such as:
 
````
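Combining the relocated `DEBUG` note with the local run command from earlier in this README, a debug-enabled invocation might look like the sketch below; the `-e` flag is podman's standard way to inject a single variable, and everything else mirrors the documented command:

```sh
# Local run with DEBUG logging enabled for all llama-stack components.
podman run -it -p 8321:8321 --env-file ./env/values.env \
  -e LLAMA_STACK_LOGGING=all=DEBUG \
  -v ./embeddings_model:/rag-content/embeddings_model:Z \
  -v ./vector_db/rhdh_product_docs:/rag-content/vector_db/rhdh_product_docs:Z \
  quay.io/redhat-ai-dev/llama-stack:latest
```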