Commit d8d0be7

feat(helm): add optional PostgreSQL backing store with Secret-based credentials
Add a postgres.enabled toggle that, together with postgres.mode, supports three backing stores: SQLite (default), bundled Bitnami PostgreSQL (internal), and external PostgreSQL. When PostgreSQL is enabled, the connection URL is stored in a Kubernetes Secret and injected through the OPENSHELL_DB_URL env var, which keeps the password out of CLI args, pod specs, and process listings. Usernames and passwords are URL-encoded with urlquery, and required guards fail rendering when the password or the external host is missing.
1 parent 9e5aee4 commit d8d0be7

13 files changed

Lines changed: 470 additions & 2 deletions

Lines changed: 215 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,215 @@
1+
---
2+
name: deploy-openshell-cluster
3+
description: Deploy OpenShell gateway with Helm on Kubernetes or OpenShift, auto-detect cluster type, apply OpenShift-only SCC/security overrides when needed, and configure SQLite or PostgreSQL persistence. Use when the user asks to deploy OpenShell to a cluster, reinstall the Helm release, or enable postgres.enabled with internal or external mode.
4+
---
5+
6+
# Deploy OpenShell Cluster
7+
8+
Use `deploy/helm/openshell/README.md` as the source of truth, then apply this workflow.
9+
10+
Default behavior:
11+
12+
- SQLite by default (`postgres.enabled=false`)
13+
- Optional PostgreSQL (`postgres.enabled=true`) via:
14+
- `postgres.mode=internal` (deploy bundled Postgres dependency)
15+
- `postgres.mode=external` (use external database settings)
16+
17+
## Inputs
18+
19+
```bash
20+
NAMESPACE="${NAMESPACE:-openshell}"
21+
RELEASE_NAME="${RELEASE_NAME:-openshell}"
22+
CHART_REF="${CHART_REF:-oci://ghcr.io/nvidia/openshell/helm-chart}"
23+
CHART_VERSION="${CHART_VERSION:-}"
24+
GATEWAY_TAG="${GATEWAY_TAG:-}" # e.g. dev or fa84e437...
25+
POSTGRES_ENABLED="${POSTGRES_ENABLED:-false}" # true|false
26+
POSTGRES_MODE="${POSTGRES_MODE:-internal}" # internal|external
27+
POSTGRES_DB="${POSTGRES_DB:-openshell}"
28+
POSTGRES_USER="${POSTGRES_USER:-openshell}"
29+
POSTGRES_PASSWORD="${POSTGRES_PASSWORD:-}" # required when postgres is enabled
30+
POSTGRES_HOST="${POSTGRES_HOST:-}" # required for external mode
31+
POSTGRES_PORT="${POSTGRES_PORT:-5432}"
32+
```
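
The inline comments above mark `POSTGRES_PASSWORD` and `POSTGRES_HOST` as conditionally required, but nothing in the skill enforces that before Helm runs. A minimal sketch of such a pre-flight check follows. The function name `validate_postgres_inputs` is hypothetical and not part of this commit, and the chart's own `required` guards remain the authoritative check.

```bash
# Hypothetical helper (not in the commit): validate the POSTGRES_* inputs
# before any Helm call. Returns non-zero and prints a reason on failure.
validate_postgres_inputs() {
  enabled="${POSTGRES_ENABLED:-false}"
  mode="${POSTGRES_MODE:-internal}"

  case "${enabled}" in
    true|false) ;;
    *) echo "POSTGRES_ENABLED must be true or false" >&2; return 1 ;;
  esac

  # SQLite path: none of the remaining inputs are used.
  [ "${enabled}" = "true" ] || return 0

  case "${mode}" in
    internal|external) ;;
    *) echo "POSTGRES_MODE must be internal or external" >&2; return 1 ;;
  esac

  if [ -z "${POSTGRES_PASSWORD:-}" ]; then
    echo "POSTGRES_PASSWORD is required when PostgreSQL is enabled" >&2
    return 1
  fi

  if [ "${mode}" = "external" ] && [ -z "${POSTGRES_HOST:-}" ]; then
    echo "POSTGRES_HOST is required for external mode" >&2
    return 1
  fi
}
```

Calling `validate_postgres_inputs || exit 1` before Step 7 would surface a missing value earlier than the template error raised by `helm upgrade`.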
33+
34+
## Step 1: Verify cluster login
35+
36+
Before any deployment action, confirm the user is authenticated to a cluster.
37+
38+
```bash
39+
if ! kubectl auth can-i get pods >/dev/null 2>&1; then
40+
echo "Not authenticated to a Kubernetes/OpenShift cluster."
41+
echo "Please log in first (for OpenShift: oc login <api-server>), then retry."
42+
exit 1
43+
fi
44+
```
45+
46+
If the check fails, stop and ask the user to log in before continuing.
47+
48+
## Step 2: Choose namespace (with upgrade prompt)
49+
50+
Namespace selection rules:
51+
52+
1. If user explicitly provides a namespace, use it.
53+
2. If user does not provide a namespace, default to `openshell`.
54+
3. If `openshell` already has a running gateway and user did not explicitly ask for upgrade, ask:
55+
- upgrade existing deployment in `openshell`, or
56+
- deploy fresh into a new namespace.
57+
58+
Detect existing gateway in `openshell`:
59+
60+
```bash
61+
EXISTING_IN_OPENSHELL=false
62+
if helm status openshell -n openshell >/dev/null 2>&1; then
63+
EXISTING_IN_OPENSHELL=true
64+
elif kubectl get statefulset openshell -n openshell >/dev/null 2>&1; then
65+
EXISTING_IN_OPENSHELL=true
66+
fi
67+
```
68+
69+
When `EXISTING_IN_OPENSHELL=true` and namespace was not explicitly specified, stop and ask the user for a choice before proceeding.
70+
71+
## Step 3: Select gateway/chart version
72+
73+
If user explicitly provides `GATEWAY_TAG`, use it.
74+
75+
If user explicitly provides `CHART_VERSION`, use it as-is.
76+
77+
If neither `GATEWAY_TAG` nor `CHART_VERSION` is provided:
78+
79+
1. Fetch recent gateway tags from [GHCR package page](https://github.com/nvidia/OpenShell/pkgs/container/openshell%2Fgateway) (or equivalent API/CLI output).
80+
2. Ask the user which tag to deploy.
81+
3. Convert chosen gateway tag to Helm chart dev format:
82+
- gateway tag `dev` -> `CHART_VERSION=0.0.0-dev`
83+
- gateway tag `<tag>` (commit-like or custom) -> `CHART_VERSION=0.0.0-<tag>`
84+
85+
Example prompt to user:
86+
87+
- "I found recent gateway tags: `dev`, `fa84e437...`, `3460e5fd...`. Which one should I deploy?"
88+
89+
If user does not choose, default to:
90+
91+
```bash
92+
GATEWAY_TAG="dev"
93+
CHART_VERSION="0.0.0-dev"
94+
```
95+
96+
If `GATEWAY_TAG` is provided and `CHART_VERSION` is empty:
97+
98+
```bash
99+
CHART_VERSION="0.0.0-${GATEWAY_TAG}"
100+
```
101+
102+
If `CHART_VERSION` is provided and `GATEWAY_TAG` is empty, derive `GATEWAY_TAG` when possible:
103+
104+
```bash
105+
case "${CHART_VERSION}" in
106+
0.0.0-*) GATEWAY_TAG="${CHART_VERSION#0.0.0-}" ;;
107+
*) GATEWAY_TAG="dev" ;; # fallback
108+
esac
109+
```
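
The tag and chart-version rules in this step can be collected into one place. The sketch below is illustrative only: the helper names `chart_version_for_tag`, `gateway_tag_for_chart_version`, and `resolve_versions` are hypothetical, and the `0.0.0-<tag>` convention and the `dev` fallback are taken from the text above rather than from any chart tooling.

```bash
# Hypothetical helpers (not in the commit) mirroring the mapping rules above.

# gateway tag <tag> -> chart version 0.0.0-<tag>
chart_version_for_tag() {
  printf '0.0.0-%s\n' "$1"
}

# chart version 0.0.0-<tag> -> gateway tag <tag>; anything else falls back
# to "dev", the same fallback the skill uses.
gateway_tag_for_chart_version() {
  case "$1" in
    0.0.0-*) printf '%s\n' "${1#0.0.0-}" ;;
    *)       printf 'dev\n' ;;
  esac
}

# Fill in whichever of GATEWAY_TAG / CHART_VERSION is empty.
resolve_versions() {
  if [ -z "${GATEWAY_TAG:-}" ] && [ -z "${CHART_VERSION:-}" ]; then
    GATEWAY_TAG="dev"
  fi
  if [ -z "${CHART_VERSION:-}" ]; then
    CHART_VERSION="$(chart_version_for_tag "${GATEWAY_TAG}")"
  elif [ -z "${GATEWAY_TAG:-}" ]; then
    GATEWAY_TAG="$(gateway_tag_for_chart_version "${CHART_VERSION}")"
  fi
}
```

When both values are supplied, `resolve_versions` leaves them untouched, which matches the "use it as-is" rule for explicit inputs.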
110+
111+
## Step 4: Detect cluster type
112+
113+
```bash
114+
CLUSTER_TYPE="kubernetes"
115+
if kubectl get clusterversion version >/dev/null 2>&1; then
116+
CLUSTER_TYPE="openshift"
117+
fi
118+
echo "Detected cluster type: ${CLUSTER_TYPE}"
119+
```
120+
121+
## Step 5: Install shared prerequisites
122+
123+
```bash
124+
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/latest/download/manifest.yaml
125+
kubectl get namespace "${NAMESPACE}" >/dev/null 2>&1 || kubectl create namespace "${NAMESPACE}"
126+
```
127+
128+
## Step 6: Apply OpenShift-only prerequisites
129+
130+
Run only when `CLUSTER_TYPE=openshift`.
131+
132+
```bash
133+
if [ "${CLUSTER_TYPE}" = "openshift" ]; then
134+
oc adm policy add-scc-to-user privileged -z openshell-sandbox -n "${NAMESPACE}"
135+
136+
# The PKI init job is disabled on OpenShift (SCC constraints), but the
137+
# gateway still needs JWT signing keys for per-sandbox authentication.
138+
# Generate them manually if the Secret does not already exist.
139+
JWT_SECRET="${RELEASE_NAME}-jwt-keys"
140+
if ! kubectl get secret "${JWT_SECRET}" -n "${NAMESPACE}" >/dev/null 2>&1; then
141+
KEY_DIR=$(mktemp -d)
142+
openssl genpkey -algorithm Ed25519 -out "${KEY_DIR}/signing.pem"
143+
openssl pkey -in "${KEY_DIR}/signing.pem" -pubout -out "${KEY_DIR}/public.pem"
144+
openssl rand -hex 16 > "${KEY_DIR}/kid"
145+
kubectl create secret generic "${JWT_SECRET}" -n "${NAMESPACE}" \
146+
--from-file=signing.pem="${KEY_DIR}/signing.pem" \
147+
--from-file=public.pem="${KEY_DIR}/public.pem" \
148+
--from-file=kid="${KEY_DIR}/kid"
149+
rm -rf "${KEY_DIR}"
150+
echo "Created JWT signing secret ${JWT_SECRET}"
151+
else
152+
echo "JWT signing secret ${JWT_SECRET} already exists"
153+
fi
154+
fi
155+
```
156+
157+
## Step 7: Deploy Helm release
158+
159+
```bash
160+
HELM_ARGS=(
161+
upgrade --install "${RELEASE_NAME}" "${CHART_REF}"
162+
--version "${CHART_VERSION}"
163+
--namespace "${NAMESPACE}"
164+
--set "image.tag=${GATEWAY_TAG}"
165+
--set "supervisor.image.tag=${GATEWAY_TAG}"
166+
--set "postgres.enabled=${POSTGRES_ENABLED}"
167+
--wait
168+
)
169+
170+
if [ "${POSTGRES_ENABLED}" = "true" ]; then
171+
HELM_ARGS+=(--set "postgres.mode=${POSTGRES_MODE}")
172+
if [ "${POSTGRES_MODE}" = "external" ]; then
173+
HELM_ARGS+=(
174+
--set "postgres.external.host=${POSTGRES_HOST}"
175+
--set "postgres.external.port=${POSTGRES_PORT}"
176+
--set "postgres.external.username=${POSTGRES_USER}"
177+
--set "postgres.external.password=${POSTGRES_PASSWORD}"
178+
--set "postgres.external.database=${POSTGRES_DB}"
179+
)
180+
else
181+
HELM_ARGS+=(
182+
--set "postgres.auth.username=${POSTGRES_USER}"
183+
--set "postgres.auth.password=${POSTGRES_PASSWORD}"
184+
--set "postgres.auth.database=${POSTGRES_DB}"
185+
)
186+
fi
187+
fi
188+
189+
if [ "${CLUSTER_TYPE}" = "openshift" ]; then
190+
HELM_ARGS+=(
191+
--set pkiInitJob.enabled=false
192+
--set server.disableTls=true
193+
--set podSecurityContext.fsGroup=null
194+
--set securityContext.runAsUser=null
195+
)
196+
fi
197+
198+
helm "${HELM_ARGS[@]}"
199+
```
200+
201+
This keeps Kubernetes installs aligned with the README default `helm install` path and applies OpenShift-specific overrides only on OpenShift.
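
The script above passes the database password with `--set`, which places it on the `helm` command line and lets Helm interpret any commas in the value as separators. One possible alternative, sketched here under the assumption that the chart accepts the same value keys through Helm's standard `--set-file` flag, is to hand the password over in a private temporary file. The helper name `write_password_file` is hypothetical and not part of this commit.

```bash
# Hypothetical helper (not in the commit): write a secret to a private temp
# file without a trailing newline and print the file's path.
write_password_file() {
  # umask 077 applies only inside the command substitution's subshell.
  pw_file="$(umask 077 && mktemp)" || return 1
  printf '%s' "$1" > "${pw_file}"
  printf '%s\n' "${pw_file}"
}

# Possible use inside Step 7 (internal mode shown; external mode would use
# postgres.external.password instead):
#
#   PW_FILE="$(write_password_file "${POSTGRES_PASSWORD}")"
#   trap 'rm -f "${PW_FILE}"' EXIT
#   HELM_ARGS+=(--set-file "postgres.auth.password=${PW_FILE}")
```

This only changes how the value reaches Helm. The rendered values are still stored with the release, so access to the release namespace should be restricted either way.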
202+
203+
## Step 8: Verify deployment
204+
205+
```bash
206+
kubectl get pods -n "${NAMESPACE}"
207+
kubectl rollout status statefulset/"${RELEASE_NAME}" -n "${NAMESPACE}"
208+
helm get values "${RELEASE_NAME}" -n "${NAMESPACE}"
209+
```
210+
211+
Check persistence mode:
212+
213+
- SQLite default: `postgres.enabled=false`
214+
- Internal Postgres: `postgres.enabled=true`, `postgres.mode=internal`
215+
- External Postgres: `postgres.enabled=true`, `postgres.mode=external`
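
The three cases in this list reduce to a small decision on two values. A minimal sketch, assuming the same `internal` default for an unset `postgres.mode` that the chart helper applies; the function name `persistence_mode` and its output labels are hypothetical.

```bash
# Hypothetical helper (not in the commit): name the persistence mode implied
# by postgres.enabled ($1) and postgres.mode ($2, may be empty).
persistence_mode() {
  if [ "${1:-false}" != "true" ]; then
    echo "sqlite"
  elif [ "${2:-internal}" = "external" ]; then
    echo "external-postgres"
  else
    echo "internal-postgres"
  fi
}

# Possible use, assuming jq is installed:
#
#   values="$(helm get values "${RELEASE_NAME}" -n "${NAMESPACE}" --all -o json)"
#   persistence_mode \
#     "$(printf '%s' "${values}" | jq -r '.postgres.enabled')" \
#     "$(printf '%s' "${values}" | jq -r '.postgres.mode // empty')"
```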

.gitignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -187,6 +187,9 @@ rootfs/
187187
# Docker build artifacts (image tarballs, packaged helm charts)
188188
deploy/docker/.build/
189189

190+
# Helm subchart tarballs (regenerated by `helm dependency build`)
191+
deploy/helm/openshell/charts/
192+
190193
# SBOM generated output (JSON, CSV) — release artifacts, not committed
191194
deploy/sbom/output/
192195

CONTRIBUTING.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -78,6 +78,7 @@ Skills live in `.agents/skills/`. Your agent's harness can discover and load the
7878
| Reviewing | `test-release-canary` | Dispatch and iterate on the Release Canary workflow that smoke-tests published artifacts |
7979
| Triage | `triage-issue` | Assess, classify, and route community-filed issues |
8080
| Platform | `generate-sandbox-policy` | Generate YAML sandbox policies from requirements or API docs |
81+
| Platform | `deploy-openshell-cluster` | Deploy OpenShell gateway on Kubernetes or OpenShift with optional PostgreSQL settings |
8182
| Platform | `tui-development` | Development guide for the ratatui-based terminal UI |
8283
| Documentation | `update-docs` | Scan recent commits and draft doc updates for user-facing changes |
8384
| Maintenance | `sync-agent-infra` | Detect and fix drift across agent-first infrastructure files |

deploy/helm/openshell/Chart.lock

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
dependencies:
2+
- name: postgresql
3+
repository: oci://registry-1.docker.io/bitnamicharts
4+
version: 18.6.7
5+
digest: sha256:e4df764483edb0695ac56dd4e27eb3a225a9c0b0ef52a8b60e3e0b51e36153ab
6+
generated: "2026-05-26T16:28:10.636508-04:00"

deploy/helm/openshell/Chart.yaml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,3 +11,9 @@ type: application
1111
# empty), so a released chart automatically pulls the matching gateway and supervisor images.
1212
version: 0.0.0
1313
appVersion: "0.0.0"
14+
dependencies:
15+
- name: postgresql
16+
version: 18.6.7
17+
repository: oci://registry-1.docker.io/bitnamicharts
18+
condition: postgres.enabled
19+
alias: postgres

deploy/helm/openshell/README.md

Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,47 @@ See [`values.yaml`](values.yaml) for source defaults. Selected overlays:
5858
- [`ci/values-cert-manager.yaml`](ci/values-cert-manager.yaml) - cert-manager integration
5959
- [`ci/values-keycloak.yaml`](ci/values-keycloak.yaml) - Keycloak OIDC integration
6060

61+
### Database backend
62+
63+
By default, OpenShell uses SQLite:
64+
65+
```yaml
66+
server:
67+
dbUrl: "sqlite:/var/openshell/openshell.db"
68+
postgres:
69+
enabled: false
70+
```
71+
72+
Enable bundled PostgreSQL:
73+
74+
```bash
75+
helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart --version <version> \
76+
--set postgres.enabled=true \
77+
--set postgres.auth.password=my-secret-password
78+
```
79+
80+
Use external PostgreSQL:
81+
82+
```bash
83+
helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart --version <version> \
84+
--set postgres.enabled=true \
85+
--set postgres.mode=external \
86+
--set postgres.external.host=my-postgres.example.com \
87+
--set postgres.external.port=5432 \
88+
--set postgres.external.database=openshell \
89+
--set postgres.external.username=openshell \
90+
--set postgres.external.password=my-password
91+
```
92+
93+
Or provide a full connection URL directly:
94+
95+
```bash
96+
helm install openshell oci://ghcr.io/nvidia/openshell/helm-chart --version <version> \
97+
--set postgres.enabled=true \
98+
--set postgres.mode=external \
99+
--set postgres.external.url="postgres://user:pass@host:5432/db?sslmode=require"
100+
```
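
A URL supplied through `postgres.external.url` is used verbatim, so any reserved characters in the username or password have to be percent-encoded by whoever writes it. The sketch below shows one way to build such a URL in bash. The helper names `urlencode` and `build_postgres_url` are hypothetical, and the encoder assumes ASCII input. It encodes a space as `%20`, whereas the Go template function `urlquery` used by the chart's composed-URL path encodes a space as `+`, which URL parsers may not decode back to a space inside the credentials.

```bash
# Hypothetical helpers (not in the commit). Requires bash; ASCII input assumed.

# Percent-encode everything except RFC 3986 unreserved characters.
urlencode() {
  local s="$1" out="" c i
  local LC_ALL=C
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) out+="$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out"
}

# Arguments: user password host port database
build_postgres_url() {
  printf 'postgres://%s:%s@%s:%s/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4" "$5"
}
```

For example, `build_postgres_url openshell 'p@ss w/rd' db.example.com 5432 openshell` yields a URL whose password segment is `p%40ss%20w%2Frd`.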
101+
61102
## PKI bootstrap
62103

63104
By default, a pre-install/pre-upgrade hook Job runs `openshell-gateway generate-certs`

deploy/helm/openshell/templates/_helpers.tpl

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -102,6 +102,32 @@ Namespace where sandbox pods are created. An explicit
102102
{{- .Values.server.sandboxNamespace | default .Release.Namespace -}}
103103
{{- end }}
104104

105+
{{/*
106+
Gateway database URL.
107+
- postgres.enabled=false: use .Values.server.dbUrl (default sqlite)
108+
- postgres.enabled=true + mode=internal: derive URL from bundled postgres service
109+
- postgres.enabled=true + mode=external: use external.url or compose external fields
110+
*/}}
111+
{{- define "openshell.dbUrl" -}}
112+
{{- if .Values.postgres.enabled -}}
113+
{{- if eq (default "internal" .Values.postgres.mode) "external" -}}
114+
{{- if .Values.postgres.external.url -}}
115+
{{- .Values.postgres.external.url -}}
116+
{{- else -}}
117+
{{- $host := required "postgres.external.host is required when postgres.mode=external and no postgres.external.url is provided" .Values.postgres.external.host -}}
118+
{{- $pw := required "postgres.external.password is required when postgres.mode=external and no postgres.external.url is provided" .Values.postgres.external.password -}}
119+
{{- printf "postgres://%s:%s@%s:%d/%s" (.Values.postgres.external.username | urlquery) ($pw | urlquery) $host (int (default 5432 .Values.postgres.external.port)) .Values.postgres.external.database -}}
120+
{{- end -}}
121+
{{- else -}}
122+
{{- $pw := required "postgres.auth.password must be set when postgres.enabled=true" .Values.postgres.auth.password -}}
123+
{{- $host := .Values.postgres.host | default (printf "%s-postgres.%s.svc.cluster.local" .Release.Name .Release.Namespace) -}}
124+
{{- printf "postgres://%s:%s@%s:%d/%s" (.Values.postgres.auth.username | urlquery) ($pw | urlquery) $host (int .Values.postgres.port) .Values.postgres.auth.database -}}
125+
{{- end -}}
126+
{{- else -}}
127+
{{- .Values.server.dbUrl -}}
128+
{{- end -}}
129+
{{- end }}
130+
105131
{{/*
106132
gRPC endpoint sandbox pods use to call back into the gateway. An explicit
107133
.Values.server.grpcEndpoint is used verbatim. Otherwise it is derived from

deploy/helm/openshell/templates/db-secret.yaml

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
# SPDX-License-Identifier: Apache-2.0
3+
4+
{{- if .Values.postgres.enabled }}
5+
apiVersion: v1
6+
kind: Secret
7+
metadata:
8+
name: {{ include "openshell.fullname" . }}-db
9+
labels:
10+
{{- include "openshell.labels" . | nindent 4 }}
11+
type: Opaque
12+
stringData:
13+
db-url: {{ include "openshell.dbUrl" . | quote }}
14+
{{- end }}

deploy/helm/openshell/templates/gateway-config.yaml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,8 @@ at startup. CLI flags and OPENSHELL_* env vars on the StatefulSet container
88
still override anything in this file.
99

1010
One value is intentionally NOT rendered here:
11-
- server.dbUrl → passed via --db-url in the StatefulSet args
11+
- server.dbUrl → passed via OPENSHELL_DB_URL env var (from Secret)
12+
when postgres.enabled=true, or --db-url arg for SQLite
1213
*/}}
1314
apiVersion: v1
1415
kind: ConfigMap

deploy/helm/openshell/templates/statefulset.yaml

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,9 @@ spec:
2121
# without this annotation a `helm upgrade` that only mutates the
2222
# ConfigMap would leave pods running with stale config.
2323
checksum/gateway-config: {{ include (print $.Template.BasePath "/gateway-config.yaml") . | sha256sum }}
24+
{{- if .Values.postgres.enabled }}
25+
checksum/db-secret: {{ include (print $.Template.BasePath "/db-secret.yaml") . | sha256sum }}
26+
{{- end }}
2427
{{- with .Values.podAnnotations }}
2528
{{- toYaml . | nindent 8 }}
2629
{{- end }}
@@ -54,9 +57,18 @@ spec:
5457
args:
5558
- --config
5659
- /etc/openshell/gateway.toml
60+
{{- if not .Values.postgres.enabled }}
5761
- --db-url
5862
- {{ .Values.server.dbUrl | quote }}
63+
{{- end }}
5964
env:
65+
{{- if .Values.postgres.enabled }}
66+
- name: OPENSHELL_DB_URL
67+
valueFrom:
68+
secretKeyRef:
69+
name: {{ include "openshell.fullname" . }}-db
70+
key: db-url
71+
{{- end }}
6072
# All gateway settings live in the ConfigMap-backed TOML file
6173
# mounted at /etc/openshell/gateway.toml. The only env var below
6274
# is a process-level setting consumed by libraries outside
@@ -137,7 +149,7 @@ spec:
137149
- name: sandbox-jwt
138150
secret:
139151
secretName: {{ .Values.server.sandboxJwt.signingSecretName | default (printf "%s-jwt-keys" (include "openshell.fullname" .)) }}
140-
defaultMode: 0400
152+
defaultMode: 0444
141153
{{- if not .Values.server.disableTls }}
142154
- name: tls-cert
143155
secret:
