feat(storagebox): add Gateway API routing and replace MinIO with Garage #111
adamancini wants to merge 4 commits into main
Replace ingress-nginx with Envoy Gateway as the Gateway API controller, installed as an EC extension via OCI chart. Each application gets its own Gateway resource with an independent Envoy proxy instance:

- Garage S3: HTTP Gateway + HTTPRoute (port 3900)
- PostgreSQL: TCP Gateway + TCPRoute (port 5432)
- Cassandra: TCP Gateway + TCPRoute (port 9042)
- rqlite: HTTP Gateway + HTTPRoute (port 4001)
- NFS: stays on NodePort (Gateway API does not support UDP)

Replace MinIO operator + Tenant subchart with Garage v1.3.1, a lightweight S3-compatible object store that runs as a single StatefulSet with no operator dependency. A post-install/post-upgrade Helm hook Job handles cluster layout assignment, bucket creation, and S3 credential provisioning via the Garage admin API. An init container copies secrets to an emptyDir with mode 0600 to satisfy Garage's strict file permission requirements.

Also includes:

- Per-service gateway and TLS settings in KOTS admin console config
- Helm test for Garage connectivity and S3 round-trip verification
- Support bundle collectors and deployment health analyzers for all infrastructure (cert-manager, CNPG, Envoy Gateway, K8ssandra)
- Status informers for infrastructure deployments
- Builder key for air-gap image discovery
- NFS kernel module preflight upgraded to hard fail
- Consolidated all utility images to alpine:3.21 (removed busybox)
- vm-kubectl Makefile target for remote kubectl on EC VMs
- Updated CI workflow and smoke tests for Garage
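The init-container step described in the commit message above (copying secrets into an emptyDir with mode 0600) might look roughly like the following. This is a minimal sketch, not the chart's actual script: the function name `copy_secrets` and the directory arguments are hypothetical, and it assumes the Secret is mounted as one file per key.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the chart): copy every regular file
# from a mounted Secret directory into a writable directory and force
# mode 0600 on each copy.
# Usage: copy_secrets <source-dir> <destination-dir>
copy_secrets() {
  src="$1"
  dst="$2"
  mkdir -p "$dst" || return 1
  for f in "$src"/*; do
    # Skip anything that is not a file (the glob also skips dot entries).
    [ -f "$f" ] || continue
    # cp follows the symlinks Kubernetes uses inside Secret volumes.
    cp "$f" "$dst/" || return 1
    chmod 0600 "$dst/$(basename "$f")" || return 1
  done
}
```

An init container could call this as `copy_secrets /secrets-ro /secrets`, where both paths are placeholders for the Secret mount and the emptyDir mount.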
Covers per-application Gateway pattern with Envoy Gateway, HTTPRoute for S3/HTTP services, TCPRoute for databases, GatewayClass/EnvoyProxy infrastructure, TLS termination, and KOTS config integration. All examples drawn from the storagebox application. Notes that TCPRoute's experimental status is point-in-time (February 2026) and that Traefik supports TCPRoute when experimental CRDs are installed separately.
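To make the per-application TCP pattern concrete, here is a rough sketch that prints a Gateway and a TCPRoute for a single backend. It is illustrative only: the helper name `render_tcp_gateway`, the GatewayClass name passed in, and the assumption that the backend Service shares the application name are all hypothetical, and the chart itself renders these resources through Helm templates rather than a shell heredoc. TCPRoute is shown at `v1alpha2`, the experimental-channel version.

```shell
#!/bin/sh
# Hypothetical helper: print one TCP Gateway plus one TCPRoute.
# Usage: render_tcp_gateway <app-name> <port> <gateway-class>
# Assumes the backend Service has the same name as the app (an assumption).
render_tcp_gateway() {
  app="$1"
  port="$2"
  class="$3"
  cat <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ${app}
spec:
  gatewayClassName: ${class}
  listeners:
    - name: tcp
      protocol: TCP
      port: ${port}
      allowedRoutes:
        kinds:
          - kind: TCPRoute
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: ${app}
spec:
  parentRefs:
    - name: ${app}
      sectionName: tcp
  rules:
    - backendRefs:
        - name: ${app}
          port: ${port}
EOF
}
```

For example, `render_tcp_gateway postgres 5432 envoy` prints a manifest for a PostgreSQL listener on port 5432. Because each application has its own Gateway, the controller can provision a separate proxy for each one.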
7ffd5b6 to afc198b
scottrigby left a comment
Glad to see a Gateway API pattern!
This PR looks great, except for one question (below)
    CA_CERT="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    SECRET_NAME="{{ include "storagebox.fullname" . }}-garage-s3"

    # ---- Test 1: Admin API health check ----
I like that you added helm tests 💯
My only question is, since $GARAGE_ADMIN is a local k8s service (assuming from):
    env:
      - name: GARAGE_ADMIN
        value: "http://{{ .Release.Name }}-garage:3903"
why do we need to call it with a cert / bearer token? This isn't a pattern I see normally for service communication within the same chart.
Good catch. The cert/bearer token usage was confusing because two separate auth contexts were mixed together:
- Garage admin token (Authorization: Bearer ${ADMIN_TOKEN}): this is Garage's own application-level auth for its admin API on port 3903. It is required for management endpoints like listing buckets and is not related to K8s auth.
- K8s SA token + CA cert (SA_TOKEN, CA_CERT): these were used to call the Kubernetes API server (not Garage) to verify that the S3 credentials Secret exists and to read its contents for the round-trip test. That's where the cert/bearer pattern came from.
Simplified in the latest push: the test now mounts the S3 credentials Secret directly as a volume instead of fetching it via the K8s API at runtime. This removes the serviceAccountName, K8s API calls, SA token, and CA cert entirely. The only Bearer token left is Garage's admin token, which is clearly commented as application-level auth.
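With the Secret mounted as a volume, the test script only needs to read files. A minimal sketch of that read, assuming one file per Secret key, is below. The function name `read_secret_key`, the mount path, and the key names in the usage note are hypothetical, not taken from the chart.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key from a Secret that is
# mounted as a directory of files (one file per key).
# Usage: read_secret_key <mount-dir> <key>
read_secret_key() {
  dir="$1"
  key="$2"
  if [ ! -r "$dir/$key" ]; then
    echo "missing secret key: $key" >&2
    return 1
  fi
  cat "$dir/$key"
}
```

A test might then run `ACCESS_KEY=$(read_secret_key /secrets/s3 access-key-id)`, with no service account, K8s API call, or CA cert involved.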
…ctly Remove Kubernetes API calls from the helm test pod. Instead of fetching the S3 credentials Secret via the K8s API with SA token + CA cert, mount it directly as a volume. This eliminates the serviceAccountName, KUBE_API, SA_TOKEN, and CA_CERT plumbing that was confusing two auth contexts (Garage app-level auth vs K8s API auth).
Summary
Replace ingress-nginx with Envoy Gateway as the Gateway API controller and replace MinIO with Garage for S3-compatible object storage.
Gateway API (Envoy Gateway)
Each application gets its own Gateway resource. Envoy Gateway provisions an independent Envoy proxy Deployment + NodePort Service per Gateway, providing full isolation.
- GatewayClass + EnvoyProxy resource configures NodePort for EC environments
- Envoy Gateway is installed as an EC extension via OCI chart (oci://docker.io/envoyproxy/gateway-helm v1.7.0), which bundles all Gateway API CRDs including the experimental TCPRoute

Garage S3 Storage (replaces MinIO)
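- Init container copies secrets to an emptyDir and applies chmod 0600 (Kubernetes fsGroup adds group-read bits to secret volume mounts, but Garage requires exactly mode 0600)
- Post-install/post-upgrade hook Job uses alpine:3.21 + curl + jq for Garage admin API calls: cluster layout assignment, bucket creation, and S3 credential provisioning

Operational improvements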
- deploymentStatus/statefulsetStatus analyzers for all infrastructure (cert-manager, CNPG, Envoy Gateway, K8ssandra, cass-operator) and application components
- helm template
- alpine:3.21 (removed busybox)
- helmUpgradeFlags
- vm-kubectl target for remote kubectl on EC VMs; removed minio-operator from test-install-operators

Patterns doc
New patterns/gateway-api/README.md covering the per-application Gateway pattern, HTTPRoute/TCPRoute examples, EnvoyProxy/GatewayClass infrastructure, TLS termination, and KOTS integration. Notes that TCPRoute's experimental status is point-in-time.

Test plan
- helm lint passes
- helm template renders all resources with all components enabled
- make validate-config four-way contract passes
- helm-install-test (pending with Garage v1.3.1 fixes)