feat: add helm charts kubernetes#704
Conversation
💥 Preprod Tests: DEPLOYMENT FAILED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
- Added `global.configHostPath` to specify the host directory for network-specific config files.
- Updated deployment documentation to pre-create the namespace and use `--no-hooks` during Helm upgrades.
- Removed unnecessary ConfigMap template for node configuration files.
- Deleted outdated genesis and configuration files from the Helm chart.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
❌ Preprod Tests: FAILED. Tests run against preprod network with live blockchain data.
…pt for improved deployment instructions
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
…mments-for-helm-chart
- activeDeadlineSeconds: 86400 (24h) -> 259200 (72h) for the index-applier Job
- cardano-node startupProbe failureThreshold default: 720 (3h) -> 1920 (8h)

The 24h deadline caused the index-applier pod to be killed by the garbage collector before sync completed. The 3h startup probe was insufficient for ImmutableDB replay on mainnet.
…#716)

## Summary
- Index-applier Job `activeDeadlineSeconds`: 24h -> 72h
- Cardano-node startup probe `failureThreshold` default: 720 (3h) -> 1920 (8h)

The 24h deadline caused the index-applier pod to be killed by the garbage collector before sync completed. The 3h startup probe was insufficient for ImmutableDB replay on mainnet.
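The two thresholds line up with a 15-second probe period (720 × 15s ≈ 3h; 1920 × 15s ≈ 8h). A sketch of what the adjusted probe block could look like — the probed port is an assumption, not taken from the chart:

```yaml
# Illustrative startupProbe for the cardano-node container.
# failureThreshold 1920 x periodSeconds 15 gives roughly an 8h window
# before Kubernetes restarts the container.
startupProbe:
  tcpSocket:
    port: 3001            # assumed node listen port
  periodSeconds: 15
  failureThreshold: 1920
```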
❌ Preprod Tests: FAILED. Tests run against preprod network with live blockchain data.
❌ Preprod Tests: FAILED. Tests run against preprod network with live blockchain data.
❌ Preprod Tests: FAILED. Tests run against preprod network with live blockchain data.
❌ Preprod Tests: FAILED. Tests run against preprod network with live blockchain data.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ printf "%s-rosetta-api" .Release.Name }}
Suggestion:
  name: {{ include "rosetta-api.fullname" . }}
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ printf "%s-load-tests" .Release.Name }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ printf "%s-index-applier" .Release.Name }}
I think it would be best to remove this file completely to avoid any confusion.
component: mithril
spec:
  restartPolicy: OnFailure
  serviceAccountName: {{ include "cardano-rosetta-java.saName" . }}
I don't think that the job needs a ServiceAccount since it is not accessing the Kubernetes API.
## Global configuration shared across all subcharts
## -----------------------------------------------------------------------
global:
  namespace: cardano
not used anymore and we don't want to predefine any namespace.
- name: node-data
  persistentVolumeClaim:
    claimName: {{ include "cardano-node.pvcName" . }}
- name: node-config
node config needs to be a ConfigMap. hostPath cannot reliably be used in multi node setups.
|
|
volumes:
  - name: node-config
    hostPath:
node config needs to be a ConfigMap. hostPath cannot reliably be used in multi node setups.
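A minimal sketch of the ConfigMap-backed volume the reviewer is asking for — the ConfigMap name and mount path are illustrative, not taken from the chart:

```yaml
# Hypothetical ConfigMap-based node config volume. Unlike hostPath, this
# works on multi-node clusters without pre-placing files on every node.
volumes:
  - name: node-config
    configMap:
      name: cardano-node-config   # assumed name, holding the network config files
containers:
  - name: cardano-node
    volumeMounts:
      - name: node-config
        mountPath: /config        # assumed mount path
        readOnly: true
```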
💥 Preprod Tests: DEPLOYMENT FAILED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
…emplates

Use include calls (with trunc 63) instead of inline printf patterns to avoid code duplication and prevent failures when release names are long. Added missing helpers: rosettaApiName, nodeDataPvcName, pgDataPvcName, roleName, rolebindingName, testConnectionName.
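The described pattern — a named helper with `trunc 63` replacing inline `printf` — might look like this (the helper name follows the commit message; the template prefix and body are assumptions):

```
{{/* _helpers.tpl — hypothetical helper; truncating to 63 characters keeps
     the result a valid DNS-1123 label even for long release names */}}
{{- define "cardano-rosetta-java.rosettaApiName" -}}
{{- printf "%s-rosetta-api" .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
```

A template would then reference it as `name: {{ include "cardano-rosetta-java.rosettaApiName" . }}` instead of repeating the `printf` inline.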
Replace standalone Mithril Job + wait-for-mithril K8s API polling with a direct mithril-download init container on cardano-node. This eliminates the need for ServiceAccount, Role, and RoleBinding since no pod accesses the Kubernetes API anymore.
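The replacement described above could be sketched roughly as follows — the image reference and the exact mithril-client invocation are assumptions; only the `mithrilVersion` value comes from the chart defaults:

```yaml
# Hypothetical mithril-download init container on the cardano-node pod.
# Nothing here touches the Kubernetes API, so no ServiceAccount, Role,
# or RoleBinding is required.
initContainers:
  - name: mithril-download
    image: ghcr.io/input-output-hk/mithril-client:2543.1-hotfix  # assumed image ref
    command: ["/bin/sh", "-c"]
    args:
      - |
        # Illustrative: restore the latest snapshot into the node data dir
        # only if the ImmutableDB is not already present.
        [ -d /data/db/immutable ] || \
          mithril-client cardano-db download latest --download-dir /data/db
    volumeMounts:
      - name: node-data
        mountPath: /data
```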
Replace standalone Helm-managed PVCs with native Kubernetes volumeClaimTemplates on both cardano-node and postgresql StatefulSets. This lets K8s manage PVC lifecycle and eliminates the need for helm.sh/resource-policy annotations. Also removed unused global.namespace from values.yaml.
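On a StatefulSet this amounts to something like the following fragment (access mode and size are illustrative):

```yaml
# Sketch: PVC declared via volumeClaimTemplates on the StatefulSet,
# so Kubernetes (not Helm) owns the PVC lifecycle and no
# helm.sh/resource-policy annotation is needed.
spec:
  volumeClaimTemplates:
    - metadata:
        name: node-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 200Gi   # assumed size
```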
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
✅ Preprod Tests: PASSED. Tests run against preprod network with live blockchain data.
#694
What this PR does
Introduces a production-ready Kubernetes deployment for Cardano Rosetta Java using Helm
charts. The project previously shipped only a Docker Compose stack. This PR adds an
umbrella Helm chart with 5 subcharts, 3 pre-built value overlays (mainnet, preprod, K3s),
and full operator documentation — enabling deployment on K3s single-host servers and
managed cloud clusters (EKS, GKE, AKS).
All chart defaults are derived from the Docker Compose stack (`.env.docker-compose*`) as the canonical source of truth. The only intentional deviations from Compose behavior are those technically required by Kubernetes (documented below).
Chart Structure
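The umbrella chart's dependency wiring presumably looks along these lines — chart versions and the condition flag are assumptions; the subchart names come from the PR description:

```yaml
# Sketch of helm/cardano-rosetta-java/Chart.yaml
apiVersion: v2
name: cardano-rosetta-java
version: 0.1.0                    # assumed chart version
dependencies:
  - name: cardano-node
    version: 0.1.0                # assumed
  - name: postgresql
    version: 0.1.0                # assumed
  - name: yaci-indexer
    version: 0.1.0                # assumed
  - name: rosetta-api
    version: 0.1.0                # assumed
  - name: monitoring
    version: 0.1.0                # assumed
    condition: monitoring.enabled # matches the documented enable flag
```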
Startup / Dependency Chain
Mirrors the Docker Compose `depends_on` chain via init containers.
The `cardano-node` pod runs three containers: the Cardano node process, a `socat` sidecar (TCP port 3002 → UNIX socket bridge), and `cardano-submit-api`. `READY 3/3` means all three are up.
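The socat sidecar just described can be pictured as the following container fragment — the image and the shared socket path are assumptions:

```yaml
# Hypothetical socat sidecar bridging the node's UNIX socket to TCP 3002.
# The socket is assumed to be shared with the node container via an
# emptyDir volume mounted at /ipc.
- name: socat
  image: alpine/socat            # assumed image
  args:
    - "TCP-LISTEN:3002,fork,reuseaddr"
    - "UNIX-CONNECT:/ipc/node.socket"
  ports:
    - containerPort: 3002
  volumeMounts:
    - name: node-ipc
      mountPath: /ipc
```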
Key Design Decisions
UNIX socket bridging via socat
Docker Compose shares the node's UNIX socket via volume mounts. Kubernetes pods cannot share UNIX sockets across pod boundaries. A `socat` sidecar inside `cardano-node` exposes the socket as a TCP service on port 3002. Downstream pods (`yaci-indexer`, the `postgresql` init container) connect via TCP and use the `n2c-socat` Spring profile.
Node sync wait — script reuse
The postgresql `wait-for-node-sync` init container sets up a reverse socat bridge (TCP → /tmp/node.socket), then delegates to `/sbin/wait-for-node-sync.sh` from the cardano-node image — the same script used by Docker Compose's `cardano-sync-waiter` service. This ensures identical sync detection logic (progress bar output, slot calculation from genesis files).
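The reverse bridge is essentially the mirror image of the sidecar. A sketch of the init container — the image reference and the `cardano-node` service name are assumptions; the socket path and script path come from the description above:

```yaml
# Hypothetical wait-for-node-sync init container on the postgresql pod.
initContainers:
  - name: wait-for-node-sync
    image: cardano-node:10.5.4     # assumed image ref; must contain the script
    command: ["/bin/sh", "-c"]
    args:
      - |
        # Reverse bridge: present the node's TCP-exposed socket locally
        # as a UNIX socket at /tmp/node.socket.
        socat UNIX-LISTEN:/tmp/node.socket,fork TCP:cardano-node:3002 &
        # Reuse the exact sync-detection script Docker Compose uses.
        exec /sbin/wait-for-node-sync.sh
```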
Automatic index-applier
The index-applier runs as a plain Kubernetes Job by default (`indexApplier.mode: automatic`). It starts with the release and self-waits (via an init container) until the API is ready. `ttlSecondsAfterFinished: 86400` cleans up the completed Job within 24 hours, making upgrades idempotent without Job immutability errors.
Operators who need explicit control can set `indexApplier.mode: hook` to revert to the Helm post-install/post-upgrade hook behavior.
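In manifest form, automatic mode amounts to roughly the following — the naming, images, and wait command are assumptions; the TTL and deadline values appear elsewhere in this PR:

```yaml
# Sketch of the index-applier Job in automatic mode.
apiVersion: batch/v1
kind: Job
metadata:
  name: myrelease-index-applier     # assumed naming
spec:
  ttlSecondsAfterFinished: 86400    # auto-cleanup within 24h -> idempotent upgrades
  activeDeadlineSeconds: 259200     # 72h, per the later fix in #716
  template:
    spec:
      restartPolicy: OnFailure
      initContainers:
        - name: wait-for-api        # self-wait until the Rosetta API answers
          image: busybox            # assumed
          command: ["sh", "-c", "until nc -z rosetta-api 8082; do sleep 5; done"]
      containers:
        - name: index-applier
          image: example/index-applier   # assumed
```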
Hardware profiles
Three built-in profiles scale all resources proportionally: `entry`, `mid`, and `advanced`. Each profile configures CPU/memory for all 4 workloads, PostgreSQL tuning parameters (`shared_buffers`, `work_mem`, `max_connections`, etc.), and HikariCP connection pool sizes.
Data persistence
PVCs for `cardano-node-data` and `postgresql-data` carry `helm.sh/resource-policy: keep` so they survive `helm uninstall`. The Mithril Job also carries this annotation to avoid re-downloading the full snapshot on reinstall. The index-applier Job does not — it recreates cleanly on each deploy.
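The keep policy is plain metadata on the PVC. A sketch, with access mode and size as illustrative assumptions:

```yaml
# PVC that survives `helm uninstall`: Helm skips deleting any resource
# annotated with helm.sh/resource-policy: keep.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cardano-node-data
  annotations:
    helm.sh/resource-policy: keep
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 200Gi   # assumed size
```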
Configuration Defaults vs Docker Compose
All values in `values.yaml` are derived from `.env.docker-compose`:

| Helm value | Default | Compose variable |
| --- | --- | --- |
| global.releaseVersion | "2.1.0" | RELEASE_VERSION |
| global.cardanoNodeVersion | "10.5.4" | CARDANO_NODE_VERSION |
| global.protocolMagic | "764824073" (quoted to prevent scientific notation) | PROTOCOL_MAGIC |
| global.network | mainnet | NETWORK |
| global.sync | true | SYNC |
| global.mithrilVersion | 2543.1-hotfix | MITHRIL_VERSION |
| yaci-indexer.env.searchLimit | 100 | SEARCH_LIMIT |
| yaci-indexer.env.logLevel | error | LOG |
| yaci-indexer.env.removeSpentUtxos | true | REMOVE_SPENT_UTXOS |
| rosetta-api.env.syncGraceSlotsCount | 100 | SYNC_GRACE_SLOTS_COUNT |
| rosetta-api.env.tokenRegistryEnabled | false | TOKEN_REGISTRY_ENABLED |

Preprod overrides in `values-preprod.yaml` mirror `.env.docker-compose-preprod`:

| Helm value | Override | Compose variable |
| --- | --- | --- |
| global.network | preprod | NETWORK=preprod |
| global.protocolMagic | "1" | PROTOCOL_MAGIC=1 |
| global.profile | entry | |
| yaci-indexer.env.removeSpentUtxos | false | REMOVE_SPENT_UTXOS=false |
| yaci-indexer.env.peerDiscovery | true | PEER_DISCOVERY=true |
| yaci-indexer.env.logLevel | INFO | LOG=INFO |
| rosetta-api.env.tokenRegistryLogoFetch | true | TOKEN_REGISTRY_LOGO_FETCH=true |
| rosetta-api.env.tokenRegistryCacheTtlHours | 1 | TOKEN_REGISTRY_CACHE_TTL_HOURS=1 |
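Expressed as a values overlay, the preprod overrides listed above amount to roughly the following — the nesting is inferred from the dotted value paths:

```yaml
# Sketch of values-preprod.yaml, reconstructed from the override table.
global:
  network: preprod
  protocolMagic: "1"
  profile: entry
yaci-indexer:
  env:
    removeSpentUtxos: false
    peerDiscovery: true
    logLevel: INFO
rosetta-api:
  env:
    tokenRegistryLogoFetch: true
    tokenRegistryCacheTtlHours: 1
```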
Intentional K8s-only differences (kept, not aligned):

| Variable | Kubernetes | Docker Compose |
| --- | --- | --- |
| YACI_SPRING_PROFILES | postgres,n2c-socat | postgres,n2c-socket |
| BLOCK_TRANSACTION_API_TIMEOUT_SECS | 120 | 5 |

Monitoring Stack
The optional `monitoring` subchart covers: yaci-indexer (`/actuator/prometheus`), rosetta-api (`/actuator/prometheus`), postgres-exporter (the `pg_monitor` role is granted automatically), and node-exporter.
Enable with `monitoring.enabled: true` (default). Access via `kubectl port-forward`.
Files Changed
New: Helm chart
- helm/cardano-rosetta-java/Chart.yaml
- helm/cardano-rosetta-java/values.yaml
- helm/cardano-rosetta-java/values-preprod.yaml
- helm/cardano-rosetta-java/values-k3s.yaml
- helm/cardano-rosetta-java/templates/
- helm/cardano-rosetta-java/charts/cardano-node/
- helm/cardano-rosetta-java/charts/postgresql/
- helm/cardano-rosetta-java/charts/yaci-indexer/
- helm/cardano-rosetta-java/charts/rosetta-api/
- helm/cardano-rosetta-java/charts/monitoring/
- helm/cardano-rosetta-java/files/
- helm/cardano-rosetta-java/.helmignore
New: Documentation
- docs/docs/install-and-deploy/kubernetes/overview.md
- docs/docs/install-and-deploy/kubernetes/deployment.md
- docs/docs/install-and-deploy/kubernetes/helm-values.md
- docs/docs/install-and-deploy/kubernetes/_category_.json
Modified
- .gitignore: added helm/**/charts/*.tgz and helm/**/Chart.lock to prevent build artifacts being committed
- docker/dockerfiles/postgres/entrypoint.sh
- docs/docs/development/monitoring_setup_guide.md