feat: Kubernetes deployment support (Dockerfile, manifests, docs) #13
Open
chaosreload wants to merge 7 commits into zerobootdev:main
Conversation
chaosreload pushed a commit to chaosreload/zeroboot that referenced this pull request on Mar 25, 2026:
P0 fixes:
- serve: change default bind from 127.0.0.1 to 0.0.0.0 to fix K8s health probes and Service routing; add --bind flag for explicit control
- entrypoint.sh: pass $ZEROBOOT_BIND (default 0.0.0.0) to the serve command

P1 fixes:
- deployment.yaml: replace devices.kubevirt.io/kvm (requires KubeVirt) with privileged: true + hostPath /dev/kvm (works on plain EKS)
- deployment.yaml: increase livenessProbe initialDelaySeconds from 60 to 120; template creation takes ~19 s, and 60 s was too tight on slow EBS attach
- deployment.yaml: add /dev/kvm hostPath volume and mount

EKS self-managed node group (new file):
- deploy/eks/eks-self-managed-kvm.sh: end-to-end script to create a self-managed ASG + Launch Template with CpuOptions.NestedVirtualization=enabled; EKS managed node groups silently drop CpuOptions, and self-managed bypasses this
- deploy/eks/eks-with-kvm-nodegroup.yaml: add a warning about CpuOptions being dropped by managed node groups (documented as a gap vs. the official AWS docs)

Docs:
- docs/KUBERNETES.md: add an EKS managed vs. self-managed section with root-cause analysis and the recommended self-managed approach
- docs/KUBERNETES.md: add a server bind address configuration note
- docs/KUBERNETES.md: add a ZEROBOOT_BIND env var reference

Validated on: EKS 1.31 / ap-southeast-1 / c8i.xlarge (nested virt)
Ref: chaosreload/zeroboot PR zerobootdev#13
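The /dev/kvm wiring from the P1 fixes could be sketched as a pod-spec fragment roughly like the one below. This is a minimal sketch, not the PR's actual deployment.yaml: the container name, image, and port are placeholders, while `privileged: true`, the hostPath mount, the `/v1/health` probe path, and `initialDelaySeconds: 120` come from the commit messages and testing notes.

```yaml
# Pod-spec fragment (sketch). Image name and port are placeholders.
containers:
  - name: zeroboot
    image: zeroboot:latest          # placeholder
    securityContext:
      privileged: true              # required for /dev/kvm on plain EKS
    volumeMounts:
      - name: dev-kvm
        mountPath: /dev/kvm
    livenessProbe:
      httpGet:
        path: /v1/health            # probe path from the testing notes
        port: 8080                  # assumed port
      initialDelaySeconds: 120      # template creation is slow on cold EBS
volumes:
  - name: dev-kvm
    hostPath:
      path: /dev/kvm
      type: CharDevice
```

The hostPath + privileged approach trades isolation for portability: it works on any node that exposes /dev/kvm, without installing a KubeVirt device plugin.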
- Dockerfile: multi-stage build (Rust compiler + Ubuntu runtime); Firecracker bundled; vmlinux/rootfs mounted via PVC
- docker/entrypoint.sh: handles template creation on first boot, skips if a snapshot already exists on the PVC
- deploy/k8s/: namespace, PVC (gp3 20Gi), Deployment with KVM device plugin resource, podAntiAffinity, health probes, HPA, Service
- docs/KUBERNETES.md: EC2 instance family requirements, KVM device plugin setup, PVC storage guidance, autoscaling with a custom metric (zeroboot_concurrent_forks), Karpenter NodePool example, ServiceMonitor config, configuration reference

Closes zerobootdev#9
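The entrypoint behavior described above (create the template on first boot, skip when a snapshot already exists, honor the later ZEROBOOT_BIND fix) might be sketched like this. The snapshot path and the commented-out zeroboot invocations are assumptions, not the project's real CLI; only the ZEROBOOT_BIND name and 0.0.0.0 default come from the commits.

```shell
#!/bin/sh
# Sketch of docker/entrypoint.sh behavior; paths and commands are assumed.
SNAP=/data/snapshot/template.snap        # assumed PVC mount point
BIND="${ZEROBOOT_BIND:-0.0.0.0}"         # bind default from the P0 fix

if [ ! -e "$SNAP" ]; then
  echo "no snapshot found, creating template..."
  # zeroboot template create ...         # assumed command
else
  echo "snapshot present, skipping template creation"
fi
# exec zeroboot serve --bind "$BIND"     # assumed invocation
```

Defaulting to 0.0.0.0 matters in-cluster because kubelet probes and Service traffic arrive on the Pod IP, which a 127.0.0.1 bind never sees.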
c8i/m8i/r8i support nested virtualization on regular (non-metal) sizes via --cpu-options NestedVirtualization=enabled. Other families (c6i, m6i, etc.) require .metal sizes for KVM access. Update the instance table to make this distinction explicit.
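In Launch Template form, the flag above corresponds to a CpuOptions block roughly like the fragment below. This is a sketch only: the NestedVirtualization spelling follows the commit message, and the AMI is a placeholder.

```json
{
  "LaunchTemplateData": {
    "InstanceType": "c8i.xlarge",
    "ImageId": "ami-PLACEHOLDER",
    "CpuOptions": { "NestedVirtualization": "enabled" }
  }
}
```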
Three files covering two scenarios (one-shot, or cluster first and node group after):
- eks-with-kvm-nodegroup.yaml: cluster + KVM node group in one shot
- eks-cluster-only.yaml: cluster only (no node groups)
- eks-add-kvm-nodegroup.yaml: add a KVM node group to an existing cluster

All configs use c8i.xlarge with cpuOptions.nestedVirtualization=enabled, the AmazonLinux2023 AMI, and the aws-ebs-csi-driver addon for PVC support.
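A config in the shape described above might look roughly like this. Cluster name, region, and capacity are placeholders, and the cpuOptions field is spelled as in the commit message; note the caveat documented later in this PR that managed node groups may silently drop CpuOptions.

```yaml
# Sketch of an eksctl config per the commit message; names are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: zeroboot              # placeholder
  region: ap-southeast-1
managedNodeGroups:
  - name: kvm-nodes
    instanceType: c8i.xlarge
    amiFamily: AmazonLinux2023
    desiredCapacity: 2        # placeholder
    cpuOptions:
      nestedVirtualization: enabled
addons:
  - name: aws-ebs-csi-driver  # enables the gp3 PVC
```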
The src/main.rs and entrypoint.sh bind-address changes belong in a dedicated fix PR; this PR should only contain K8s deployment configs and docs. The deployment.yaml already handles the /dev/kvm access via the hostPath approach, and users can add a socat sidecar to work around the 127.0.0.1 bind if needed until the fix PR is merged.
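The socat workaround mentioned here could look like the following sidecar container. Everything in it is an assumption for illustration: the ports (8081 external, 8080 for the server's loopback bind) and the image are hypothetical; the sidecar works because containers in a Pod share a network namespace, so 127.0.0.1 reaches the server.

```yaml
# Hypothetical sidecar: forward Pod-IP traffic on 8081 to the server's
# loopback-only listener on 8080. Ports and image are assumptions; the
# Service would need to target port 8081 instead of the server port.
- name: bind-proxy
  image: alpine/socat
  args:
    - TCP-LISTEN:8081,fork,reuseaddr
    - TCP:127.0.0.1:8080
```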
Force-pushed from a2898a1 to c41f21c.
Summary
Add first-class Kubernetes deployment support for Zeroboot, addressing all items in #9.
Changes
Dockerfile

Multi-stage build: Rust 1.86 compiler stage + Ubuntu 22.04 runtime. The Firecracker binary is bundled at build time. vmlinux and rootfs images are not baked into the image: they are mounted via PersistentVolume, keeping the image lean and allowing runtime upgrades without rebuilding.

docker/entrypoint.sh

Checks /dev/kvm access on startup (fast-fail with a clear error message).

deploy/k8s/

Reference manifests ready to apply:
- namespace.yaml: dedicated zeroboot namespace
- pvc.yaml: 20 Gi gp3 PVC for snapshot persistence (eliminates the 15 s re-snapshot on restart)
- deployment.yaml: privileged + hostPath /dev/kvm (works on plain EKS without KubeVirt); podAntiAffinity to spread Pods across Nodes; tuned readiness/liveness probes (initialDelaySeconds: 120)
- service.yaml: ClusterIP Service
- hpa.yaml: custom-metric HPA on zeroboot_concurrent_forks

deploy/eks/

EKS-specific deployment configs:
- eks-cluster-only.yaml: create a cluster without node groups (Step 1)
- eks-self-managed-kvm.sh: end-to-end script for a self-managed ASG with CpuOptions.NestedVirtualization=enabled; see the EKS note below
- eks-with-kvm-nodegroup.yaml: eksctl managed node group config
- eks-add-kvm-nodegroup.yaml: add a KVM node group to an existing cluster

docs/KUBERNETES.md

Comprehensive deployment guide covering EC2 instance family requirements, KVM device setup, PVC storage guidance, custom-metric autoscaling, a Karpenter NodePool example, ServiceMonitor config, and a configuration reference.
EKS note: CpuOptions being silently dropped

TL;DR: EKS managed node groups silently drop CpuOptions.NestedVirtualization: your nodes start without /dev/kvm even if the eksctl YAML looks correct.

Root cause: When you provide a Launch Template to a managed node group, EKS generates a new internal LT and merges only a subset of fields. CpuOptions is not in that subset, despite not appearing in the official blocked-fields list. This is a documentation gap.

Solution: Use deploy/eks/eks-self-managed-kvm.sh, which creates an ASG + Launch Template directly via the AWS CLI, bypassing EKS's internal LT generation entirely.

Architecture note
Kubernetes manages the lifecycle of the zeroboot server process; it does not schedule individual sandboxes. Each /v1/exec request is handled entirely within the Pod via a KVM fork (~0.8 ms). K8s's role is capacity management: health checks, rolling updates, and horizontal scaling.

Testing
Validated on EKS 1.31 / ap-southeast-1 / c8i.xlarge (self-managed node group with nested virt):
- /dev/kvm present on nodes
- /v1/health reachable from the Service ClusterIP
- CODE: print(1+1) → 2 (~100 ms)
- CODE: import numpy... → result (~265 ms)
- cat /etc/os-release → content (~28 ms)

Depends on: #14
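The CODE round-trips above presumably map onto /v1/exec request bodies of roughly this shape. Only the endpoint path and the payloads come from this PR; the field name is invented for illustration.

```json
{ "code": "print(1+1)" }
```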
Closes #9