
Multi-Container StatefulSet Pattern

All web services in Charon use a standardized multi-container StatefulSet architecture with optional lifecycle automation. This pattern provides security, reliability, and consistency across all deployments.

Core Pattern: 3 containers (nginx-tls + application + Tailscale)
With Lifecycle: 5 containers (+ init cleanup + DNS creation sidecar)

Architecture Overview

┌──────────────────────────────────────────────────┐
│          StatefulSet Pod (e.g., grafana-0)       │
├──────────────────────────────────────────────────┤
│                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌────────┐  │
│  │ nginx-tls    │  │ Application  │  │ tail-  │  │
│  │              │  │              │  │ scale  │  │
│  │ Port: 443    │─▶│ Port: 8080   │  │        │  │
│  │ (TLS term)   │  │ (localhost)  │  │ VPN    │  │
│  └──────────────┘  └──────────────┘  └────────┘  │
│         ▲                  │               ▲     │
│         │                  │               │     │
│    TLS certs           App data       VPN state  │
│    (Secret)            (PVC)          (emptyDir) │
└──────────────────────────────────────────────────┘

Container 1: nginx-tls (TLS Termination)

Purpose: HTTPS termination and reverse proxy

Configuration:

  • Image: nginx:alpine
  • Port: 443 (HTTPS)
  • Config: Mounted ConfigMap with server blocks
  • Certificates: Mounts wildcard TLS certificate from cert-manager

Responsibilities:

  • Terminates TLS connections
  • Redirects HTTP (80) → HTTPS (443)
  • Proxies HTTPS β†’ localhost:8080
  • IP-based access control (VPN IPs only)

Example nginx config:

server {
    listen 443 ssl;
    server_name grafana.example.com;

    ssl_certificate /etc/nginx/certs/tls.crt;
    ssl_certificate_key /etc/nginx/certs/tls.key;

    # Only allow VPN IPs
    allow 100.64.0.0/10;
    allow 127.0.0.1;
    deny all;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
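The port-80 redirect listed under responsibilities is not shown in the config above. A minimal sketch of what that server block could look like (hypothetical; not taken from the actual ConfigMap):

```nginx
# Redirect all plain-HTTP traffic to HTTPS (sketch; real config may differ)
server {
    listen 80;
    server_name grafana.example.com;
    return 301 https://$host$request_uri;
}
```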

Container 2: Application (Main Service)

Purpose: Runs the actual web application

Configuration:

  • Port: 8080 (localhost only - not exposed externally)
  • Storage: Persistent volume via volumeClaimTemplate
  • Networking: Only accessible via nginx proxy

Why localhost:8080?

  • Security: Cannot be accessed directly from outside the pod
  • Simplicity: Application doesn't need TLS configuration
  • Flexibility: Easy to swap applications without changing TLS setup

Examples:

  • Grafana: Runs on port 3000; nginx proxies to localhost:3000
  • Open-WebUI: Runs on port 8080
  • Redmine: Runs on port 3000; nginx proxies to localhost:3000

Container 3: tailscale (VPN Sidecar)

Purpose: Connects pod to Tailscale VPN mesh

Configuration:

  • Image: tailscale/tailscale:latest
  • State Storage: emptyDir volume (ephemeral)
  • Auth: Uses TS_KUBE_SECRET for state persistence
  • Security: Requires NET_ADMIN capability for TUN device

Why emptyDir?

  • Prevents PVC binding failures during pod startup
  • Tailscale state is reconstructed from Kubernetes secret
  • No data loss - node registration persists in Headscale
  • Faster pod restarts

Environment Variables (Modern Pattern):

- TS_AUTHKEY: file:///tailscale-auth/authkey # Read from file (NEW)
- TS_HOSTNAME: service.example.com # VPN hostname
- TS_KUBE_SECRET: service-tailscale-state # K8s secret for state
- TS_ACCEPT_DNS: "true" # Enable MagicDNS
- TS_EXTRA_ARGS: --login-server=http://headscale.core.svc.cluster.local:8080

Old Pattern (deprecated):

- TS_AUTHKEY: <from-kubernetes-secret> # Generated by Terraform data.external

Lifecycle Containers (Modern Pattern)

Services using the modern lifecycle pattern include additional containers for automation. See Ollama as the reference implementation.

Init Container: lifecycle-cleanup

Purpose: Pre-startup cleanup and auth key generation

Image: localhost/tailscale-lifecycle:latest (built in-cluster via DaemonSet)

Responsibilities:

  1. Cleanup old Headscale nodes - Removes stale VPN registrations for this hostname
  2. Delete stale DNS records - Cleans up old Cloudflare A records
  3. Generate pre-auth key - Creates fresh Headscale pre-auth key
  4. Write key to file - Saves key to /tailscale-auth/authkey (shared volume)

Why this matters:

  • Eliminates Terraform data.external dependencies
  • No more Python script calls during terraform plan
  • Portable across platforms (Windows, Linux, macOS)
  • Race-free key generation at pod startup
  • Self-healing: always clean slate on restart
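The stale-node cleanup in step 1 can be sketched as a pure function over Headscale's node list. `select_stale_nodes` and the node-dict shape are hypothetical illustrations, not the actual lifecycle_cleanup.py API:

```python
# Sketch of step 1: pick Headscale registrations to delete before the pod
# re-registers. The node-record shape ({"id", "name", "online"}) is an
# assumption for illustration; the real script talks to the Headscale API.

def select_stale_nodes(nodes, hostname):
    """Return IDs of existing registrations for this hostname.

    Every prior registration for the pod's hostname is stale, because the
    init container is about to mint a fresh pre-auth key and re-register.
    """
    return [n["id"] for n in nodes if n["name"] == hostname]

nodes = [
    {"id": 1, "name": "grafana.example.com", "online": False},
    {"id": 2, "name": "ollama.example.com", "online": True},
    {"id": 3, "name": "grafana.example.com", "online": True},
]
print(select_stale_nodes(nodes, "grafana.example.com"))  # → [1, 3]
```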

Configuration:

env:
  - name: HEADSCALE_NAMESPACE
    value: "core"
  - name: CLOUDFLARE_API_TOKEN
    value: "your-token"
  - name: CLOUDFLARE_ZONE_ID
    value: "your-zone-id"
  - name: DOMAIN_NAME
    value: "example.com"
  - name: AUTHKEY_OUTPUT_PATH
    value: "/tailscale-auth/authkey"
volumeMounts:
  - name: tailscale-auth
    mountPath: /tailscale-auth

Sidecar Container: lifecycle-dns-create

Purpose: Create DNS record after Tailscale registers

Image: localhost/tailscale-lifecycle:latest

Responsibilities:

  1. Wait for Tailscale registration - Polls Headscale API until node appears
  2. Get VPN IP address - Retrieves assigned 100.64.x.x IP
  3. Create DNS A record - Points hostname to VPN IP in Cloudflare
  4. Monitor continuously - Updates DNS if IP changes

Why a sidecar?

  • Runs alongside main application
  • Non-blocking: app starts while DNS propagates
  • Self-healing: re-creates DNS if deleted
  • Independent lifecycle from application
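The wait-and-retry behaviour of steps 1–2 might look like the loop below; `wait_for_vpn_ip` and its `fetch_ip` callback are hypothetical helpers, shown only to illustrate the polling contract implied by MAX_RETRIES and RETRY_INTERVAL:

```python
import time

# Sketch of the sidecar's polling loop (hypothetical helper, not the real
# lifecycle_dns_create.py). `fetch_ip` stands in for a Headscale API call
# that returns the node's 100.64.x.x address, or None if not yet registered.

def wait_for_vpn_ip(fetch_ip, max_retries=30, retry_interval=10):
    for _attempt in range(max_retries):
        ip = fetch_ip()
        if ip is not None:
            return ip
        time.sleep(retry_interval)
    raise TimeoutError("node never appeared in Headscale")

# Example: the IP becomes available on the third poll.
attempts = iter([None, None, "100.64.0.7"])
print(wait_for_vpn_ip(lambda: next(attempts), retry_interval=0))  # → 100.64.0.7
```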

Configuration:

env:
  - name: HEADSCALE_NAMESPACE
    value: "core"
  - name: CLOUDFLARE_API_TOKEN
    value: "your-token"
  - name: CLOUDFLARE_ZONE_ID
    value: "your-zone-id"
  - name: DOMAIN_NAME
    value: "example.com"
  - name: MAX_RETRIES
    value: "30"
  - name: RETRY_INTERVAL
    value: "10"

Lifecycle Image Build

The localhost/tailscale-lifecycle:latest image is built in-cluster using a DaemonSet:

Build Process (terraform/tailscale-lifecycle-build.tf):

  1. DaemonSet runs on all nodes
  2. Init container uses Buildah to build Python image
  3. Scripts sourced from external repository (not in this repo)
  4. Image saved to hostPath (/var/lib/tailscale-lifecycle-image.tar)
  5. Imported into containerd via ctr -n k8s.io images import
  6. Available as localhost/tailscale-lifecycle:latest on each node
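A Containerfile for the Buildah step (step 2) might look roughly like this. The real build definition lives in terraform/tailscale-lifecycle-build.tf and the external scripts repository, so treat every line here as an assumption:

```dockerfile
# Hypothetical sketch of the lifecycle image build
FROM python:3.12-alpine
# Scripts come from the external lifecycle repository (step 3)
COPY lifecycle_cleanup.py lifecycle_dns_create.py /usr/local/bin/
RUN pip install --no-cache-dir requests
```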

Note: Lifecycle scripts (lifecycle_cleanup.py, lifecycle_dns_create.py) are NOT in this repository's scripts/ directory. They're managed in a separate repository and were previously included as a git submodule (now removed).

Migration Status

Modern Pattern (lifecycle-generated keys):

  • ✅ Ollama - Reference implementation

Old Pattern (Terraform data.external):

  • ⏳ Redmine
  • ⏳ FreeIPA
  • ⏳ Grafana
  • ⏳ Open-WebUI
  • ⏳ ArgoCD

See REFACTOR_CHECKLIST.md for migration tracking.

Key Benefits

Security

  • TLS Everywhere: All traffic encrypted
  • VPN-Only Access: nginx restricts to 100.64.0.0/10
  • No Direct Exposure: Apps only on localhost
  • IP Allowlisting: Built into nginx config

Stability

  • Stable Pod Names: StatefulSet provides service-0, service-1, etc.
  • Stable Storage: PVCs follow pod through restarts
  • Stable VPN IPs: Headscale maintains IP assignments
  • Graceful Restarts: Pods restart individually

Reliability

  • No PVC Deadlocks: emptyDir for Tailscale prevents binding issues
  • Automatic Cleanup: Offline nodes removed by post-deployment script
  • Health Checks: Each container can have its own liveness probe
  • Isolated Failures: Container restart doesn't affect pod

Consistency

  • Uniform Pattern: All services deployed the same way
  • Predictable Behavior: Know what to expect from any service
  • Easy Debugging: Same troubleshooting steps for all services
  • Reusable Configs: Templates work for any new service

Maintainability

  • Clear Separation: Each container has single responsibility
  • Easy Updates: Update application without changing TLS config
  • Observable: kubectl logs <pod> -c <container> for each part
  • Testable: Can test each container independently

NGINX Ingress Controllers

Charon deploys two NGINX Ingress Controllers with different security profiles:

Internal Controller (VPN-Only)

Purpose: Route traffic to services from VPN clients only

Configuration:

  • Type: DaemonSet (one per node)
  • Network: hostNetwork: true (binds to node ports 80/443)
  • Tailscale: Integrated directly (no sidecar)
  • Hostname: nginx-ingress-<node-name>
  • Ingress Class: nginx (default)

Access Control:

  • Restricts to Tailscale IP range: 100.64.0.0/10
  • All service Ingresses use this controller
  • No public internet access
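A service Ingress bound to the internal controller could look like this (hypothetical service name; the allowlist uses the standard ingress-nginx whitelist-source-range annotation):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana            # hypothetical example service
  namespace: dev
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: "100.64.0.0/10"
spec:
  ingressClassName: nginx  # internal (VPN-only) controller
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 443
```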

External Controller (Enrollment Only)

Purpose: Public access for Headscale VPN enrollment

Configuration:

  • Type: LoadBalancer (public IP)
  • Tailscale: Integrated directly
  • Ingress Class: nginx-external
  • Namespace: ingress-nginx-external

Use Cases:

  • Headscale web UI enrollment
  • Initial VPN client registration
  • Public-facing enrollment endpoint

Security:

  • Only Headscale Ingress uses this controller
  • All other services use internal controller
  • Limited attack surface
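By contrast, the Headscale enrollment Ingress would select the external class. A sketch (service name and port are assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: headscale
  namespace: core
spec:
  ingressClassName: nginx-external  # public enrollment controller only
  rules:
    - host: headscale.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: headscale
                port:
                  number: 8080
```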

Deployment Pattern

Terraform Template

resource "kubernetes_stateful_set" "my_service" {
  count = var.my_service_enabled ? 1 : 0

  metadata {
    name      = "my-service"
    namespace = var.namespace
  }

  spec {
    replicas     = 1
    service_name = "my-service"

    selector {
      match_labels = { app = "my-service" }
    }

    template {
      metadata {
        labels = { app = "my-service" }
      }

      spec {
        # Container 1: NGINX TLS
        container {
          name  = "nginx-tls"
          image = "nginx:alpine"
          port { container_port = 443 }

          volume_mount {
            name       = "tls-certs"
            mount_path = "/etc/nginx/certs"
          }
          volume_mount {
            name       = "nginx-conf"
            mount_path = "/etc/nginx/conf.d"
          }
        }

        # Container 2: Application
        container {
          name  = "my-service"
          image = "myapp:latest"
          port { container_port = 8080 }

          volume_mount {
            name       = "app-data"
            mount_path = "/app/data"
          }
        }

        # Container 3: Tailscale (if enabled)
        dynamic "container" {
          for_each = var.my_service_tailscale_enabled ? [1] : []
          content {
            name  = "tailscale"
            image = "tailscale/tailscale:latest"

            security_context {
              capabilities { add = ["NET_ADMIN"] }
            }

            env {
              name  = "TS_KUBE_SECRET"
              value = "tailscale"
            }
            env {
              name  = "TS_HOSTNAME"
              value = "my-service.example.com"
            }

            volume_mount {
              name       = "tailscale-state"
              mount_path = "/var/lib/tailscale"
            }
          }
        }

        # Volumes
        volume {
          name = "tls-certs"
          secret { secret_name = "wildcard-tls" }
        }
        volume {
          name = "nginx-conf"
          config_map { name = "my-service-nginx" }
        }
        volume {
          name = "tailscale-state"
          empty_dir {}
        }
      }
    }

    # Persistent storage
    volume_claim_template {
      metadata { name = "app-data" }
      spec {
        access_modes = ["ReadWriteOnce"]
        resources {
          requests = { storage = "10Gi" }
        }
      }
    }
  }
}
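The my-service-nginx ConfigMap referenced by the template would be defined alongside it. A sketch (the server block mirrors the nginx example earlier; names are illustrative):

```hcl
resource "kubernetes_config_map" "my_service_nginx" {
  metadata {
    name      = "my-service-nginx"
    namespace = var.namespace
  }

  data = {
    "default.conf" = <<-EOF
      server {
        listen 443 ssl;
        ssl_certificate /etc/nginx/certs/tls.crt;
        ssl_certificate_key /etc/nginx/certs/tls.key;

        allow 100.64.0.0/10;
        allow 127.0.0.1;
        deny all;

        location / {
          proxy_pass http://localhost:8080;
        }
      }
    EOF
  }
}
```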

Debugging

Check Container Status

# Get pod status
kubectl get pods -n dev my-service-0

# Check all containers
kubectl get pod my-service-0 -n dev -o jsonpath='{.status.containerStatuses[*].name}'

# Check if all containers are ready
kubectl get pod my-service-0 -n dev -o jsonpath='{.status.containerStatuses[*].ready}'

View Container Logs

# nginx-tls logs
kubectl logs -n dev my-service-0 -c nginx-tls

# Application logs
kubectl logs -n dev my-service-0 -c my-service

# Tailscale logs
kubectl logs -n dev my-service-0 -c tailscale

# Follow logs
kubectl logs -n dev my-service-0 -c my-service -f

Exec into Containers

# Shell into application container
kubectl exec -n dev my-service-0 -c my-service -it -- /bin/sh

# Test nginx config
kubectl exec -n dev my-service-0 -c nginx-tls -- nginx -t

# Check Tailscale status
kubectl exec -n dev my-service-0 -c tailscale -- tailscale status

Common Issues

Container won't start:

kubectl describe pod my-service-0 -n dev
# Look for ImagePullBackOff, CrashLoopBackOff, etc.

Can't reach service:

# Check if Tailscale is connected
kubectl exec -n dev my-service-0 -c tailscale -- tailscale status

# Check nginx is listening
kubectl exec -n dev my-service-0 -c nginx-tls -- netstat -tlnp

# Test localhost connection (busybox wget; curl may not be present in nginx:alpine)
kubectl exec -n dev my-service-0 -c nginx-tls -- wget -qO- http://localhost:8080

Related Documentation


Navigation: 📚 Documentation Index | 🏠 Home