All web services in Charon use a standardized multi-container StatefulSet architecture with optional lifecycle automation. This pattern provides security, reliability, and consistency across all deployments.
- Core Pattern: 3 containers (nginx-tls + application + Tailscale)
- With Lifecycle: 5 containers (+ init cleanup + DNS creation sidecar)
```
┌───────────────────────────────────────────────────┐
│ StatefulSet Pod (e.g., grafana-0)                 │
├───────────────────────────────────────────────────┤
│                                                   │
│  ┌────────────┐    ┌─────────────┐   ┌────────┐   │
│  │ nginx-tls  │    │ Application │   │ tail-  │   │
│  │            │    │             │   │ scale  │   │
│  │ Port: 443  │───▶│ Port: 8080  │   │        │   │
│  │ (TLS term) │    │ (localhost) │   │  VPN   │   │
│  └────────────┘    └─────────────┘   └────────┘   │
│        ▲                  ▲               ▲       │
│        │                  │               │       │
│    TLS certs          App data        VPN state   │
│    (Secret)           (PVC)           (emptyDir)  │
└───────────────────────────────────────────────────┘
```
Purpose: HTTPS termination and reverse proxy
Configuration:
- Image: `nginx:alpine`
- Port: 443 (HTTPS)
- Config: Mounted ConfigMap with server blocks
- Certificates: Mounts wildcard TLS certificate from cert-manager
Responsibilities:
- Terminates TLS connections
- Redirects HTTP (80) → HTTPS (443)
- Proxies HTTPS → localhost:8080
- IP-based access control (VPN IPs only)
Example nginx config:

```nginx
server {
    listen 443 ssl;
    server_name grafana.example.com;

    ssl_certificate     /etc/nginx/certs/tls.crt;
    ssl_certificate_key /etc/nginx/certs/tls.key;

    # Only allow VPN IPs
    allow 100.64.0.0/10;
    allow 127.0.0.1;
    deny all;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Purpose: Runs the actual web application
Configuration:
- Port: 8080 (localhost only - not exposed externally)
- Storage: Persistent volume via volumeClaimTemplate
- Networking: Only accessible via nginx proxy
Why localhost:8080?
- Security: Cannot be accessed directly from outside the pod
- Simplicity: Application doesn't need TLS configuration
- Flexibility: Easy to swap applications without changing TLS setup
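The loopback-only binding can be demonstrated in a few lines of Python — a toy stand-in for the application container, not code from this repo:

```python
import socket

# A toy "application" socket bound the way the pattern prescribes:
# loopback only, so nothing outside the pod can reach it directly.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0 = pick any free port
srv.listen(1)

host, port = srv.getsockname()
print(host)  # 127.0.0.1 — not 0.0.0.0, so no external interface is exposed

# nginx, which shares the pod's network namespace, connects over loopback
client = socket.create_connection(("127.0.0.1", port), timeout=1)
client.close()
srv.close()
```

Because all containers in a pod share one network namespace, nginx reaches the app at `localhost:<port>` while anything outside the pod cannot.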
Examples:
- Grafana: Runs on port 3000, proxied from nginx
- Open-WebUI: Runs on port 8080
- Redmine: Runs on port 3000
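When the application listens on a different port, only the `proxy_pass` target changes — a minimal nginx fragment for the Grafana case (hostname and port taken from the examples above):

```nginx
location / {
    # Grafana listens on 3000 inside the pod; TLS is still terminated on 443
    proxy_pass http://localhost:3000;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
}
```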
Purpose: Connects pod to Tailscale VPN mesh
Configuration:
- Image: `tailscale/tailscale:latest`
- State Storage: emptyDir volume (ephemeral)
- Auth: Uses `TS_KUBE_SECRET` for state persistence
- Security: Requires `NET_ADMIN` capability for TUN device
Why emptyDir?
- Prevents PVC binding failures during pod startup
- Tailscale state is reconstructed from Kubernetes secret
- No data loss - node registration persists in Headscale
- Faster pod restarts
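In the pod spec this is just an `emptyDir` volume mounted at Tailscale's state directory — a minimal sketch (volume name illustrative):

```yaml
volumes:
  - name: tailscale-state
    emptyDir: {}            # ephemeral; durable state lives in the TS_KUBE_SECRET
containers:
  - name: tailscale
    volumeMounts:
      - name: tailscale-state
        mountPath: /var/lib/tailscale
```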
Environment Variables (Modern Pattern):

```yaml
- TS_AUTHKEY: file:///tailscale-auth/authkey  # Read from file (NEW)
- TS_HOSTNAME: service.example.com            # VPN hostname
- TS_KUBE_SECRET: service-tailscale-state     # K8s secret for state
- TS_ACCEPT_DNS: "true"                       # Enable MagicDNS
- TS_EXTRA_ARGS: --login-server=http://headscale.core.svc.cluster.local:8080
```

Old Pattern (deprecated):

```yaml
- TS_AUTHKEY: <from-kubernetes-secret>  # Generated by Terraform data.external
```

Services using the modern lifecycle pattern include additional containers for automation. See Ollama as the reference implementation.
Purpose: Pre-startup cleanup and auth key generation
Image: localhost/tailscale-lifecycle:latest (built in-cluster via DaemonSet)
Responsibilities:
- Cleanup old Headscale nodes - Removes stale VPN registrations for this hostname
- Delete stale DNS records - Cleans up old Cloudflare A records
- Generate pre-auth key - Creates fresh Headscale pre-auth key
- Write key to file - Saves key to `/tailscale-auth/authkey` (shared volume)
Why this matters:
- Eliminates Terraform `data.external` dependencies
- No more Python script calls during `terraform plan`
- Portable across platforms (Windows, Linux, macOS)
- Race-free key generation at pod startup
- Self-healing: always clean slate on restart
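The cleanup step reduces to filtering Headscale's node list for existing registrations of this hostname and removing them so the pod re-registers cleanly. A hedged Python sketch — the field names and function are illustrative, not the actual `lifecycle_cleanup.py` API:

```python
def nodes_to_remove(nodes, hostname):
    """Return IDs of existing Headscale registrations for this hostname.

    `nodes` mimics parsed output of a Headscale node-list call; the
    schema (id/name keys) is an assumption for illustration only.
    """
    return [n["id"] for n in nodes if n["name"] == hostname]

# Example: two stale registrations for ollama, one unrelated node
nodes = [
    {"id": 1, "name": "ollama.example.com"},
    {"id": 2, "name": "ollama.example.com"},
    {"id": 3, "name": "grafana.example.com"},
]
print(nodes_to_remove(nodes, "ollama.example.com"))  # [1, 2]
```

Deleting every match (not just offline ones) is what gives the "always clean slate on restart" property: the pod then registers a single fresh node with the new pre-auth key.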
Configuration:

```yaml
env:
  - name: HEADSCALE_NAMESPACE
    value: "core"
  - name: CLOUDFLARE_API_TOKEN
    value: "your-token"
  - name: CLOUDFLARE_ZONE_ID
    value: "your-zone-id"
  - name: DOMAIN_NAME
    value: "example.com"
  - name: AUTHKEY_OUTPUT_PATH
    value: "/tailscale-auth/authkey"
volumeMounts:
  - name: tailscale-auth
    mountPath: /tailscale-auth
```

Purpose: Create DNS record after Tailscale registers
Image: localhost/tailscale-lifecycle:latest
Responsibilities:
- Wait for Tailscale registration - Polls Headscale API until node appears
- Get VPN IP address - Retrieves assigned 100.64.x.x IP
- Create DNS A record - Points hostname to VPN IP in Cloudflare
- Monitor continuously - Updates DNS if IP changes
Why a sidecar?
- Runs alongside main application
- Non-blocking: app starts while DNS propagates
- Self-healing: re-creates DNS if deleted
- Independent lifecycle from application
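The wait-for-registration step is a bounded retry loop driven by `MAX_RETRIES` and `RETRY_INTERVAL`. A hedged Python sketch — the `poll` callable stands in for the Headscale API query, whose details are not shown here:

```python
import time

def wait_for_vpn_ip(poll, max_retries=30, retry_interval=10, sleep=time.sleep):
    """Poll until the node's VPN IP appears, mirroring MAX_RETRIES/RETRY_INTERVAL.

    `poll` is any callable returning the assigned 100.64.x.x address or None;
    the real sidecar queries Headscale (API details assumed, not verified).
    """
    for _ in range(max_retries):
        ip = poll()
        if ip:
            return ip
        sleep(retry_interval)
    raise TimeoutError("node never registered with Headscale")

# Simulate a node that registers on the third poll
responses = iter([None, None, "100.64.0.7"])
ip = wait_for_vpn_ip(lambda: next(responses), retry_interval=0,
                     sleep=lambda s: None)
print(ip)  # 100.64.0.7
```

Once the IP is known, the sidecar upserts the Cloudflare A record and keeps polling so a changed IP is re-published.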
Configuration:

```yaml
env:
  - name: HEADSCALE_NAMESPACE
    value: "core"
  - name: CLOUDFLARE_API_TOKEN
    value: "your-token"
  - name: CLOUDFLARE_ZONE_ID
    value: "your-zone-id"
  - name: DOMAIN_NAME
    value: "example.com"
  - name: MAX_RETRIES
    value: "30"
  - name: RETRY_INTERVAL
    value: "10"
```

The `localhost/tailscale-lifecycle:latest` image is built in-cluster using a DaemonSet:
Build Process (terraform/tailscale-lifecycle-build.tf):
- DaemonSet runs on all nodes
- Init container uses Buildah to build Python image
- Scripts sourced from external repository (not in this repo)
- Image saved to hostPath (`/var/lib/tailscale-lifecycle-image.tar`)
- Imported into containerd via `ctr -n k8s.io images import`
- Available as `localhost/tailscale-lifecycle:latest` on each node
Note: Lifecycle scripts (lifecycle_cleanup.py, lifecycle_dns_create.py) are NOT in this repository's scripts/ directory. They're managed in a separate repository and were previously included as a git submodule (now removed).
Modern Pattern (lifecycle-generated keys):
- ✅ Ollama - Reference implementation
Old Pattern (Terraform data.external):
- ⏳ Redmine
- ⏳ FreeIPA
- ⏳ Grafana
- ⏳ Open-WebUI
- ⏳ ArgoCD
See REFACTOR_CHECKLIST.md for migration tracking.
- TLS Everywhere: All traffic encrypted
- VPN-Only Access: nginx restricts to 100.64.0.0/10
- No Direct Exposure: Apps only on localhost
- IP Allowlisting: Built into nginx config
- Stable Pod Names: StatefulSet provides `service-0`, `service-1`, etc.
- Stable Storage: PVCs follow pod through restarts
- Stable VPN IPs: Headscale maintains IP assignments
- Graceful Restarts: Pods restart individually
- No PVC Deadlocks: emptyDir for Tailscale prevents binding issues
- Automatic Cleanup: Offline nodes removed by post-deployment script
- Health Checks: Each container can have its own liveness probe
- Isolated Failures: Container restart doesn't affect pod
- Uniform Pattern: All services deployed the same way
- Predictable Behavior: Know what to expect from any service
- Easy Debugging: Same troubleshooting steps for all services
- Reusable Configs: Templates work for any new service
- Clear Separation: Each container has single responsibility
- Easy Updates: Update application without changing TLS config
- Observable: `kubectl logs <pod> -c <container>` for each part
- Testable: Can test each container independently
Charon deploys two NGINX Ingress Controllers with different security profiles:
Purpose: Route traffic to services from VPN clients only
Configuration:
- Type: DaemonSet (one per node)
- Network: `hostNetwork: true` (binds to node ports 80/443)
- Tailscale: Integrated directly (no sidecar)
- Hostname: `nginx-ingress-<node-name>`
- Ingress Class: `nginx` (default)
Access Control:
- Restricts to Tailscale IP range: `100.64.0.0/10`
- All service Ingresses use this controller
- No public internet access
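A service Ingress pinned to the internal controller might look like the following sketch (names, host, and backend port are illustrative; the allowlist annotation mirrors the VPN range above):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: "100.64.0.0/10"
spec:
  ingressClassName: nginx          # internal controller
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 443      # the service's nginx-tls listener
```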
Purpose: Public access for Headscale VPN enrollment
Configuration:
- Type: LoadBalancer (public IP)
- Tailscale: Integrated directly
- Ingress Class: `nginx-external`
- Namespace: `ingress-nginx-external`
Use Cases:
- Headscale web UI enrollment
- Initial VPN client registration
- Public-facing enrollment endpoint
Security:
- Only Headscale Ingress uses this controller
- All other services use internal controller
- Limited attack surface
```hcl
resource "kubernetes_stateful_set" "my_service" {
  count = var.my_service_enabled ? 1 : 0

  metadata {
    name      = "my-service"
    namespace = var.namespace
  }

  spec {
    replicas     = 1
    service_name = "my-service"

    selector {
      match_labels = { app = "my-service" }
    }

    template {
      metadata {
        labels = { app = "my-service" }
      }

      spec {
        # Container 1: NGINX TLS
        container {
          name  = "nginx-tls"
          image = "nginx:alpine"
          port { container_port = 443 }

          volume_mount {
            name       = "tls-certs"
            mount_path = "/etc/nginx/certs"
          }
          volume_mount {
            name       = "nginx-conf"
            mount_path = "/etc/nginx/conf.d"
          }
        }

        # Container 2: Application
        container {
          name  = "my-service"
          image = "myapp:latest"
          port { container_port = 8080 }

          volume_mount {
            name       = "app-data"
            mount_path = "/app/data"
          }
        }

        # Container 3: Tailscale (if enabled)
        dynamic "container" {
          for_each = var.my_service_tailscale_enabled ? [1] : []
          content {
            name  = "tailscale"
            image = "tailscale/tailscale:latest"

            security_context {
              capabilities { add = ["NET_ADMIN"] }
            }

            env {
              name  = "TS_KUBE_SECRET"
              value = "tailscale"
            }
            env {
              name  = "TS_HOSTNAME"
              value = "my-service.example.com"
            }

            volume_mount {
              name       = "tailscale-state"
              mount_path = "/var/lib/tailscale"
            }
          }
        }

        # Volumes
        volume {
          name = "tls-certs"
          secret { secret_name = "wildcard-tls" }
        }
        volume {
          name = "nginx-conf"
          config_map { name = "my-service-nginx" }
        }
        volume {
          name = "tailscale-state"
          empty_dir {}
        }
      }
    }

    # Persistent storage
    volume_claim_template {
      metadata { name = "app-data" }
      spec {
        access_modes = ["ReadWriteOnce"]
        resources {
          requests = { storage = "10Gi" }
        }
      }
    }
  }
}
```

Check pod and container status:

```bash
# Get pod status
kubectl get pods -n dev my-service-0

# Check all containers
kubectl get pod my-service-0 -n dev -o jsonpath='{.status.containerStatuses[*].name}'

# Check if all containers are ready
kubectl get pod my-service-0 -n dev -o jsonpath='{.status.containerStatuses[*].ready}'
```

View logs per container:

```bash
# nginx-tls logs
kubectl logs -n dev my-service-0 -c nginx-tls

# Application logs
kubectl logs -n dev my-service-0 -c my-service

# Tailscale logs
kubectl logs -n dev my-service-0 -c tailscale

# Follow logs
kubectl logs -n dev my-service-0 -c my-service -f
```

Exec into containers:

```bash
# Shell into application container
kubectl exec -n dev my-service-0 -c my-service -it -- /bin/sh

# Test nginx config
kubectl exec -n dev my-service-0 -c nginx-tls -- nginx -t

# Check Tailscale status
kubectl exec -n dev my-service-0 -c tailscale -- tailscale status
```

Container won't start:

```bash
kubectl describe pod my-service-0 -n dev
# Look for ImagePullBackOff, CrashLoopBackOff, etc.
```

Can't reach service:

```bash
# Check if Tailscale is connected
kubectl exec -n dev my-service-0 -c tailscale -- tailscale status

# Check nginx is listening
kubectl exec -n dev my-service-0 -c nginx-tls -- netstat -tlnp

# Test localhost connection
kubectl exec -n dev my-service-0 -c nginx-tls -- curl http://localhost:8080
```

- Dependency Management - Terraform dependency patterns
- VPN Enrollment - Connect clients to VPN
- Adding Services - Deploy new services
- Troubleshooting - Common issues
Navigation: Documentation Index | Home