yoq is a single Linux binary for building, running, networking, and deploying containers without stitching together Docker, Compose, Kubernetes, Istio, Helm, and a pile of glue.
Most teams do not need a platform made of separate control planes, YAML layers, sidecars, and operators just to run a few services reliably. They need containers, service discovery, rollouts, secrets, TLS, metrics, and a sane deployment model that one engineer can actually understand end to end.
That is the point of yoq. It collapses the usual stack into one operational model, one CLI, one state store, and one binary you can ship to a Linux host. Instead of outsourcing core behavior to half a dozen daemons, it builds directly on Linux primitives like namespaces, cgroups v2, io_uring, eBPF, and WireGuard.
Linux kernel 6.1+ is required.
yoq is a strong fit for:
- small-to-medium teams that want production features without building a platform team first
- multi-service applications that outgrew Compose but do not need the ecosystem breadth of Kubernetes
- operators who prefer direct, inspectable systems over layered abstractions
- Linux environments where shipping one binary is operationally attractive
yoq takes a different approach from the standard stack. Instead of composing separate tools for images, runtime, orchestration, ingress, mesh, secrets, and observability, it keeps everything integrated with a small surface area. The tradeoff is a narrower ecosystem — you get one coherent system instead of a platform you can extend in every direction.
Kubernetes has a vast ecosystem and years of production hardening. yoq doesn't try to replace that. If you already depend on the full Kubernetes ecosystem surface, deep CRD-driven workflows, or broad vendor tooling built specifically around Kubernetes APIs, yoq is probably not the right fit.
- isolated containers with PID, NET, MNT, UTS, IPC, USER, and CGROUP namespaces
- cgroups v2 resource limits, overlayfs root filesystems, seccomp filters, and capability dropping
- process supervision, log capture, restart handling, and `exec` into running containers
- Dockerfile support for the major directives, including multi-stage builds and build args
- content-hash caching so unchanged build steps are not re-executed
- optional TOML build manifest format
- declarative multi-service manifests
- dependency ordering, workers, cron jobs, and dev mode with hot restart
- health checks, readiness probes, rollout history, rollback, and automatic rollback on failed updates
- per-container IPs on a bridge network
- built-in DNS-based service discovery
- port mapping, outbound NAT, and eBPF-based load balancing and policy enforcement where available
- WireGuard-based cluster networking for multi-node deployments
- HTTP routing for HTTP/1.1, prior-knowledge HTTP/2 (h2c), and TLS-terminated HTTP/2 via ALPN when a routed host is also bound to `service.<name>.tls.domain`

Current gRPC routing limits:
- direct listener traffic still uses prior-knowledge h2c; TLS/ALPN HTTP/2 routing works through the TLS terminator when the routed host matches a service `tls.domain`
- one accepted client connection stays pinned to the first matched routed service
- `request_timeout_ms` currently acts as the idle timeout for routed HTTP/2 connections
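As an illustration, a routed, TLS-terminated service might look like the manifest fragment below. Only the `service.<name>.tls.domain` binding is confirmed by this README; the `[service.api.route]` table and its `host` key are hypothetical placeholders, so check the manifest reference for the real routing keys.

```toml
[service.api]
image = "my-api:latest"

# Grounded in the text above: binding a domain under
# service.<name>.tls.domain enables TLS termination, and ALPN can then
# negotiate HTTP/2 for the routed host.
[service.api.tls]
domain = "api.example.com"

# Hypothetical routing table -- the actual key names are not shown in
# this README; consult the manifest docs.
[service.api.route]
host = "api.example.com"
```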
- encrypted secrets store with rotation
- TLS termination with ACME provisioning and renewal
- service and pairwise network metrics
- policy controls between services
- status and resource reporting
Current ACME/TLS limits:
- HTTP-01 challenge validation only
- port 80 on the target host must be reachable during provision and renewal
- the standalone CLI flow currently requires `--email` for both `yoq cert provision` and `yoq cert renew`
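Given those limits, a typical flow looks like the sketch below. The domain and email are placeholders; port 80 on this host must be reachable from the internet before provisioning.

```sh
# Dry-run against the ACME staging environment first (HTTP-01 only).
yoq cert provision example.com --email ops@example.com --staging

# Then provision and later renew for real; --email is currently
# required for both commands.
yoq cert provision example.com --email ops@example.com
yoq cert renew example.com --email ops@example.com
```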
- GPU detection and passthrough into containers
- gang scheduling for distributed training workloads
- NCCL mesh configuration and InfiniBand/RDMA support
- MIG partitioning and MPS sharing
- training job orchestration with checkpoints, fault tolerance, and data sharding
- S3-compatible object storage gateway
- volume drivers: local, host, NFS, parallel filesystem
- raft-based server nodes with SQLite-backed state replication
- SWIM gossip protocol for scalable failure detection
- role separation: server nodes (raft + API + scheduler) vs agent nodes (gossip + workloads)
- HMAC-SHA256 authenticated cluster transport
- agent registration, heartbeats, placement, drain, and cluster status
- rolling upgrades with leader step-down
- remote operations via `--server host:port`
- `yoq doctor` pre-flight system checks (kernel, cgroups, eBPF, GPU, WireGuard, InfiniBand, disk)
- `yoq backup` / `yoq restore` for SQLite state
- threshold-based alerts (CPU, memory, restart count, p99 latency, error rate) with webhook notifications
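Putting the clustering pieces together, a minimal multi-node bootstrap might look like this sketch. The host names, port, and token value are placeholders; the commands themselves are documented in the CLI reference below.

```sh
# On the first server node (raft + API + scheduler).
yoq init-server --id 1 --token s3cret

# On each agent node, join with the shared cluster token.
yoq join server1.internal --token s3cret

# Inspect cluster health and placement, locally or remotely.
yoq cluster status
yoq nodes --server server1.internal:7777

# Drain an agent before maintenance.
yoq drain <node-id> --server server1.internal:7777
```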
- Linux kernel 6.1+ (Linux only; no macOS support)
- Zig 0.15.2
Build with `make build`. For GPU-focused validation without running the full suite, use `zig build test-gpu`. For a real-host smoke checklist, see docs/gpu-validation.md.
For a temporary 5-node GCP validation rig that exercises cluster networking and GPU hosts, see docs/gcp-cluster-validation.md.
For the canonical operator evaluation flow across local runtime, HTTP routing, and clustered deployment, see docs/golden-path.md.
For cluster bootstrap, day-2 operations, and failure drills, see docs/cluster-guide.md.
```sh
curl -fsSL https://yoq.dev/install | bash
```

Run a container:

```sh
yoq run alpine:latest echo "hello from yoq"
yoq ps
yoq logs <id-or-name>
```

Define services in a manifest:

```toml
[service.redis]
image = "redis:7"
ports = ["6379:6379"]

[service.web]
image = "nginx:latest"
ports = ["8080:80"]
depends_on = ["redis"]
```

Bring them up and down:

```sh
yoq up -f manifest.toml
yoq status
yoq down -f manifest.toml
```

yoq run <image|rootfs> [command] run a container
yoq ps [--json] list containers
yoq stop <id|name> stop a container
yoq rm <id|name> remove a stopped container
yoq logs <id|name> [--tail N] show container output
yoq restart <id|name> restart a container
yoq exec <id|name> <cmd> [args...] run a command in a container
yoq pull <image> pull from a registry
yoq push <source> [target] push to a registry
yoq images [--json] list local images
yoq inspect <image> show image metadata
yoq rmi <image> remove an image
yoq prune [--json] delete unreferenced blobs and layers
yoq build [-t tag] [-f Dockerfile] . build an image
[--format toml] build from a TOML manifest
yoq up [-f manifest.toml] start services from a manifest
yoq up [service...] start named services and dependencies
yoq up --dev watch and hot-restart on changes
yoq up --server host:port deploy to a cluster
yoq down [-f manifest.toml] stop services from a manifest
yoq run-worker <name> run a one-shot worker
yoq init [-f path] scaffold a manifest
yoq validate [-f manifest.toml] [-q] validate a manifest
yoq rollback <service> roll back a deployment
yoq history <service> show deployment history
yoq status [--verbose] show service status and resources
yoq metrics [service] show service metrics
yoq metrics --pairs show service-to-service metrics
yoq policy deny <src> <tgt> block traffic between services
yoq policy allow <src> <tgt> allow traffic between services
yoq policy rm <src> <tgt> remove a policy rule
yoq policy list list policy rules
yoq secret set <name> <value> store a secret
yoq secret get <name> read a secret
yoq secret rm <name> delete a secret
yoq secret list list secrets
yoq secret rotate <name> rotate a secret
yoq cert provision <domain> --email <email> [--staging]
provision a TLS certificate via ACME
yoq cert renew <domain> --email <email> [--staging]
renew a TLS certificate via ACME
yoq cert install <domain> --cert <path> --key <path>
yoq cert list list certificates
yoq cert rm <domain> remove a certificate
yoq serve [--port PORT] [--http-proxy-bind ADDR] [--http-proxy-port PORT]
start the API server
yoq init-server [--id N] [--port P] start a cluster server node
[--api-port P] [--peers ...]
[--token TOKEN] [--http-proxy-bind ADDR]
[--http-proxy-port PORT]
yoq join <host> --token <token> join as an agent node
yoq cluster status show cluster health
yoq nodes [--server host:port] list agent nodes
yoq drain <id> [--server host:port] drain an agent node
yoq gpu topo [--json] show GPU topology
yoq gpu bench [--gpus N] GPU-to-GPU bandwidth benchmark
[--size BYTES] [--iterations N]
yoq train start <name> start a training job
yoq train status <name> show training job status
yoq train stop <name> stop a training job
yoq train pause <name> pause a training job
yoq train resume <name> resume a paused job
yoq train scale <name> scale training ranks
yoq train logs <name> [--rank N] show logs for a training rank
yoq doctor [--json] check system readiness
yoq backup [--output path] backup database state
yoq restore <path> restore database from backup
yoq version [--json] print version
yoq help show help
yoq completion <bash|zsh|fish> output shell completion
Notes:
- `--json` is available on `ps`, `images`, `prune`, `version`, `gpu topo`, and `doctor`.
- crons defined in the manifest start automatically with `yoq up`.
- deployment, metrics, and certificate commands also support `--server host:port`.
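For instance, a manifest-defined cron entry might be written as below. Only the `every = "1h"` interval syntax is confirmed by the bundled examples; the `[cron.cleanup]` table name and `command` key are assumptions, so check examples/cron/ for the real schema.

```toml
# Hypothetical cron entry -- starts automatically with `yoq up`.
# Table and key names are assumptions; only `every = "1h"` appears
# in the bundled examples.
[cron.cleanup]
command = "sh -c 'rm -rf /tmp/cache/*'"
every = "1h"
```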
~77K lines of Zig, ~1474 tests, v0.1.1. Coverage spans runtime, images, networking, build, manifests, clustering, GPU, training, storage, secrets, TLS, metrics, and alerting.
See docs/architecture.md for subsystem details and docs/users-guide.md for a guide to the internals.
yoq is organized as a set of integrated subsystems:
- `runtime/` — container lifecycle, namespaces, cgroups, filesystem, security, logs, exec
- `image/` — OCI registry, blob storage, layer extraction, metadata
- `network/` — bridge networking, DNS, NAT, WireGuard, eBPF, policy
- `build/` and `manifest/` — image builds, manifests, orchestration, health, updates, training, alerting
- `cluster/`, `api/`, and `state/` — replication, scheduling, remote control, persistent state, backup/restore
- `gpu/` — detection, passthrough, health, scheduling, InfiniBand/NCCL mesh
- `storage/` — S3-compatible object storage, volume management
- `tls/` and `lib/` — certificates, proxying, utilities, CLI, logging, doctor

See docs/architecture.md for the full breakdown.
The examples/ directory has ready-to-use manifests:
- `examples/redis/` for the simplest possible single-service setup
- `examples/web-app/` for a multi-service app with postgres, redis, workers, and health checks
- `examples/cron/` for scheduled jobs with `every = "1h"`
- `examples/http-routing/` for host-, path-, and header-based HTTP routing
- `examples/cluster/` for a minimal multi-node cluster flow
- docs/golden-path.md for the recommended end-to-end evaluation workflow

```sh
yoq up -f examples/redis/manifest.toml
```

Roadmap and non-goals:
- richer HTTP routing — broader ingress policy and more advanced traffic shaping
- hardening — continued stability, edge-case testing, and operational polish
- web UI remains intentionally deferred; the CLI is the primary interface
- image signing is not built in; use cosign externally