This guide covers deploying dstack on bare metal TDX hosts.
dstack can be deployed in two ways:
- Dev Deployment: All components run directly on the host. For local development and testing only - no security guarantees.
- Production Deployment: KMS and Gateway run as CVMs with hardware-rooted security. Uses auth server for authorization and OS image whitelisting. Required for any deployment where security matters.
Hardware:
- Bare metal TDX server (setup guide)
- At least 16GB RAM, 100GB free disk space
- Public IPv4 address
- Optional: NVIDIA H100 or Blackwell GPU for Confidential Computing workloads
Network:
- Domain with DNS access (for Gateway TLS)
Note: See Hardware Requirements for server recommendations.
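As a quick sanity check before installing anything, the hardware minimums above can be verified from a shell. A minimal preflight sketch, assuming a standard Linux host with GNU coreutils (adjust the filesystem path if dstack data will live elsewhere):

```bash
# Preflight: check RAM and free disk against the minimums above
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df --output=avail -BG / | tail -1 | tr -dc '0-9')
echo "RAM: ${mem_gb} GiB (need >= 16)"
echo "Free disk on /: ${disk_gb} GiB (need >= 100)"
# Public IP check (requires outbound network access)
curl -s ifconfig.me || echo "WARN: could not determine public IP"
```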
This approach runs all components directly on the host for local development and testing.
Warning: Dev deployment uses KMS in dev mode with no security guarantees. Do NOT use for production.
```bash
# Ubuntu 24.04
sudo apt install -y build-essential chrpath diffstat lz4 wireguard-tools xorriso

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

Clone the repository and generate the host configuration:

```bash
git clone https://github.com/Dstack-TEE/meta-dstack.git --recursive
cd meta-dstack/
mkdir build && cd build
../build.sh hostcfg
```

Edit the generated build-config.sh for your environment. The minimal required changes are:
| Variable | Description |
|---|---|
| `KMS_DOMAIN` | DNS domain for KMS RPC (e.g., kms.ovh-tdx-dev.iex.ec) |
| `GATEWAY_DOMAIN` | DNS domain for Gateway RPC (e.g., gateway.ovh-tdx-dev.iex.ec) |
| `GATEWAY_PUBLIC_DOMAIN` | Public base domain for app routing (e.g., apps.ovh-tdx-dev.iex.ec) |
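Before the first build it is worth confirming that the DNS records for these domains exist. A small check loop; the base domain below is the example value from the table and is an assumption — substitute your own:

```bash
# Check that the KMS/Gateway/app domains resolve (example base domain assumed)
BASE_DOMAIN=ovh-tdx-dev.iex.ec
for fqdn in kms.${BASE_DOMAIN} gateway.${BASE_DOMAIN} apps.${BASE_DOMAIN}; do
  echo "checking ${fqdn}"
  getent hosts "${fqdn}" >/dev/null || echo "WARN: ${fqdn} does not resolve yet"
done
```

Note that app routing typically relies on a wildcard record (`*.apps.<base>`), which a lookup of the bare base name does not cover.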
TLS Certificates:
The Gateway requires TLS certificates. Configure Certbot with Cloudflare:
```bash
CERTBOT_ENABLED=true
CF_API_TOKEN=<your-cloudflare-token>
```

The certificates will be obtained automatically via an ACME DNS-01 challenge. The KMS auto-generates its own certificates during bootstrap.
Other variables like ports and CID pool settings have sensible defaults.
```bash
vim ./build-config.sh
../build.sh hostcfg
../build.sh dl 0.5.5
```

The generated gateway.toml needs to be modified to skip attestation (since there's no guest agent in dev mode). Add the debug section with `key_file = ""` to skip debug certificate generation:
```bash
# Add to gateway.toml under [core] section:
cat >> gateway.toml << 'EOF'
[core.debug]
insecure_skip_attestation = true
key_file = ""
EOF
```

Note: Setting `key_file = ""` is required because the default config includes `key_file = "debug_key.json"`, which would cause the Gateway to fail if the file doesn't exist.
If you need to access KMS and Gateway from external machines (not just localhost), update the address bindings in the generated config files. The default build.sh generates configs with 127.0.0.1, which only allows local access.
For KMS (kms.toml):

```toml
# Edit kms.toml and change:
[rpc]
address = "0.0.0.0"  # Change from "127.0.0.1" to allow external access

[core.onboard]
address = "0.0.0.0"  # Change from "127.0.0.1" to allow external access
```

For Gateway (gateway.toml):

```toml
# Edit gateway.toml and change:
address = "0.0.0.0"  # Change from "127.0.0.1" to allow external access
```

Note: Binding to `0.0.0.0` allows access from any network interface. For production, consider using firewall rules to restrict access. The proxy `listen_addr` is already set to `0.0.0.0` by default.
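If the RPC ports must stay reachable from outside, the firewall restriction mentioned above can be sketched as iptables rules. This snippet only prints the rules for review; the ports and trusted subnet are assumptions — verify them against your configs, then apply with sudo:

```bash
# Print (not apply) iptables rules restricting the KMS/Gateway RPC ports
TRUSTED_SUBNET=203.0.113.0/24   # assumption: your admin network
for port in 9201 9202; do       # assumption: KMS and Gateway RPC ports from this guide
  echo "iptables -A INPUT -p tcp --dport ${port} ! -s ${TRUSTED_SUBNET} -j DROP"
done
```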
The certificate CN must match the rpc_domain configured in gateway.toml:
```bash
# Get the rpc_domain from gateway.toml (e.g., gateway.ovh-tdx-dev.iex.ec)
RPC_DOMAIN=$(grep "^rpc_domain" gateway.toml | cut -d'"' -f2)
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout certs/gateway-rpc.key \
  -out certs/gateway-rpc.cert \
  -subj "/CN=${RPC_DOMAIN}"
```
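To confirm that the `-subj` flag yields the CN the Gateway expects, inspect a certificate's subject with `openssl x509`. A throwaway example (the domain and /tmp paths are illustrative):

```bash
# Generate a throwaway cert and print its subject to verify the CN format
RPC_DOMAIN=gateway.ovh-tdx-dev.iex.ec   # assumption: example domain from this guide
openssl req -x509 -nodes -days 1 -newkey rsa:2048 \
  -keyout /tmp/rpc-test.key -out /tmp/rpc-test.cert \
  -subj "/CN=${RPC_DOMAIN}" 2>/dev/null
openssl x509 -in /tmp/rpc-test.cert -noout -subject
```

Run the same `openssl x509 -noout -subject` check against certs/gateway-rpc.cert to verify the real certificate.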
The Gateway CA is the KMS root CA, which is created automatically when the KMS first runs. Start KMS first (it generates certs/root-ca.crt):

```bash
./dstack-kms -c kms.toml
```

Copy the KMS CA cert for the Gateway (in another terminal):

```bash
cp certs/root-ca.crt certs/gateway-ca.cert
```

Then start the Gateway and VMM:

```bash
sudo ./dstack-gateway -c gateway.toml
sudo ./dstack-vmm -c vmm.toml
```

Note: This deployment uses KMS in dev mode without an auth server. For production deployments with proper security, see Production Deployment below.
| Service | URL | Notes |
|---|---|---|
| VMM Dashboard | http://localhost:<VMM_PORT> | Web UI for deploying CVMs |
| KMS | https://localhost:<KMS_PORT> | Use `-k` with curl (self-signed) |
| Gateway | https://localhost:<GATEWAY_PORT> | Use `-k` with curl (self-signed) |
Note: Browsers will show certificate warnings for self-signed certs. Click "Advanced" → "Proceed" to continue.
Once your application is deployed in a CVM, you can access it through the dstack Gateway using the following URL format:
https://<app_id>[-<port>][<suffix>].<base_domain>:<gateway_port>
URL Format Components:
- `<app_id>`: The application identifier (a hexadecimal hash, e.g., `ebe1e087afff39e018e17a0a42f0be8622390782`)
- `<port>`: Optional port number (defaults to 80 for HTTP, 443 for HTTPS)
- `<suffix>`: Optional suffix flags:
  - `s`: Enable TLS passthrough (the proxy passes encrypted traffic directly to the backend)
  - `g`: Enable HTTP/2 (gRPC) support
- `<base_domain>`: The base domain configured for the Gateway (e.g., `apps.ovh-tdx-dev.iex.ec`)
- `<gateway_port>`: The TLS proxy port where the Gateway listens for application traffic, configured in gateway.toml under `[core.proxy] listen_port` (e.g., `13644` in dev mode, `443` in production). This is different from the Gateway RPC port (used for inter-service communication) and the WireGuard port (used for VPN tunnels).
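The URL scheme above can be expressed as a tiny shell helper, which may be handy in scripts. The function name `app_url` and its argument order are made up for this sketch:

```bash
# Build a Gateway app URL: app_url <app_id> <base_domain> <gateway_port> [port] [suffix]
app_url() {
  local app_id=$1 base=$2 gw_port=$3 port=${4:-} suffix=${5:-}
  local host="${app_id}"
  if [ -n "${port}" ]; then
    host="${host}-${port}"
  fi
  host="${host}${suffix}"
  echo "https://${host}.${base}:${gw_port}/"
}

# TLS passthrough on port 443 (matches the passthrough example in this guide)
app_url ebe1e087afff39e018e17a0a42f0be8622390782 apps.ovh-tdx-dev.iex.ec 13644 443 s
# → https://ebe1e087afff39e018e17a0a42f0be8622390782-443s.apps.ovh-tdx-dev.iex.ec:13644/
```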
Examples:
```bash
# Basic HTTP access (port 80)
curl -k https://ebe1e087afff39e018e17a0a42f0be8622390782.apps.ovh-tdx-dev.iex.ec:13644/

# Custom port (e.g., 8080)
curl -k https://ebe1e087afff39e018e17a0a42f0be8622390782-8080.apps.ovh-tdx-dev.iex.ec:13644/

# TLS passthrough
curl -k https://ebe1e087afff39e018e17a0a42f0be8622390782-443s.apps.ovh-tdx-dev.iex.ec:13644/

# HTTP/2 (gRPC)
curl -k https://ebe1e087afff39e018e17a0a42f0be8622390782-8080g.apps.ovh-tdx-dev.iex.ec:13644/
```

Finding Your App ID:
The app ID is generated when you deploy your application. You can find it:
- In the VMM dashboard after deployment
- In the Gateway logs when the app registers: `grep RegisterCvm gateway.log`
- Via the VMM API: `curl http://localhost:<vmm_port>/vm`
Notes:
- Use the `-k` flag with curl to skip certificate verification (required for self-signed certificates in dev mode)
- The Gateway automatically routes traffic to the correct CVM instance based on the app ID
- Multiple instances of the same app are load-balanced automatically
For more details, see the Usage Guide.
For production, deploy KMS and Gateway as CVMs with hardware-rooted security. Production deployments require:
- KMS running in a CVM (not on the host)
- Auth server for authorization (webhook mode)
Required:
- Set up TDX host with dstack-vmm
- Deploy KMS as CVM (with auth server)
- Deploy Gateway as CVM
Optional Add-ons:
- Zero Trust HTTPS
- Certificate Transparency monitoring
- Multi-node deployment
- On-chain governance - Smart contract-based authorization
Clone and build dstack-vmm:
```bash
git clone https://github.com/Dstack-TEE/dstack
cd dstack
cargo build --release -p dstack-vmm -p supervisor
mkdir -p vmm-data
cp target/release/dstack-vmm vmm-data/
cp target/release/supervisor vmm-data/
cd vmm-data/
```

Create vmm.toml:
```toml
address = "tcp:0.0.0.0:9080"
reuse = true
image_path = "./images"
run_path = "./run/vm"

[cvm]
kms_urls = []
gateway_urls = []
cid_start = 30000
cid_pool_size = 1000

[cvm.port_mapping]
enabled = true
address = "127.0.0.1"
range = [
    { protocol = "tcp", from = 1, to = 20000 },
    { protocol = "udp", from = 1, to = 20000 },
]

[host_api]
address = "vsock:2"
port = 10000

[key_provider]
enabled = true
address = "127.0.0.1"
port = 3443
```

Download guest images from meta-dstack releases and extract them to ./images/.
For reproducible builds and verification, see the Security Model.
Start VMM:

```bash
./dstack-vmm -c vmm.toml
```

Production KMS requires:
- KMS: The key management service inside a CVM
- Auth server: Webhook server that validates boot requests and returns authorization decisions
| Server | Use Case | Configuration |
|---|---|---|
| auth-simple | Config-file-based whitelisting | JSON config file |
| auth-eth | On-chain governance via smart contracts | Ethereum RPC + contract |
| Custom | Your own authorization logic | Implement webhook interface |
All auth servers implement the same webhook interface:
- `GET /` - Health check
- `POST /bootAuth/app` - App boot authorization
- `POST /bootAuth/kms` - KMS boot authorization
auth-simple validates boot requests against a JSON config file.
Create auth-config.json for the initial KMS deployment:

```json
{
  "osImages": ["0x<os-image-hash>"],
  "kms": { "allowAnyDevice": true },
  "apps": {}
}
```

Run auth-simple:

```bash
cd kms/auth-simple
bun install
PORT=3001 AUTH_CONFIG_PATH=$PWD/auth-config.json bun run start
```

For adding the Gateway, apps, and other config fields, see the auth-simple Operations Guide.
For decentralized governance via smart contracts, see On-Chain Governance.
The OS image hash is in the digest.txt file inside the guest image tarball:
```bash
# Extract hash from release tarball
tar -xzf dstack-0.5.5.tar.gz
cat dstack-0.5.5/digest.txt
# Output: 0b327bcd642788b0517de3ff46d31ebd3847b6c64ea40bacde268bb9f1c8ec83
```

Add the 0x prefix for the auth-simple config: 0x0b327bcd...
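The prefixing can be scripted when generating the auth config. A sketch using the example digest from above (in practice the hash would come from `cat dstack-0.5.5/digest.txt`):

```bash
# Example digest value; normally: OS_IMAGE_HASH=$(cat dstack-0.5.5/digest.txt)
OS_IMAGE_HASH=0b327bcd642788b0517de3ff46d31ebd3847b6c64ea40bacde268bb9f1c8ec83
echo "{ \"osImages\": [\"0x${OS_IMAGE_HASH}\"] }"
```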
Important: The KMS CVM uses a local SGX Key Provider to obtain its sealing keys. This service must be running before deploying KMS.
The Key Provider is an SGX enclave that:
- Derives sealing keys from SGX hardware measurements
- Provides keys to CVMs after validating their TDX quotes
- Runs on port 3443
Start the Key Provider:

```bash
cd dstack/key-provider-build/
docker compose up -d
```

Verify it's running:

```bash
docker ps | grep key-provider
# Should show: gramine-sealing-key-provider
```

Note: The Key Provider requires SGX hardware (`/dev/sgx_enclave`, `/dev/sgx_provision`). Verify SGX is available with `ls /dev/sgx*`.
Choose the deployment script based on your auth server:
For auth-simple (external webhook):
auth-simple runs on your infrastructure, outside the CVM.
```bash
cd dstack/kms/dstack-app/
```

Edit .env.simple:
```bash
VMM_RPC=http://127.0.0.1:9080
AUTH_WEBHOOK_URL=http://10.0.2.2:3001  # Auth server address (the QEMU gateway address, since user-mode (slirp) networking is used)
KMS_RPC_ADDR=0.0.0.0:9201
GUEST_AGENT_ADDR=127.0.0.1:9205
OS_IMAGE=dstack-0.5.6
IMAGE_DOWNLOAD_URL=https://download.dstack.org/os-images/mr_{OS_IMAGE_HASH}.tar.gz
VERIFY_IMAGE=true
KMS_IMAGE=dstacktee/dstack-kms@sha256:6f8ae87eb685679bf77844b38fdda867dd591a6470543574b92ab9b71bf4c849
```

Then run:

```bash
./deploy-simple.sh
```

For auth-eth (on-chain governance):
See On-Chain Governance Guide for deploying KMS with smart contract-based authorization.
Monitor startup:
```bash
tail -f ../../vmm-data/run/vm/<vm-id>/serial.log
```

Wait for `[ OK ] Finished App Compose Service.`

Open http://127.0.0.1:9201/ in your browser:

- Click Bootstrap
- Enter the domain for your KMS (e.g., `kms.ovh-tdx-dev.iex.ec`)
- Click Finish setup
The KMS will display its public key and TDX quote.
Before deploying Gateway:
- Register the Gateway app in your auth server config (add it to the `apps` section in auth-config.json)
- Note the App ID you assign - you'll need it for the `.env` file
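Assuming a per-app entry keyed by App ID (this shape is an assumption — the auth-simple Operations Guide is authoritative for the schema), the registration in auth-config.json might look like:

```json
{
  "osImages": ["0x<os-image-hash>"],
  "kms": { "allowAnyDevice": true },
  "apps": {
    "0x<gateway-app-id>": {
      "composeHashes": []
    }
  }
}
```

The `composeHashes` array is filled in later, once the deploy script prints the compose hash.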
For on-chain governance, see On-Chain Governance for registration steps.
```bash
cd dstack/gateway/dstack-app/
./deploy-to-vmm.sh
```

Edit .env with required variables:
```bash
# VMM connection (use TCP if VMM is on the same host, or a remote URL)
VMM_RPC=http://127.0.0.1:9080

# Optional: Cloudflare API token for Let's Encrypt DNS-01 challenge
# If not set, the Gateway will use self-signed certificates
# CF_API_TOKEN=your_cloudflare_api_token

# Domain configuration
SRV_DOMAIN=ovh-tdx-dev.iex.ec
PUBLIC_IP=$(curl -s ifconfig.me)

# Gateway app ID (from registration above)
GATEWAY_APP_ID=32467b43BFa67273FC7dDda0999Ee9A12F2AaA08

# KMS URL (the KMS must be running and accessible)
KMS_URL=https://127.0.0.1:9201

# Gateway URLs
MY_URL=https://gateway.example.com:9202
BOOTNODE_URL=https://gateway.example.com:9202

# WireGuard (uses the same port as RPC)
WG_ADDR=0.0.0.0:9202

# Network settings
SUBNET_INDEX=0
ACME_STAGING=no  # Set to 'yes' for testing
OS_IMAGE=dstack-0.5.5
```

Note on hex formats:

- Gateway `.env` file: Use raw hex without the `0x` prefix (e.g., `GATEWAY_APP_ID=32467b43...`)
- auth-simple config: Use the `0x` prefix (e.g., `"0x32467b43..."`). The server normalizes both formats.
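Since the two formats differ only by the prefix, one value can be derived from the other in shell (the variable names here are illustrative):

```bash
# Derive both forms of the app ID from one canonical value
raw="0x32467b43BFa67273FC7dDda0999Ee9A12F2AaA08"
env_form="${raw#0x}"       # for the Gateway .env (no prefix)
cfg_form="0x${env_form}"   # for the auth-simple config (with prefix)
echo "GATEWAY_APP_ID=${env_form}"
echo "auth config key: \"${cfg_form}\""
```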
Run the script again:

```bash
./deploy-to-vmm.sh
```

The script will display the compose file and compose hash, then prompt for confirmation:

```
Docker compose file:
...
Compose hash: 0x700a50336df7c07c82457b116e144f526c29f6d8...
Configuration:
...
Continue? [y/N]
```
Before pressing 'y', add the compose hash to your auth server whitelist:
- For auth-simple: Add it to the `composeHashes` array in auth-config.json
- For auth-eth: Use `app:add-hash` (see On-Chain Governance)
Then return to the first terminal and press 'y' to deploy.
After the Gateway is running, update vmm.toml with the KMS and Gateway URLs:

```toml
[cvm]
kms_urls = ["https://kms.ovh-tdx-dev.iex.ec:9201"]
gateway_urls = ["https://gateway.ovh-tdx-dev.iex.ec:9202"]
```

Restart dstack-vmm to apply the changes.
Generate TLS certificates inside the TEE with automatic CAA record management.
Configure in build-config.sh:

```bash
GATEWAY_CERT=${CERTBOT_WORKDIR}/live/cert.pem
GATEWAY_KEY=${CERTBOT_WORKDIR}/live/key.pem
CF_API_TOKEN=<your-cloudflare-token>
ACME_URL=https://acme-v02.api.letsencrypt.org/directory
```

Run certbot:

```bash
RUST_LOG=info,certbot=debug ./certbot renew -c certbot.toml
```

This will:
- Create an ACME account
- Set CAA DNS records on Cloudflare
- Request and auto-renew certificates
Monitor for unauthorized certificates issued to your domain.
```bash
cargo build --release -p ct_monitor
./target/release/ct_monitor \
    --gateway-uri https://<gateway-domain> \
    --domain <your-domain>
```

How it works:

- Fetches known public keys from the Gateway (`/acme-info` endpoint)
- Queries crt.sh for certificates issued to your domain
- Verifies each certificate's public key matches the known keys
- Logs errors (❌) when certificates are issued to unknown public keys
The monitor runs in a loop, checking every 60 seconds. Integrate with your alerting system by monitoring stderr for error messages.
Scale by adding VMM nodes and KMS replicas for high availability.
On each additional TDX host:
- Set up dstack-vmm (see step 1)
- Configure `vmm.toml` with the existing KMS/Gateway URLs
- Start VMM
```toml
[cvm]
kms_urls = ["https://kms.example.com:9201"]
gateway_urls = ["https://gateway.example.com:9202"]
```

Additional KMS instances can onboard from an existing KMS to share the same root keys. This enables:
- High availability (multiple KMS nodes)
- Geographic distribution
- Load balancing
How it works:
- New KMS starts in onboard mode (empty `auto_bootstrap_domain`)
- New KMS calls `GetTempCaCert` on the source KMS
- New KMS generates an RA-TLS certificate with a TDX quote
- New KMS calls `GetKmsKey` with mTLS authentication
- Source KMS verifies attestation via the `bootAuth/kms` webhook
- If approved, the source KMS returns the root keys
- Both KMS instances now derive identical keys
Configure new KMS for onboarding:
```toml
[core.onboard]
enabled = true
auto_bootstrap_domain = ""  # Empty = onboard mode
quote_enabled = true        # Require TDX attestation
address = "0.0.0.0"
port = 9203                 # HTTP port for onboard UI
```

Trigger onboarding via the API:

```bash
curl -X POST http://<new-kms>:9203/prpc/Onboard.Onboard?json \
    -H "Content-Type: application/json" \
    -d '{"source_url": "https://<existing-kms>:9201/prpc", "domain": "kms2.example.com"}'
```

Finish and restart:
```bash
curl http://<new-kms>:9203/finish
# Restart KMS - it will now serve as a full KMS with shared keys
```

Note: For KMS onboarding with `quote_enabled = true`, add the KMS mrAggregated hash to your auth server's `kms.mrAggregated` whitelist.
After setup, deploy apps via the VMM dashboard or CLI.
Before deploying, register your app in your auth server:
- For auth-simple: See auth-simple Operations Guide
- For auth-eth: See On-Chain Governance
Open http://localhost:9080:
- Select the OS image
- Enter the App ID (from registration above)
- Upload your `docker-compose.yaml`
After startup, click Dashboard to view the app's status.
The CID range conflicts with existing VMs.
- Find used CIDs: `ps aux | grep 'guest-cid='`
- Update vmm.toml:

```toml
[cvm]
cid_start = 33000
cid_pool_size = 1000
```
When running Gateway with many concurrent connections (>100K), the host's conntrack table may fill up, causing silent packet drops:
dmesg: nf_conntrack: table full, dropping packet
Each proxied connection creates multiple conntrack entries (client→gateway, gateway→WireGuard→backend). The default nf_conntrack_max (typically 262,144) is insufficient for high-concurrency gateways.
Fix:
```bash
# Check the current limit
sysctl net.netfilter.nf_conntrack_max

# Increase for production (persistent)
echo "net.netfilter.nf_conntrack_max = 1048576" >> /etc/sysctl.d/99-dstack.conf
echo "net.netfilter.nf_conntrack_buckets = 262144" >> /etc/sysctl.d/99-dstack.conf
sysctl -p /etc/sysctl.d/99-dstack.conf
```

Also increase the limit inside bridge-mode CVMs if they handle many connections:

```bash
sysctl -w net.netfilter.nf_conntrack_max=524288
```

Sizing rule of thumb: Set nf_conntrack_max to at least 4× your target concurrent connection count (each connection may use 2-3 conntrack entries across NAT/bridge layers).
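The rule of thumb translates directly into arithmetic. A sketch for a hypothetical target of 250,000 concurrent connections (the target value is a placeholder; the 4:1 max-to-buckets ratio mirrors the values used above):

```bash
# Apply the 4x sizing rule of thumb to a target connection count
TARGET_CONNS=250000                 # assumption: your expected peak
NEEDED=$((TARGET_CONNS * 4))
echo "net.netfilter.nf_conntrack_max should be >= ${NEEDED}"
echo "net.netfilter.nf_conntrack_buckets around $((NEEDED / 4))"
```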
Ubuntu 23.10+ restricts unprivileged user namespaces:
```bash
sudo sysctl kernel.apparmor_restrict_unprivileged_userns=0
```


