Welcome to the VMM CLI! This tool helps you manage CVMs in the dstack platform.
- Getting Started
- Basic Commands
- VM Management
- Application Deployment
- Security Features
- Advanced Usage
- Troubleshooting
Before using the VMM CLI, ensure you have:
- Python 3.6 or higher installed
- Access to a dstack-vmm server
- Required Python packages (cryptography, eth_keys, eth_utils)
The VMM CLI is a single Python script (vmm-cli.py) that you can run directly:
./vmm-cli.py --help
By default, the CLI connects to http://localhost:8080. You can configure the server URL in several ways:
Set the DSTACK_VMM_URL environment variable:
# Set for current shell session
export DSTACK_VMM_URL=http://your-server:8080
./vmm-cli.py lsvm
Override the environment variable or default with --url:
./vmm-cli.py --url http://your-server:8080 <command>
For local Unix socket connections:
# Via environment variable
export DSTACK_VMM_URL=unix:/path/to/socket
# Via command line
./vmm-cli.py --url unix:/path/to/socket <command>
Priority Order: Command line --url > DSTACK_VMM_URL environment variable > default http://localhost:8080
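The priority order can be sketched as a small shell helper (the `resolve_vmm_url` name is illustrative, not part of the CLI):

```shell
# Resolve the VMM server URL the same way the CLI does:
# --url flag > DSTACK_VMM_URL environment variable > built-in default.
resolve_vmm_url() {
    local cli_url="$1"   # value of --url, may be empty
    if [ -n "$cli_url" ]; then
        echo "$cli_url"
    elif [ -n "${DSTACK_VMM_URL:-}" ]; then
        echo "$DSTACK_VMM_URL"
    else
        echo "http://localhost:8080"
    fi
}
```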
If your dstack-vmm server requires authentication, you can provide credentials using:
# Set authentication credentials
export DSTACK_VMM_AUTH_USER=your-username
export DSTACK_VMM_AUTH_PASSWORD=your-password
# Then use CLI normally
./vmm-cli.py lsvm
./vmm-cli.py --auth-user your-username --auth-password your-password lsvm
Note: Environment variables take precedence over command line arguments for authentication.
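Note that this is the opposite of the URL priority: for authentication, the environment wins. A sketch of that rule (the `resolve_auth_user` helper is hypothetical, not a CLI feature):

```shell
# Pick the auth username as described above: DSTACK_VMM_AUTH_USER,
# when set, takes precedence over the --auth-user value.
resolve_auth_user() {
    local cli_user="$1"   # value of --auth-user, may be empty
    if [ -n "${DSTACK_VMM_AUTH_USER:-}" ]; then
        echo "$DSTACK_VMM_AUTH_USER"
    else
        echo "$cli_user"
    fi
}
```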
View all your VMs and their current status:
# Basic list
./vmm-cli.py lsvm
# Detailed view with configuration info
./vmm-cli.py lsvm -v
This shows VM ID, App ID, Name, Status, and Uptime. The verbose mode adds vCPU, Memory, Disk, Image, and GPU assignment information.
See what VM images you can deploy:
./vmm-cli.py lsimage
Check what GPU resources are available:
./vmm-cli.py lsgpu
This command shows available GPU slots, their descriptions, and availability status. GPU information is also displayed in the GPUs column when using ./vmm-cli.py lsvm -v.
When using the verbose list command (lsvm -v), the GPUs column will show:
- "All GPUs" - when the VM is configured with --ppcie mode (all available GPUs)
- "0a:00.0, 1a:00.0" - specific GPU slot assignments when using --gpu flags
- "-" - when no GPUs are assigned to the VM
Example output:
┌───────────┬────────┬──────────┬─────────┬────────┬──────┬────────┬───────┬──────────────┬──────────────────┐
│ VM ID     │ App ID │ Name     │ Status  │ Uptime │ vCPU │ Memory │ Disk  │ Image        │ GPUs             │
├───────────┼────────┼──────────┼─────────┼────────┼──────┼────────┼───────┼──────────────┼──────────────────┤
│ abc123... │ xyz789 │ ml-model │ running │ 2h 30m │ 8    │ 32GB   │ 500GB │ dstack-0.5.3 │ 18:00.0, 9a:00.0 │
│ def456... │ uvw012 │ web-app  │ running │ 1h 15m │ 2    │ 4GB    │ 50GB  │ dstack-0.5.3 │ -                │
│ ghi789... │ rst345 │ ai-train │ running │ 45m    │ 16   │ 64GB   │ 1TB   │ dstack-0.5.3 │ All GPUs         │
└───────────┴────────┴──────────┴─────────┴────────┴──────┴────────┴───────┴──────────────┴──────────────────┘
# Start a VM
./vmm-cli.py start <vm-id>
# Gracefully stop a VM
./vmm-cli.py stop <vm-id>
# Force stop a VM
./vmm-cli.py stop -f <vm-id>
Monitor your VM's output:
# Show last 20 lines of logs
./vmm-cli.py logs <vm-id>
# Show last 50 lines
./vmm-cli.py logs <vm-id> -n 50
# Follow logs in real-time (like tail -f)
./vmm-cli.py logs <vm-id> -f
Press Ctrl+C to stop following logs.
When you're done with a VM:
./vmm-cli.py remove <vm-id>
Deploying applications involves two main steps: creating an app compose file and deploying the VM.
First, create an application composition file that describes your application:
./vmm-cli.py compose \
--name "my-web-app" \
--docker-compose ./docker-compose.yml \
--output ./app-compose.json
- --name: Friendly name for your application
- --docker-compose: Path to your Docker Compose file
- --prelaunch-script: Optional script to run before starting containers
- --kms: Enable Key Management Service for secrets
- --gateway: Enable dstack-gateway for external access
- --local-key-provider: Use a local key provider
- --public-logs: Make logs publicly accessible
- --public-sysinfo: Make system info publicly accessible
- --env-file: File with environment variables to encrypt
- --no-instance-id: Disable unique instance identification
./vmm-cli.py compose \
--name "secure-app" \
--docker-compose ./docker-compose.yml \
--kms \
--gateway \
--env-file ./secrets.env \
--output ./app-compose.json
Deploy your application with the compose file:
./vmm-cli.py deploy \
--name "my-app-vm" \
--image "dstack-0.5.3" \
--compose ./app-compose.json \
--vcpu 2 \
--memory 2G \
--disk 50G
- --name: VM instance name
- --image: Base VM image to use
- --compose: Path to your app-compose.json file
- --vcpu: Number of virtual CPUs (default: 1)
- --memory: Memory size (e.g., 1G, 512M, 2048M)
- --disk: Disk size (e.g., 20G, 100G)
- --port: Port mappings (see Port Mapping section)
- --gpu: GPU assignments
- --ppcie: Enable PPCIE mode (attach ALL available GPUs and NVSwitches)
- --env-file: Environment variables file
- --user-config: Path to user config file (will be placed at /dstack/.host-shared/.user-config in the CVM)
- --kms-url: KMS server URL
- --gateway-url: Gateway server URL
- --stopped: Create VM in stopped state (requires dstack-vmm >= 0.5.4)
Expose services running in your VM:
# Format: protocol:host_port:vm_port
--port tcp:8080:80
# Format: protocol:host_address:host_port:vm_port
--port tcp:127.0.0.1:8080:80
# Multiple ports
--port tcp:8080:80 --port tcp:8443:443
The VMM CLI supports two GPU attachment modes:
Assign individual GPUs by their slot identifiers:
# Single GPU
--gpu "0a:00.0"
# Multiple specific GPUs
--gpu "0a:00.0" --gpu "1a:00.0" --gpu "2a:00.0"
You can find the slot identifiers by running ./vmm-cli.py lsgpu.
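Slot identifiers follow the PCI `bus:device.function` pattern seen in the examples above. A quick local sanity check might look like this (hypothetical helper for illustration, not part of the CLI):

```shell
# Check that a GPU slot identifier matches the bus:device.function
# form used in the examples (e.g. "0a:00.0" or "18:00.0").
valid_gpu_slot() {
    case "$1" in
        [0-9a-f][0-9a-f]:[0-9a-f][0-9a-f].[0-9]) return 0 ;;
        *) return 1 ;;
    esac
}
```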
To run the CVM in PPCIE mode, use the --ppcie flag. This will attach ALL available GPUs and NVSwitches to the CVM.
# Enable PPCIE (Protected PCIe) mode - attach ALL available GPUs and NVSwitches
--ppcie
Important Notes:
- --ppcie takes precedence over individual --gpu specifications
- Use ./vmm-cli.py lsgpu to see available GPU slots before assignment
- PPCIE mode (--ppcie) provides the best performance for GPU-intensive workloads in a CVM
After successful deployment, verify your VM is running correctly:
# Check if your VM appears in the list
./vmm-cli.py lsvm -v
# Monitor the startup process
./vmm-cli.py logs <vm-id> -f
# Connect to local VMM instance
export DSTACK_VMM_URL=http://127.0.0.1:12000
# If authentication is required
export DSTACK_VMM_AUTH_USER=your-username
export DSTACK_VMM_AUTH_PASSWORD=your-password
# Create a basic docker-compose.yml
cat > docker-compose.yml << 'EOF'
version: '3.8'
services:
web:
image: nginx:alpine
ports:
- "80:80"
EOF
# Create app compose file
./vmm-cli.py compose \
--name "test-webapp" \
--docker-compose ./docker-compose.yml \
--output ./app-compose.json
# Deploy the VM
./vmm-cli.py deploy \
--name "test-vm" \
--image "dstack-dev-0.5.3" \
--compose ./app-compose.json \
--vcpu 2 \
--memory 4G \
--disk 30G
# Verify deployment
./vmm-cli.py lsvm -v
./vmm-cli.py deploy \
--name "web-server" \
--image "dstack-0.5.3" \
--compose ./app-compose.json \
--vcpu 4 \
--memory 4G \
--disk 100G \
--port tcp:8080:80 \
--port tcp:8443:443 \
--gpu "0a:00.0" --gpu "1a:00.0" \
--env-file ./production.env \
--kms-url http://kms-server:9000
# Set VMM URL via environment
export DSTACK_VMM_URL=http://ml-cluster:8080
# Deploy with all GPUs in PPCIE mode
./vmm-cli.py deploy \
--name "ml-training" \
--image "pytorch:latest" \
--compose ./ml-app-compose.json \
--vcpu 16 \
--memory 32G \
--disk 500G \
--ppcie \
--hugepages \
--pin-numa \
--env-file ./ml-secrets.env
# Create a user configuration file
cat > user-config.json << EOF
{
"timezone": "UTC",
"locale": "en_US.UTF-8",
"custom_settings": {
"debug_mode": false,
"log_level": "INFO"
}
}
EOF
# Deploy VM in stopped state with user config
./vmm-cli.py deploy \
--name "configured-vm" \
--image "dstack-0.5.4" \
--compose ./app-compose.json \
--vcpu 4 \
--memory 8G \
--disk 100G \
--user-config ./user-config.json \
--stopped
# The VM is created but not started - start it manually when ready
./vmm-cli.py start configured-vm
Note: The --stopped flag is useful for:
- Pre-staging VMs for later use
- Preparing VMs with specific configurations before startup
The VMM CLI automatically encrypts sensitive environment variables before sending them to the server.
Create a secrets.env file with your variables:
# secrets.env
DATABASE_URL=postgresql://user:pass@db:5432/myapp
API_KEY=secret-api-key-12345
JWT_SECRET=your-jwt-secret-here
Lines starting with # are ignored as comments.
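Since `#` lines are skipped, you can preview which variable names an env file would contribute using plain shell. This is a local check only; the CLI itself handles the parsing and encryption:

```shell
# List variable names from an env file, skipping blank lines
# and '#' comment lines, mirroring the format described above.
env_file_vars() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | cut -d= -f1
}
```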
# During compose creation
./vmm-cli.py compose \
--name "my-app" \
--docker-compose ./docker-compose.yml \
--env-file ./secrets.env \
--kms \
--output ./app-compose.json
# During deployment
./vmm-cli.py deploy \
--name "my-app-vm" \
--image "dstack-0.5.3" \
--compose ./app-compose.json \
--env-file ./secrets.env
KMS provides secure key management and CVM execution verification.
Manage trusted KMS public keys for enhanced security:
# List current trusted KMS public keys
./vmm-cli.py kms list
# Add a trusted KMS public key
./vmm-cli.py kms add 0x1234567890abcdef...
# Remove a trusted KMS public key
./vmm-cli.py kms remove 0x1234567890abcdef...
The whitelist is stored in ~/.dstack-vmm/kms-whitelist.json.
./vmm-cli.py update-env <vm-id> --env-file ./new-secrets.env
./vmm-cli.py update-app-compose <vm-id> ./new-app-compose.json
./vmm-cli.py update-user-config <vm-id> ./new-config.json
Update port mappings for an existing VM:
./vmm-cli.py update-ports <vm-id> --port tcp:8080:80 --port tcp:8443:443
Use the all-in-one update command to update multiple VM aspects in a single operation:
# Update resources (requires VM to be stopped)
./vmm-cli.py update <vm-id> \
--vcpu 4 \
--memory 8G \
--disk 100G \
--image "dstack-0.5.4"
# Update application configuration
./vmm-cli.py update <vm-id> \
--compose ./new-docker-compose.yml \
--prelaunch-script ./setup.sh \
--swap 4G \
--env-file ./new-secrets.env \
--user-config ./new-config.json
# Update networking and GPU
./vmm-cli.py update <vm-id> \
--port tcp:8080:80 \
--port tcp:8443:443 \
--gpu "18:00.0" --gpu "2a:00.0"
# Detach all GPUs from a VM
./vmm-cli.py update <vm-id> --no-gpus
# Remove all port mappings from a VM
./vmm-cli.py update <vm-id> --no-ports
# Update everything at once
./vmm-cli.py update <vm-id> \
--vcpu 8 \
--memory 16G \
--disk 200G \
--compose ./new-docker-compose.yml \
--prelaunch-script ./init.sh \
--swap 8G \
--env-file ./new-secrets.env \
--port tcp:8080:80 \
--ppcie
Available Options:
- Resource changes (require the VM to be stopped): --vcpu, --memory, --disk, --image
- Application updates: --compose (docker-compose file), --prelaunch-script, --swap, --env-file, --user-config
- Networking (mutually exclusive):
  - --port <mapping> (can be used multiple times)
  - --no-ports (remove all port mappings)
  - No port flag: port configuration remains unchanged
- GPU (mutually exclusive):
  - --gpu <slot> (can be used multiple times for specific GPUs)
  - --ppcie (attach all available GPUs)
  - --no-gpus (detach all GPUs)
  - No GPU flag: GPU configuration remains unchanged
- KMS: --kms-url (for environment encryption)
Notes:
- Resource changes (vCPU, memory, disk, image) require the VM to be stopped
- Application updates can be applied to running VMs
- Port and GPU options are mutually exclusive within their groups
- If no flag is specified for ports or GPUs, those configurations remain unchanged
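Because resource changes need a stopped VM, a resize is naturally a stop, update, start sequence. A sketch of that workflow, wrapped so that `RUN=echo` previews the commands instead of executing them (the `resize_vm` helper name is illustrative, not part of the CLI):

```shell
# Resize a VM's resources. Resource changes require the VM to be
# stopped, so stop first, apply the update, then start again.
# Set RUN=echo to preview the commands without running them.
resize_vm() {
    local vm_id="$1" vcpu="$2" memory="$3"
    ${RUN:-} ./vmm-cli.py stop "$vm_id"
    ${RUN:-} ./vmm-cli.py update "$vm_id" --vcpu "$vcpu" --memory "$memory"
    ${RUN:-} ./vmm-cli.py start "$vm_id"
}
```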
For better performance on multi-socket systems:
./vmm-cli.py deploy \
--name "high-perf-vm" \
--image "dstack-0.5.3" \
--compose ./app-compose.json \
--pin-numa
Enable huge pages for memory-intensive applications:
./vmm-cli.py deploy \
--name "memory-intensive-vm" \
--image "dstack-0.5.3" \
--compose ./app-compose.json \
--hugepages
The CLI accepts human-readable size formats:
- 1G or 1GB = 1024 MB
- 512M or 512MB = 512 MB
- 2T or 2TB = 2,097,152 MB
- 50G or 50GB = 50 GB
- 1T or 1TB = 1024 GB
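The conversions above can be expressed as a small helper (hypothetical `size_to_mb` function mirroring the table, not part of the CLI):

```shell
# Convert a human-readable size with an M/MB, G/GB, or T/TB suffix
# to megabytes, matching the conversions listed above.
size_to_mb() {
    local v="$1"
    local num="${v%%[A-Za-z]*}"   # numeric prefix, e.g. "512"
    local unit="${v#"$num"}"      # unit suffix, e.g. "M" or "MB"
    case "$unit" in
        M|MB) echo "$num" ;;
        G|GB) echo $(( num * 1024 )) ;;
        T|TB) echo $(( num * 1024 * 1024 )) ;;
        *) return 1 ;;
    esac
}
```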
- Check VM logs: ./vmm-cli.py logs <vm-id>
- Verify the image exists: ./vmm-cli.py lsimage
- Check resource availability: ./vmm-cli.py lsgpu
Ensure ports are not already in use:
# Check if port is available
netstat -tuln | grep :8080
- Use --help with any command for detailed options
- Check the server logs for additional error information
- Verify your Docker Compose file is valid before creating the app compose
- Use ./vmm-cli.py lsgpu to see available GPU slots and their status
- Set the DSTACK_VMM_URL environment variable to avoid typing --url repeatedly
- Check server URL and connectivity
- Verify server is running and accessible
- Add the signer to your trusted whitelist
- Or confirm to proceed with untrusted signer
- Use ./vmm-cli.py lsvm to verify the VM ID
- Check if the VM was removed