🚀 Paradigm shift in hypervisor design through systemd's advanced virtualization capabilities and git-ops-native architecture.
| Section | Description | Focus Area |
|---|---|---|
| 📖 Introduction | System overview and paradigm shift | Conceptual foundation |
| 🎯 Design Principles | Core design philosophy | Architectural guidelines |
| 🏗️ System Architecture | Overall system design | Structural patterns |
| 🧩 Core Components | Key system components | Implementation details |
| 🚀 Boot Process Design | System initialization | Boot workflow |
| 🏠 Tenant Management Design | Multi-tenant orchestration | Tenant lifecycle |
| 🛡️ Security Architecture | Security implementation | Protection mechanisms |
| 🌐 Networking Design | Network architecture | Connectivity patterns |
| 💾 Storage Architecture | Storage management | Data persistence |
| 🔧 Extension System Design | Extensibility framework | Plugin architecture |
| 📚 Git-Ops Workflow | Configuration management | Version control integration |
| 📊 Monitoring and Observability | System monitoring | Operational insights |
| ⚡ Performance Considerations | Performance optimization | Efficiency patterns |
| 🆘 Failure Recovery | Error handling and recovery | Resilience design |
BitBuilder Hypervisor is a git-ops-native, multi-tenant virtualization platform built on systemd's virtualization capabilities. This document outlines the technical design decisions, architectural patterns, and implementation strategies that form the foundation of the system.
- Host OS remains immutable after boot
- All changes applied through layered overlays
- Configuration drift eliminated by design
- State declared in Git repositories
- System converges to desired state automatically
- No imperative configuration management
- Defense-in-depth architecture
- Multiple isolation boundaries per tenant
- Principle of least privilege throughout
- All configuration versioned in Git
- Audit trail built into the system
- Rollback capability inherent to design
- Small, focused components
- Standard interfaces (Varlink, D-Bus)
- Layered architecture with clear boundaries
The host system operates as a minimal, immutable foundation:
UEFI Firmware → systemd-boot (UKI) → Immutable Host OS (Downloaded DDI) → systemd-import-generator | systemd generators | systemd → Git Sync Service | Tenant Manager | Varlink
Each tenant operates in complete isolation:
Tenant Git Repo → Tenant Configuration (metadata.json) → sysext layers | confext layers | Services → systemd-vmspawn / systemd-nspawn instance → Network Namespace | Mount Namespace | PID Namespace
Responsible for host OS acquisition and verification:
- Downloads DDI images from configured sources
- Verifies cryptographic signatures
- Manages boot partition updates
- Handles A/B partition schemes for rollback
Configuration: /etc/systemd/import-generator.conf
[Import]
Source=https://releases.bitbuilder.io/hypervisor/
VerifySignature=yes
SignatureKeyring=/etc/pki/bitbuilder.gpg
UpdatePolicy=OnBoot
Central orchestrator for tenant lifecycle:
Unit: tenant-manager.service
[Unit]
Description=BitBuilder Tenant Manager
After=network-online.target git-sync.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/lib/bitbuilder/tenant-manager
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Manages repository synchronization:
Unit: git-sync@.service
[Unit]
Description=Git Sync for %i
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/lib/bitbuilder/git-sync %i
StandardOutput=journal
StandardError=journal
PrivateTmp=yes
ProtectSystem=strict
ReadWritePaths=/var/lib/tenants/%i/config
[Install]
WantedBy=tenant-%i.target
Custom systemd generator for dynamic tenant configuration:
Location: /usr/lib/systemd/system-generators/tenant-generator
Generates:
- Mount units for tenant directories
- Service units for tenant VMs/containers
- Target units for tenant dependencies
- Timer units for periodic sync
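
As a rough illustration of the service-unit part of this generator, the sketch below renders one unit per tenant and enables it with a symlink, which is how systemd generators hook units into a target. The image paths (`root.raw`, `rootfs`), the `tenant-<name>.service` naming, and the choice of `multi-user.target` are assumptions made for the example, not part of the design above.

```python
from pathlib import Path

# Hypothetical layout: tenant images live under /var/lib/tenants/<name>/.
TENANT_ROOT = Path("/var/lib/tenants")


def render_tenant_service(name: str, kind: str) -> str:
    """Return the text of a service unit that runs one tenant instance."""
    if kind == "vm":
        # Assumed image location; systemd-vmspawn boots a disk image.
        exec_start = f"systemd-vmspawn --image={TENANT_ROOT / name / 'root.raw'}"
    elif kind == "container":
        # Assumed root directory; --boot starts the container's init system.
        exec_start = (
            f"systemd-nspawn --boot --machine={name} "
            f"--directory={TENANT_ROOT / name / 'rootfs'}"
        )
    else:
        raise ValueError(f"unknown tenant type: {kind!r}")
    return (
        "[Unit]\n"
        f"Description=BitBuilder tenant {name}\n"
        "\n"
        "[Service]\n"
        f"ExecStart={exec_start}\n"
        "Restart=on-failure\n"
    )


def generate(tenants: dict[str, str], normal_dir: Path) -> list[Path]:
    """Write one service unit per tenant and hook it into multi-user.target."""
    wants = normal_dir / "multi-user.target.wants"
    wants.mkdir(parents=True, exist_ok=True)
    written = []
    for name, kind in sorted(tenants.items()):
        unit = normal_dir / f"tenant-{name}.service"
        unit.write_text(render_tenant_service(name, kind))
        # Generators enable units by creating symlinks, not by calling systemctl.
        link = wants / unit.name
        if not link.is_symlink():
            link.symlink_to(unit)
        written.append(unit)
    return written
```

systemd invokes a generator with three output directories as arguments; a real entry point would pass the first one as `normal_dir`. Mount, target, and timer units could be rendered the same way.
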
- UEFI firmware loads systemd-boot
- systemd-boot verifies and loads UKI
- Secure Boot validation chain maintained
- Kernel and initrd from UKI execute
- systemd-import-generator checks for OS updates
- Downloads and verifies new DDI if available
- Mounts immutable root filesystem
- Git sync service pulls system configuration
- System-level sysext/confext layers applied
- Core services started (networking, storage)
- Tenant manager queries Git for tenant list
- For each discovered tenant:
- Generate systemd units via generators
- Clone/update tenant Git repository
- Create tenant directory structure
- Execute pre-provision scripts
- Apply tenant sysext/confext layers
- Configure network namespaces
- Start systemd-vmspawn/nspawn instances
- Execute post-provision scripts
metadata.json structure:
{
"version": "1.0",
"tenant": {
"id": "tenant-uuid",
"name": "tenant-name",
"type": "vm|container",
"enabled": true
},
"resources": {
"cpu": {
"cores": 4,
"shares": 1024
},
"memory": {
"limit": "4G",
"swap": "2G"
},
"storage": {
"root": "20G",
"data": "100G"
}
},
"network": {
"mode": "bridge|nat|host",
"interfaces": [
{
"name": "eth0",
"mac": "auto",
"ipv4": "dhcp|static",
"ipv6": "dhcp|static|disabled"
}
]
},
"extensions": {
"sysext": ["base-tools", "monitoring"],
"confext": ["security-policies", "network-config"]
},
"services": {
"portable": ["app-server", "database"],
"systemd": ["custom-service.service"]
},
"security": {
"selinux_context": "tenant_t",
"capabilities": ["CAP_NET_ADMIN"],
"syscalls": {
"allow": ["@basic-io", "@network-io"],
"deny": ["@privileged", "@reboot"]
}
}
}
- Git repository created with initial configuration
- Tenant manager detects new repository
- Validates metadata.json schema
- Generates systemd units
- Provisions resources
- Starts tenant instance
- Git commit triggers webhook/polling
- Tenant manager pulls changes
- Validates configuration changes
- Applies changes based on strategy:
  - Hot-reload for config changes
  - Restart for service changes
  - Re-provision for structural changes
- Tenant disabled in metadata.json
- Tenant manager stops all services
- Cleanup grace period
- Resources deallocated
- Data archived/deleted per policy
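
The validation and change-strategy steps in the lifecycle above could look roughly like the following. This is a minimal sketch: the set of required sections, the size syntax, and the mapping from `metadata.json` sections to strategies are all assumptions chosen for illustration, since the design does not spell them out.

```python
import re

SIZE_RE = re.compile(r"^\d+[KMGT]$")  # assumed size syntax, e.g. "4G"
REQUIRED_TOP_LEVEL = ("version", "tenant", "resources", "network")  # assumption


def validate_metadata(meta: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the document passed."""
    errors = [f"missing section: {key}" for key in REQUIRED_TOP_LEVEL if key not in meta]
    tenant = meta.get("tenant", {})
    if tenant.get("type") not in ("vm", "container"):
        errors.append("tenant.type must be 'vm' or 'container'")
    if not isinstance(tenant.get("enabled"), bool):
        errors.append("tenant.enabled must be a boolean")
    limit = meta.get("resources", {}).get("memory", {}).get("limit", "")
    if not SIZE_RE.match(str(limit)):
        errors.append("resources.memory.limit must look like '4G'")
    return errors


# Assumed mapping from a changed section to the action that covers it.
# Sections not listed here fall back to the most disruptive strategy.
STRATEGY_BY_SECTION = {
    "extensions": "hot-reload",
    "services": "restart",
}
SEVERITY = ["none", "hot-reload", "restart", "re-provision"]


def pick_strategy(old: dict, new: dict) -> str:
    """Choose the most disruptive strategy required by any changed section."""
    needed = "none"
    for section in set(old) | set(new):
        if old.get(section) == new.get(section):
            continue
        candidate = STRATEGY_BY_SECTION.get(section, "re-provision")
        if SEVERITY.index(candidate) > SEVERITY.index(needed):
            needed = candidate
    return needed
```

Comparing whole sections keeps the sketch short; a finer-grained diff would let a single changed field inside `resources` avoid a full re-provision.
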
Multiple security layers protect each tenant:
- Hardware Level
  - UEFI Secure Boot
  - TPM attestation
  - Hardware virtualization (Intel VT-x/AMD-V)
- Host OS Level
  - Immutable root filesystem
  - SELinux/AppArmor mandatory access control
  - Minimal attack surface
- Container/VM Level
  - Namespace isolation (PID, NET, MNT, UTS, IPC)
  - Capability dropping
  - Seccomp filters
  - Resource limits (cgroups)
- Network Level
  - Network namespaces
  - Firewall rules per tenant
  - VLAN isolation
  - Encrypted communication (WireGuard/IPSec)
- JSON-based RPC over Unix sockets
- Per-tenant socket isolation
- Authentication via SO_PEERCRED
- Optional TLS for network communication
Diagram: a tenant connects to a host service over Varlink; the host identifies the caller via SO_PEERCRED (UID/GID/PID).
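
To make the SO_PEERCRED authentication concrete, here is a small Linux-only sketch that reads the kernel-reported credentials of the peer on a Unix socket. The `is_authorized` helper and its allow-list policy are hypothetical; the design only states that SO_PEERCRED is used, not how the result is evaluated.

```python
import socket
import struct


def peer_credentials(conn: socket.socket) -> tuple[int, int, int]:
    """Return (pid, uid, gid) of the process on the other end of a Unix socket."""
    # struct ucred on Linux is three native ints: pid, uid, gid.
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


def is_authorized(conn: socket.socket, allowed_uids: set[int]) -> bool:
    """Accept the caller only if the kernel-reported UID is on the allow list."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid in allowed_uids
```

Because the kernel fills in these values at connect time, a tenant cannot forge them from inside its own process, which is what makes the check usable without passwords or tokens.
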
Diagram: Physical Network → systemd-networkd → Bridge | VLAN | NAT → Network Namespaces
- Direct layer 2 connectivity
- Tenant gets dedicated MAC address
- Suitable for trusted tenants
- 802.1Q VLAN tagging
- Hardware-level isolation
- Scalable to 4094 VLANs
- Private IP ranges per tenant
- Port forwarding for services
- Maximum isolation
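
For the VLAN mode, a per-tenant device definition could be rendered along these lines. The sketch assumes a `vlan<id>` naming scheme and one VLAN per tenant, neither of which is specified above; the `[NetDev]` and `[VLAN]` keys follow the systemd.netdev format.

```python
def render_vlan_netdev(tenant: str, vlan_id: int) -> str:
    """Render a systemd.netdev file body for one tenant's 802.1Q VLAN device."""
    # IDs 0 and 4095 are reserved by 802.1Q, which leaves 1..4094 usable.
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    name = f"vlan{vlan_id}"  # assumed naming scheme
    return (
        "[NetDev]\n"
        f"Name={name}\n"
        "Kind=vlan\n"
        f"Description=VLAN for tenant {tenant}\n"
        "\n"
        "[VLAN]\n"
        f"Id={vlan_id}\n"
    )
```

The parent interface's `.network` file would additionally need a `VLAN=` entry naming the device before systemd-networkd creates it.
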
/etc/systemd/network/10-tenant-bridge.netdev:
[NetDev]
Name=br-tenant
Kind=bridge
[Bridge]
STP=yes
Priority=32768
/etc/systemd/network/10-tenant-bridge.network:
[Match]
Name=br-tenant
[Network]
DHCP=no
DHCPServer=yes
IPv6AcceptRA=no
IPForward=yes
IPMasquerade=yes
Address=10.0.0.1/24
[DHCPServer]
PoolOffset=100
PoolSize=100
EmitDNS=yes
DNS=10.0.0.1
Physical Storage (Block) → LVM/Btrfs/ZFS → Per-Tenant Logical Volumes → Root FS | Data FS | Config FS
- Over-commit storage resources
- Copy-on-write for efficiency
- Automatic space reclamation
- Point-in-time backups
- Instant rollback capability
- Incremental backup support
- Per-tenant LUKS volumes
- Key management via systemd-creds
- TPM-sealed keys optional
/var/lib/tenants/<tenant-name>/
├── config/ # Git repository (read-only bind mount)
├── data/ # Persistent data (read-write)
├── extensions/ # sysext/confext images
│ ├── sysext/
│ └── confext/
├── services/ # Portable service images
└── runtime/ # Ephemeral runtime data
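
A provisioning step that creates this layout might look like the sketch below. The function name and the name check are illustrative assumptions; bind mounts and permissions are left out.

```python
from pathlib import Path

# Subdirectories taken from the layout above, relative to the tenant root.
TENANT_SUBDIRS = (
    "config",
    "data",
    "extensions/sysext",
    "extensions/confext",
    "services",
    "runtime",
)


def create_tenant_tree(base: Path, tenant: str) -> Path:
    """Create the per-tenant directory skeleton and return the tenant root."""
    # Reject names that could escape the base directory.
    if not tenant or "/" in tenant or tenant in (".", ".."):
        raise ValueError(f"unsafe tenant name: {tenant!r}")
    root = base / tenant
    for sub in TENANT_SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)  # idempotent on re-run
    return root
```

Calling `create_tenant_tree(Path("/var/lib/tenants"), "alpha")` twice is harmless, which suits a system that converges repeatedly toward the state declared in Git.
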
Extend /usr and /opt hierarchies:
/usr/lib/extensions/
├── base-tools.raw # Common tools extension
├── monitoring.raw # Monitoring stack
└── security-tools.raw # Security utilities
Creation:
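# Create sysext image
mkdir -p extension-root/usr/bin extension-root/usr/lib/extension-release.d
cp tools extension-root/usr/bin/
# systemd-sysext only merges images that carry a matching extension-release file;
# ID=_any skips the host ID match, use the host's os-release ID to pin the image
echo "ID=_any" > extension-root/usr/lib/extension-release.d/extension-release.base-tools
mksquashfs extension-root base-tools.raw
Extend /etc hierarchy: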
/usr/lib/confexts/
├── network-policies.raw # Network configurations
├── security-policies.raw # Security settings
└── monitoring-config.raw # Monitoring configs
/usr/lib/extension-release.d/extension-release.monitoring:
ID=bitbuilder-monitoring
VERSION_ID=1.0
SYSEXT_LEVEL=1.0
ARCHITECTURE=x86-64
Self-contained service bundles:
portable-service.raw/
├── usr/
│ ├── bin/
│ │ └── service-binary
│ └── lib/
│ └── systemd/
│ └── system/
│ └── service.service
└── etc/
└── service/
└── config.conf
bitbuilder-system/
├── .gitops/
│ └── config.yaml # Git-ops configuration
├── generators/ # Custom systemd generators
├── units/ # Systemd service units
├── network/ # Network configurations
├── extensions/ # System-wide extensions
└── tenants/ # Tenant registry
└── registry.json # Active tenants list
tenant-<name>/
├── .gitops/
│ └── config.yaml # Tenant git-ops config
├── metadata.json # Tenant metadata
├── network/ # Network configurations
├── services/ # Service definitions
├── units/ # Custom systemd units
├── extensions/ # Tenant extensions
├── scripts/ # Lifecycle scripts
└── secrets/ # Encrypted secrets
└── sealed-secrets.yaml
Git Push → Webhook → Validation Stage → Pull Changes → Apply Changes → Verify State
- Automatic rollback on failure
- Git revert for configuration rollback
- Snapshot restore for data rollback
- A/B boot partitions for OS rollback
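
The apply, verify, and automatic-rollback behaviour can be sketched as a small convergence function. The `apply` and `verify` callables are placeholders for whatever the tenant manager actually does (checking out a Git revision, restoring a snapshot, running health checks), so this shows only the control flow under that assumption.

```python
from typing import Callable


def converge(
    current: str,
    desired: str,
    apply: Callable[[str], None],
    verify: Callable[[], bool],
) -> str:
    """Apply `desired`; if apply or verify fails, re-apply `current`.

    Returns the revision that is active afterwards.
    """
    try:
        apply(desired)
        if verify():
            return desired
    except Exception:
        pass  # treat an apply error like a failed verification
    apply(current)  # roll back to the last known-good revision
    return current
```

For example, `converge("a1", "b2", apply, verify)` returns `"b2"` when verification passes and `"a1"` otherwise, so the caller always knows which revision is live.
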
- CFS bandwidth control
- CPU shares per tenant
- NUMA awareness
- Memory limits via cgroups
- Swap accounting
- OOM killer configuration
- Block I/O weight
- Bandwidth throttling
- I/O scheduling classes
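
One way to connect these controls to the `resources` section of metadata.json is to translate it into systemd resource-control settings. The sketch below is illustrative: the property names are real systemd settings, but the mapping itself, and in particular the scaling of `shares` to `CPUWeight=`, is an assumption rather than something the design defines.

```python
def resource_properties(resources: dict) -> dict[str, str]:
    """Translate the `resources` section into systemd resource-control settings."""
    cpu = resources.get("cpu", {})
    memory = resources.get("memory", {})
    props = {}
    if "cores" in cpu:
        # CPUQuota= is a percentage of one CPU, so 4 cores become 400%.
        props["CPUQuota"] = f"{int(cpu['cores']) * 100}%"
    if "shares" in cpu:
        # Assumed scaling: 1024 shares (the cgroup v1 default) maps to the
        # CPUWeight= default of 100, clamped to the valid range 1..10000.
        weight = round(int(cpu["shares"]) * 100 / 1024)
        props["CPUWeight"] = str(max(1, min(10000, weight)))
    if "limit" in memory:
        props["MemoryMax"] = str(memory["limit"])
    if "swap" in memory:
        props["MemorySwapMax"] = str(memory["swap"])
    return props
```

With the example metadata shown earlier (4 cores, 1024 shares, 4G memory, 2G swap) this yields `CPUQuota=400%`, `CPUWeight=100`, `MemoryMax=4G`, and `MemorySwapMax=2G`, which could be written into a drop-in for the tenant's unit.
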
- Lazy Loading
  - On-demand tenant activation
  - Delayed extension mounting
  - Just-in-time compilation
- Caching
  - Git repository caching
  - Extension image caching
  - Network configuration caching
- Parallel Processing
  - Concurrent tenant provisioning
  - Parallel Git operations
  - Async service startup
Health Check Loop: Monitor Services → Failure? → Yes → Recovery Action
- Automatic restart (systemd)
- Exponential backoff
- Circuit breaker pattern
- Fallback to previous version
- VM/Container restart
- Re-provision from Git
- Restore from snapshot
- Evacuate to different host
- Watchdog timer reset
- Kernel panic recovery
- Boot to previous image
- Emergency maintenance mode
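
The exponential backoff and circuit breaker items can be combined in one small state holder. This is a sketch with assumed defaults (five failures, one-second base delay, sixty-second cap); the design does not give concrete values, and what happens once the breaker trips (fallback, re-provision, evacuation) is left to the caller.

```python
from typing import Optional


class CircuitBreaker:
    """Stop retrying after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 5, base_delay: float = 1.0, max_delay: float = 60.0):
        self.threshold = threshold
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = 0

    @property
    def open(self) -> bool:
        """True once the breaker has tripped and escalation is needed."""
        return self.failures >= self.threshold

    def record_success(self) -> None:
        """Reset the failure count after a healthy check."""
        self.failures = 0

    def record_failure(self) -> Optional[float]:
        """Count a failure; return the delay before the next retry, or None if tripped."""
        self.failures += 1
        if self.open:
            return None
        # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
        return min(self.max_delay, self.base_delay * 2 ** (self.failures - 1))
```

A recovery loop would sleep for the returned delay before restarting the service, and move to the next recovery level when `record_failure()` returns `None`.
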
- Regular Backups
  - Git repositories (inherent)
  - Tenant data snapshots
  - Configuration exports
- Replication
  - Multi-region Git mirrors
  - Cross-site tenant replication
  - Distributed storage backends
- Recovery Procedures
  - Documented runbooks
  - Automated recovery scripts
  - Regular DR testing
The BitBuilder Hypervisor design moves hypervisor management toward immutable, declarative infrastructure. By combining systemd's virtualization and extension mechanisms with git-ops principles, the system aims for strong tenant isolation, reproducible state, and simpler operations while keeping the flexibility that multi-tenant environments require.