🎯 BitBuilder Hypervisor - Technical Design Document

Technical Design · systemd Native · Git-Ops · Multi-Tenant

🚀 Paradigm shift in hypervisor design through systemd's advanced virtualization capabilities and git-ops-native architecture.


📋 Table of Contents

| Section | Description | Focus Area |
|---------|-------------|------------|
| 📖 Introduction | System overview and paradigm shift | Conceptual foundation |
| 🎯 Design Principles | Core design philosophy | Architectural guidelines |
| 🏗️ System Architecture | Overall system design | Structural patterns |
| 🧩 Core Components | Key system components | Implementation details |
| 🚀 Boot Process Design | System initialization | Boot workflow |
| 🏠 Tenant Management Design | Multi-tenant orchestration | Tenant lifecycle |
| 🛡️ Security Architecture | Security implementation | Protection mechanisms |
| 🌐 Networking Design | Network architecture | Connectivity patterns |
| 💾 Storage Architecture | Storage management | Data persistence |
| 🔧 Extension System Design | Extensibility framework | Plugin architecture |
| 📚 Git-Ops Workflow | Configuration management | Version control integration |
| 📊 Monitoring and Observability | System monitoring | Operational insights |
| ⚡ Performance Considerations | Performance optimization | Efficiency patterns |
| 🆘 Failure Recovery | Error handling and recovery | Resilience design |

📖 Introduction

BitBuilder Hypervisor represents a paradigm shift in hypervisor design, leveraging systemd's advanced virtualization capabilities to create a git-ops-native, multi-tenant virtualization platform. This document outlines the technical design decisions, architectural patterns, and implementation strategies that form the foundation of the system.

🎯 Design Principles

1️⃣ Immutability First

  • Host OS remains immutable after boot
  • All changes applied through layered overlays
  • Configuration drift eliminated by design

2️⃣ Declarative Configuration

  • State declared in Git repositories
  • System converges to desired state automatically
  • No imperative configuration management

3️⃣ Security by Isolation

  • Defense-in-depth architecture
  • Multiple isolation boundaries per tenant
  • Principle of least privilege throughout

4️⃣ Git as Single Source of Truth

  • All configuration versioned in Git
  • Audit trail built into the system
  • Rollback capability inherent to design

5️⃣ Composability

  • Small, focused components
  • Standard interfaces (Varlink, D-Bus)
  • Layered architecture with clear boundaries

🏗️ System Architecture

Host System Layer

The host system operates as a minimal, immutable foundation:

┌─────────────────────────────────────────────────────────────┐
│                        UEFI Firmware                        │
├─────────────────────────────────────────────────────────────┤
│                     systemd-boot (UKI)                      │
├─────────────────────────────────────────────────────────────┤
│             Immutable Host OS (Downloaded DDI)              │
├─────────────────────────────────────────────────────────────┤
│    systemd-import-generator  systemd generators  systemd    │
├─────────────────────────────────────────────────────────────┤
│       Git Sync Service    Tenant Manager     Varlink        │
└─────────────────────────────────────────────────────────────┘

Tenant Layer Architecture

Each tenant operates in complete isolation:

┌─────────────────────────────────────────────────────────────┐
│                       Tenant Git Repo                       │
├─────────────────────────────────────────────────────────────┤
│            Tenant Configuration (metadata.json)             │
├─────────────────────────────────────────────────────────────┤
│     sysext layers         confext layers       Services     │
├─────────────────────────────────────────────────────────────┤
│          systemd-vmspawn / systemd-nspawn instance          │
├─────────────────────────────────────────────────────────────┤
│    Network Namespace    Mount Namespace    PID Namespace    │
└─────────────────────────────────────────────────────────────┘

🧩 Core Components

1. systemd-import-generator

Responsible for host OS acquisition and verification:

  • Downloads DDI images from configured sources
  • Verifies cryptographic signatures
  • Manages boot partition updates
  • Handles A/B partition schemes for rollback

Configuration: /etc/systemd/import-generator.conf

[Import]
Source=https://releases.bitbuilder.io/hypervisor/
VerifySignature=yes
SignatureKeyring=/etc/pki/bitbuilder.gpg
UpdatePolicy=OnBoot

2. Tenant Manager Service

Central orchestrator for tenant lifecycle:

Unit: tenant-manager.service

[Unit]
Description=BitBuilder Tenant Manager
After=network-online.target git-sync.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/lib/bitbuilder/tenant-manager
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
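
Because the unit declares Type=notify, systemd waits for the tenant manager to report readiness over the socket named in the NOTIFY_SOCKET environment variable. The following is a minimal sketch of that handshake, assuming the manager is written in Python (the source does not say); `sd_notify` is a hypothetical helper name, not part of the design.

```python
import os
import socket


def sd_notify(message: str) -> bool:
    """Send one notification datagram to the service manager.

    Returns False when NOTIFY_SOCKET is unset, which means the process
    was not started by systemd as a Type=notify service.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    address = "\0" + path[1:] if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), address)
    return True
```

The manager would call `sd_notify("READY=1")` once its initialization has finished; until then systemd keeps the unit in the "activating" state.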

3. Git Sync Service

Manages repository synchronization:

Unit: git-sync@.service

[Unit]
Description=Git Sync for %i
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/lib/bitbuilder/git-sync %i
StandardOutput=journal
StandardError=journal
PrivateTmp=yes
ProtectSystem=strict
ReadWritePaths=/var/lib/tenants/%i/config

[Install]
WantedBy=tenant-%i.target

4. Tenant Setup Generator

Custom systemd generator for dynamic tenant configuration:

Location: /usr/lib/systemd/system-generators/tenant-generator

Generates:

  • Mount units for tenant directories
  • Service units for tenant VMs/containers
  • Target units for tenant dependencies
  • Timer units for periodic sync
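
As an illustration of the generator's output, the sketch below renders a target and a sync timer for each tenant and writes them into an output directory. systemd invokes generators with three output directories as arguments, of which the first is the normal-priority one. The tenant list, the timer intervals, and the function names are assumptions for illustration; the unit names follow the `tenant-%i.target` and `git-sync@.service` conventions used above.

```python
from pathlib import Path


def render_units(tenant: str) -> dict[str, str]:
    """Return unit file names mapped to their contents for one tenant."""
    target = (
        "[Unit]\n"
        f"Description=Tenant {tenant}\n"
        f"Wants=git-sync@{tenant}.timer\n"
    )
    timer = (
        "[Unit]\n"
        f"Description=Periodic Git sync for {tenant}\n"
        "\n"
        "[Timer]\n"
        "OnBootSec=1min\n"          # assumed delay after boot
        "OnUnitActiveSec=5min\n"    # assumed sync interval
        f"Unit=git-sync@{tenant}.service\n"
    )
    return {
        f"tenant-{tenant}.target": target,
        f"git-sync@{tenant}.timer": timer,
    }


def write_units(out_dir: Path, tenants: list[str]) -> list[Path]:
    """Write all generated units into the generator output directory."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for tenant in tenants:
        for name, body in render_units(tenant).items():
            path = out_dir / name
            path.write_text(body)
            written.append(path)
    return written
```

A real generator would call `write_units(Path(sys.argv[1]), tenants)` and would also emit the mount and service units listed above, which are omitted here for brevity.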

🚀 Boot Process Design

Stage 1: UEFI Boot

  1. UEFI firmware loads systemd-boot
  2. systemd-boot verifies and loads UKI
  3. Secure Boot validation chain maintained

Stage 2: Host OS Initialization

  1. Kernel and initrd from UKI execute
  2. systemd-import-generator checks for OS updates
  3. Downloads and verifies new DDI if available
  4. Mounts immutable root filesystem

Stage 3: System Configuration

  1. Git sync service pulls system configuration
  2. System-level sysext/confext layers applied
  3. Core services started (networking, storage)

Stage 4: Tenant Discovery

  1. Tenant manager queries Git for tenant list
  2. For each discovered tenant:
    • Generate systemd units via generators
    • Clone/update tenant Git repository
    • Create tenant directory structure
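
The discovery stage can be sketched as follows. The registry layout (a `tenants` array with `name`, `url`, and `enabled` keys) is a hypothetical schema, since the document does not define the contents of registry.json; the git invocations themselves are standard.

```python
import json
from pathlib import Path

TENANT_ROOT = Path("/var/lib/tenants")


def discover(registry_text: str) -> list[tuple[str, str]]:
    """Parse a registry document into (name, url) pairs for enabled tenants."""
    registry = json.loads(registry_text)
    return [
        (entry["name"], entry["url"])
        for entry in registry["tenants"]
        if entry.get("enabled", True)
    ]


def sync_command(name: str, url: str, root: Path = TENANT_ROOT) -> list[str]:
    """Build the git command that clones or updates one tenant repository."""
    config_dir = root / name / "config"
    if (config_dir / ".git").is_dir():
        # Repository already present: fast-forward only, never merge locally.
        return ["git", "-C", str(config_dir), "pull", "--ff-only"]
    return ["git", "clone", url, str(config_dir)]
```

Returning the command as a list rather than running it keeps the sketch side-effect free; the caller could pass it to `subprocess.run(cmd, check=True)`.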

Stage 5: Tenant Provisioning

  1. Execute pre-provision scripts
  2. Apply tenant sysext/confext layers
  3. Configure network namespaces
  4. Start systemd-vmspawn/nspawn instances
  5. Execute post-provision scripts

🏠 Tenant Management Design

Tenant Metadata Schema

metadata.json structure:

{
  "version": "1.0",
  "tenant": {
    "id": "tenant-uuid",
    "name": "tenant-name",
    "type": "vm|container",
    "enabled": true
  },
  "resources": {
    "cpu": {
      "cores": 4,
      "shares": 1024
    },
    "memory": {
      "limit": "4G",
      "swap": "2G"
    },
    "storage": {
      "root": "20G",
      "data": "100G"
    }
  },
  "network": {
    "mode": "bridge|nat|host",
    "interfaces": [
      {
        "name": "eth0",
        "mac": "auto",
        "ipv4": "dhcp|static",
        "ipv6": "dhcp|static|disabled"
      }
    ]
  },
  "extensions": {
    "sysext": ["base-tools", "monitoring"],
    "confext": ["security-policies", "network-config"]
  },
  "services": {
    "portable": ["app-server", "database"],
    "systemd": ["custom-service.service"]
  },
  "security": {
    "selinux_context": "tenant_t",
    "capabilities": ["CAP_NET_ADMIN"],
    "syscalls": {
      "allow": ["@basic-io", "@network-io"],
      "deny": ["@privileged", "@reboot"]
    }
  }
}
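
A validator for this schema might look like the sketch below. It assumes that strings such as "vm|container" in the schema denote a choice between alternatives and that sizes are written as an integer followed by K, M, G, or T; both are interpretations of the example above, not rules stated by the design. Only a subset of the fields is checked.

```python
import re

SIZE = re.compile(r"\d+[KMGT]")
TENANT_TYPES = {"vm", "container"}
NETWORK_MODES = {"bridge", "nat", "host"}


def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the document is valid."""
    errors = []
    tenant = meta.get("tenant", {})
    if tenant.get("type") not in TENANT_TYPES:
        errors.append("tenant.type must be 'vm' or 'container'")
    if not isinstance(tenant.get("enabled"), bool):
        errors.append("tenant.enabled must be a boolean")

    resources = meta.get("resources", {})
    cores = resources.get("cpu", {}).get("cores")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(cores, int) or isinstance(cores, bool) or cores < 1:
        errors.append("resources.cpu.cores must be a positive integer")
    for section, key in (("memory", "limit"), ("memory", "swap"),
                         ("storage", "root"), ("storage", "data")):
        value = resources.get(section, {}).get(key)
        if not isinstance(value, str) or not SIZE.fullmatch(value):
            errors.append(f"resources.{section}.{key} must look like '4G'")

    if meta.get("network", {}).get("mode") not in NETWORK_MODES:
        errors.append("network.mode must be one of bridge, nat, host")
    return errors
```

Collecting every problem instead of failing on the first one lets a single Git commit report all schema violations at once.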

Tenant Lifecycle Management

Creation

  1. Git repository created with initial configuration
  2. Tenant manager detects new repository
  3. Validates metadata.json schema
  4. Generates systemd units
  5. Provisions resources
  6. Starts tenant instance

Updates

  1. Git commit triggers webhook/polling
  2. Tenant manager pulls changes
  3. Validates configuration changes
  4. Applies changes based on strategy:
    • Hot-reload for config changes
    • Restart for service changes
    • Re-provision for structural changes
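
One way to pick the update strategy is to classify the paths touched by a commit and apply the most disruptive action any of them requires. The mapping from repository paths to strategies below is hypothetical; the document names the three strategies but not which files trigger them.

```python
from enum import IntEnum


class Strategy(IntEnum):
    HOT_RELOAD = 1
    RESTART = 2
    REPROVISION = 3


# Hypothetical mapping from top-level repository entries to required action.
PATH_STRATEGY = {
    "network": Strategy.HOT_RELOAD,
    "services": Strategy.RESTART,
    "units": Strategy.RESTART,
    "extensions": Strategy.REPROVISION,
    "metadata.json": Strategy.REPROVISION,
}


def choose_strategy(changed_paths: list[str]) -> Strategy:
    """Pick the most disruptive strategy required by any changed path."""
    needed = [
        PATH_STRATEGY.get(path.split("/", 1)[0], Strategy.HOT_RELOAD)
        for path in changed_paths
    ]
    return max(needed, default=Strategy.HOT_RELOAD)
```

The list of changed paths could come from `git diff --name-only <old> <new>` after the tenant manager pulls.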

Deletion

  1. Tenant disabled in metadata.json
  2. Tenant manager stops all services
  3. Cleanup grace period
  4. Resources deallocated
  5. Data archived/deleted per policy

🛡️ Security Architecture

Defense in Depth

Multiple security layers protect each tenant:

  1. Hardware Level

    • UEFI Secure Boot
    • TPM attestation
    • Hardware virtualization (Intel VT-x/AMD-V)
  2. Host OS Level

    • Immutable root filesystem
    • SELinux/AppArmor mandatory access control
    • Minimal attack surface
  3. Container/VM Level

    • Namespace isolation (PID, NET, MNT, UTS, IPC)
    • Capability dropping
    • Seccomp filters
    • Resource limits (cgroups)
  4. Network Level

    • Network namespaces
    • Firewall rules per tenant
    • VLAN isolation
    • Encrypted communication (WireGuard/IPSec)

Secure Communication

Varlink Protocol

  • JSON-based RPC over Unix sockets (Varlink defines its own message format; it is not JSON-RPC)
  • Per-tenant socket isolation
  • Authentication via SO_PEERCRED
  • Optional TLS for network communication

Service Authentication

  Tenant ──── Varlink ────► Host Service
                 │
            SO_PEERCRED
           (UID/GID/PID)
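
On Linux, the SO_PEERCRED check can be sketched as below: the host service asks the kernel for the credentials of the process at the other end of a Unix socket and compares the UID against an allow-list. The helper names and the allow-list policy are assumptions; the socket option and the `struct ucred` layout (pid, uid, gid) are standard Linux behavior.

```python
import socket
import struct
from typing import NamedTuple


class PeerCredentials(NamedTuple):
    pid: int
    uid: int
    gid: int


# struct ucred: pid_t (signed int), uid_t and gid_t (unsigned int).
UCRED_FORMAT = "iII"


def peer_credentials(conn: socket.socket) -> PeerCredentials:
    """Read the kernel-verified credentials of the connected peer (Linux only)."""
    size = struct.calcsize(UCRED_FORMAT)
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return PeerCredentials(*struct.unpack(UCRED_FORMAT, raw))


def is_authorized(conn: socket.socket, allowed_uids: set[int]) -> bool:
    """Accept the connection only if the peer runs under an allowed UID."""
    return peer_credentials(conn).uid in allowed_uids
```

Because the kernel fills in these values when the connection is established, a tenant process cannot forge them the way it could forge a field inside the message payload.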

🌐 Networking Design

Network Architecture


                Physical Network
                        │
                systemd-networkd
                        │
         ┌──────────────┼──────────────┐
         │              │              │
       Bridge          VLAN           NAT
         │              │              │
         └──────────────┼──────────────┘
                        │
               Network Namespaces

Network Isolation Strategies

Bridge Mode

  • Direct layer 2 connectivity
  • Tenant gets dedicated MAC address
  • Suitable for trusted tenants

VLAN Mode

  • 802.1Q VLAN tagging
  • Hardware-level isolation
  • Scalable to 4094 VLANs

NAT Mode

  • Private IP ranges per tenant
  • Port forwarding for services
  • Maximum isolation
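
For NAT mode, "private IP ranges per tenant" implies some allocator that carves non-overlapping subnets out of a private pool. The sketch below shows one simple approach using Python's standard `ipaddress` module; the pool, the prefix length, and the function name are assumptions.

```python
import ipaddress


def allocate_subnets(pool: str, tenants: list[str], prefix: int = 24) -> dict:
    """Assign each tenant its own subnet carved out of the pool."""
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=prefix)
    allocation = {}
    for tenant in sorted(tenants):  # sorted so the same input gives the same mapping
        try:
            allocation[tenant] = next(subnets)
        except StopIteration:
            raise ValueError(f"pool {pool} exhausted at tenant {tenant!r}") from None
    return allocation
```

Note that this mapping shifts when tenants are added or removed; a production allocator would persist assignments (for example in the tenant registry) rather than recompute them.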

Network Configuration

/etc/systemd/network/10-tenant-bridge.netdev:

[NetDev]
Name=br-tenant
Kind=bridge

[Bridge]
STP=yes
Priority=32768

/etc/systemd/network/10-tenant-bridge.network:

[Match]
Name=br-tenant

[Network]
DHCP=no
DHCPServer=yes
IPv6AcceptRA=no
IPForward=yes
IPMasquerade=yes
Address=10.0.0.1/24

[DHCPServer]
PoolOffset=100
PoolSize=100
EmitDNS=yes
DNS=10.0.0.1
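
To see which addresses this pool covers: PoolOffset= is counted from the start of the subnet and PoolSize= is the number of addresses, so the values above should yield leases from 10.0.0.100 through 10.0.0.199. The small sketch below performs that arithmetic; the function name is illustrative.

```python
import ipaddress


def dhcp_pool(address: str, offset: int, size: int):
    """Return the first and last address of the DHCP lease pool."""
    network = ipaddress.ip_interface(address).network
    first = network.network_address + offset
    last = first + size - 1
    # The pool must not reach the broadcast address of the subnet.
    if last >= network.broadcast_address:
        raise ValueError("pool does not fit inside the subnet")
    return first, last
```

Calling `dhcp_pool("10.0.0.1/24", 100, 100)` returns the pair 10.0.0.100 and 10.0.0.199, leaving 10.0.0.2 to 10.0.0.99 free for static assignments.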

💾 Storage Architecture

Storage Layers

┌─────────────────────────────────────────┐
│        Physical Storage (Block)         │
├─────────────────────────────────────────┤
│              LVM/Btrfs/ZFS              │
├─────────────────────────────────────────┤
│       Per-Tenant Logical Volumes        │
├─────────────────────────────────────────┤
│   Root FS      Data FS      Config FS   │
└─────────────────────────────────────────┘

Storage Management

Thin Provisioning

  • Over-commit storage resources
  • Copy-on-write for efficiency
  • Automatic space reclamation

Snapshots

  • Point-in-time backups
  • Instant rollback capability
  • Incremental backup support

Encryption

  • Per-tenant LUKS volumes
  • Key management via systemd-creds
  • TPM-sealed keys optional

Mount Structure

/var/lib/tenants/<tenant-name>/
├── config/                             # Git repository (read-only bind mount)
├── data/                               # Persistent data (read-write)
├── extensions/                         # sysext/confext images
│   ├── sysext/
│   └── confext/
├── services/                           # Portable service images
└── runtime/                            # Ephemeral runtime data
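
Creating this layout for a new tenant could be sketched as follows. The function name and the tenant-name check are assumptions; the directory names are taken from the tree above. Mounting (for example the read-only bind mount of config/) is left out.

```python
from pathlib import Path

SUBDIRS = (
    "config",
    "data",
    "extensions/sysext",
    "extensions/confext",
    "services",
    "runtime",
)


def create_tenant_tree(root: Path, tenant: str) -> Path:
    """Create the per-tenant directory skeleton and return its base path."""
    # Reject names that could escape the tenant root.
    if not tenant or "/" in tenant or tenant in {".", ".."}:
        raise ValueError(f"invalid tenant name: {tenant!r}")
    base = root / tenant
    for sub in SUBDIRS:
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base
```

Using `exist_ok=True` makes the call idempotent, so it can run on every boot without special-casing existing tenants.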

🔧 Extension System Design

System Extensions (sysext)

Extend /usr and /opt hierarchies:

/usr/lib/extensions/
├── base-tools.raw                      # Common tools extension
├── monitoring.raw                      # Monitoring stack
└── security-tools.raw                  # Security utilities

Creation:

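# Create sysext image
mkdir -p extension-root/usr/bin extension-root/usr/lib/extension-release.d
cp tools extension-root/usr/bin/
# systemd-sysext only merges images that carry a release file named after the
# image; extension-release.base-tools is assumed to have been prepared beforehand
cp extension-release.base-tools extension-root/usr/lib/extension-release.d/
mksquashfs extension-root base-tools.raw -all-root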

Configuration Extensions (confext)

Extend /etc hierarchy:

/usr/lib/confexts/
├── network-policies.raw                # Network configurations
├── security-policies.raw               # Security settings
└── monitoring-config.raw               # Monitoring configs

Extension Metadata

/usr/lib/extension-release.d/extension-release.monitoring:

ID=bitbuilder-monitoring
VERSION_ID=1.0
SYSEXT_LEVEL=1.0
ARCHITECTURE=x86-64

Portable Services

Self-contained service bundles:

portable-service.raw/
├── usr/
│   ├── bin/
│   │   └── service-binary
│   └── lib/
│       └── systemd/
│           └── system/
│               └── service.service
└── etc/
    └── service/
        └── config.conf

📚 Git-Ops Workflow

Repository Structure

System Repository

bitbuilder-system/
├── .gitops/
│   └── config.yaml                     # Git-ops configuration
├── generators/                         # Custom systemd generators
├── units/                              # Systemd service units
├── network/                            # Network configurations
├── extensions/                         # System-wide extensions
└── tenants/                            # Tenant registry
    └── registry.json                   # Active tenants list

Tenant Repository

tenant-<name>/
├── .gitops/
│   └── config.yaml                     # Tenant git-ops config
├── metadata.json                       # Tenant metadata
├── network/                            # Network configurations
├── services/                           # Service definitions
├── units/                              # Custom systemd units
├── extensions/                         # Tenant extensions
├── scripts/                            # Lifecycle scripts
└── secrets/                            # Encrypted secrets
    └── sealed-secrets.yaml

Continuous Deployment

      Git Push
         │
         ▼
      Webhook
         │
         ▼
  Validation Stage
         │
         ▼
    Pull Changes
         │
         ▼
   Apply Changes
         │
         ▼
    Verify State

Rollback Mechanism

  1. Automatic rollback on failure
  2. Git revert for configuration rollback
  3. Snapshot restore for data rollback
  4. A/B boot partitions for OS rollback
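
The deployment stages and the automatic rollback can be tied together as in the sketch below: stages run in order, and the first failure triggers a rollback to the previously deployed commit. The stage callables and the rollback hook are placeholders supplied by the caller; nothing here is prescribed by the design.

```python
from typing import Callable

Stage = Callable[[str], bool]  # receives a commit id, returns True on success


def deploy(commit: str, previous: str, stages: dict[str, Stage],
           rollback: Callable[[str], None]) -> tuple[bool, str]:
    """Run stages in insertion order; on the first failure roll back.

    Returns (True, "") on success or (False, <failed stage name>).
    """
    for name, stage in stages.items():
        if not stage(commit):
            rollback(previous)
            return False, name
    return True, ""
```

A caller would pass the stages as a dict with keys such as "validate", "pull", "apply", and "verify", and a rollback hook that, for configuration changes, might run `git revert` or check out the previous commit.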

⚡ Performance Considerations

Resource Management

CPU Scheduling

  • CFS bandwidth control
  • CPU shares per tenant
  • NUMA awareness

Memory Management

  • Memory limits via cgroups
  • Swap accounting
  • OOM killer configuration

I/O Management

  • Block I/O weight
  • Bandwidth throttling
  • I/O scheduling classes
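
These controls correspond to systemd resource-control properties such as CPUQuota=, CPUWeight=, MemoryMax=, and MemorySwapMax=. The sketch below translates the `resources` section of metadata.json into such properties. The translation rules are assumptions: each core is treated as 100% of CPU quota, and 1024 shares are scaled to the default CPUWeight of 100.

```python
def resource_properties(resources: dict) -> dict[str, str]:
    """Map tenant resource settings onto systemd resource-control properties."""
    cpu = resources["cpu"]
    memory = resources["memory"]
    # Assumed scaling: 1024 shares correspond to the default weight of 100;
    # CPUWeight= accepts values from 1 to 10000.
    weight = max(1, min(10000, round(cpu["shares"] * 100 / 1024)))
    return {
        "CPUQuota": f"{cpu['cores'] * 100}%",
        "CPUWeight": str(weight),
        "MemoryMax": memory["limit"],
        "MemorySwapMax": memory["swap"],
    }


def set_property_command(unit: str, properties: dict[str, str]) -> list[str]:
    """Build a systemctl invocation that applies the properties to a unit."""
    return ["systemctl", "set-property", unit] + [
        f"{key}={value}" for key, value in properties.items()
    ]
```

For the example metadata (4 cores, 1024 shares, 4G memory, 2G swap) this yields CPUQuota=400%, CPUWeight=100, MemoryMax=4G, and MemorySwapMax=2G.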

Optimization Strategies

  1. Lazy Loading

    • On-demand tenant activation
    • Delayed extension mounting
    • Just-in-time compilation
  2. Caching

    • Git repository caching
    • Extension image caching
    • Network configuration caching
  3. Parallel Processing

    • Concurrent tenant provisioning
    • Parallel Git operations
    • Async service startup
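
Concurrent tenant provisioning can be sketched with a bounded thread pool, so that one slow or failing tenant does not block the others. The `provision` callable and the worker count are placeholders; threads are a reasonable fit here on the assumption that provisioning is dominated by I/O (Git operations, image mounting).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional


def provision_all(tenants: list[str], provision: Callable[[str], None],
                  max_workers: int = 4) -> dict[str, Optional[Exception]]:
    """Provision tenants concurrently; map each tenant to None or its error."""
    results: dict[str, Optional[Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(provision, tenant): tenant for tenant in tenants}
        for future in as_completed(futures):
            tenant = futures[future]
            try:
                future.result()
                results[tenant] = None
            except Exception as exc:  # isolate failures per tenant
                results[tenant] = exc
    return results
```

Capping `max_workers` keeps a host with many tenants from saturating disk and network during boot.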

🆘 Failure Recovery

Failure Detection

Health Check Loop



  Monitor Services
         │
         ▼
      Failure?
         │ Yes
         ▼
  Recovery Action

Recovery Strategies

Service Level

  1. Automatic restart (systemd)
  2. Exponential backoff
  3. Circuit breaker pattern
  4. Fallback to previous version
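
Exponential backoff and a circuit breaker can be combined as in the sketch below. The base delay, growth factor, cap, and failure threshold are illustrative parameters, and the class is a deliberately minimal breaker that counts consecutive failures and has no half-open state.

```python
def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> list[float]:
    """Delay before each restart attempt, growing geometrically up to a cap."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


class CircuitBreaker:
    """Stops further restarts after too many consecutive failures."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        # An open breaker means: stop retrying and escalate instead.
        return self.failures >= self.threshold

    def record(self, success: bool) -> None:
        # Any success resets the count of consecutive failures.
        self.failures = 0 if success else self.failures + 1
```

With a base of 10 seconds, a factor of 2, and a cap of 60 seconds, four attempts wait 10, 20, 40, and 60 seconds. Once the breaker opens, the next step in the list above (falling back to the previous version) would take over.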

Tenant Level

  1. VM/Container restart
  2. Re-provision from Git
  3. Restore from snapshot
  4. Evacuate to different host

Host Level

  1. Watchdog timer reset
  2. Kernel panic recovery
  3. Boot to previous image
  4. Emergency maintenance mode

Disaster Recovery

  1. Regular Backups

    • Git repositories (inherent)
    • Tenant data snapshots
    • Configuration exports
  2. Replication

    • Multi-region Git mirrors
    • Cross-site tenant replication
    • Distributed storage backends
  3. Recovery Procedures

    • Documented runbooks
    • Automated recovery scripts
    • Regular DR testing

Conclusion

The BitBuilder Hypervisor design moves hypervisor management toward immutable, declarative infrastructure. By building on systemd's virtualization and extension tooling together with git-ops principles, the system aims for strong tenant isolation, reproducible state, and simpler operations while keeping the flexibility that multi-tenant environments require.