[RFC]: Flatcar/Afterburn and azure-init Integration Proposal #1242

@peytonr18

Hi FedoraOS and coreos-afterburn community,

As part of the ongoing collaboration between our teams to integrate Flatcar and azure-init, @pothos suggested sharing the proposal document that @cadejacobson and I have been working on and asking for your feedback.

This document provides a high-level overview of the Azure provisioning process, including how WALinuxAgent, azure-init, and Afterburn currently interact with Flatcar. It then outlines two potential paths forward for integration.

The attached Markdown document outlines:

  • Background on Azure provisioning flows (ovf-env.xml, IMDS).
  • Current behavior of WALinuxAgent on Flatcar.
  • Capabilities and benefits of azure-init and libazureinit.
  • Overview of Afterburn’s current support and gaps.
  • The two proposed integration paths mentioned above.

We’re seeking feedback on:

  • Feasibility and alignment with Afterburn’s design goals.
  • Preferred integration approach.
  • Any concerns or suggestions from the community.
  • Potential implications for other distributions or use cases.

Please let us know your thoughts, concerns, or ideas. We’d love to collaborate on shaping this integration in a way that benefits the broader ecosystem. Thanks in advance for your time and input!

flatcar-afterburn-azureinit-integration.md.md


[RFC] Flatcar / Afterburn and Azure-init Integration Proposal

Overview

WALinuxAgent is an Azure-specific agent that mediates between Linux VMs and the Azure Fabric to provision VMs. It ships as a single binary that includes both the provisioning agent (PA) and the guest agent (GA). Because the PA is tightly coupled to the GA, evolving the PA to meet current provisioning needs is difficult. WALinuxAgent's PA has been deprecated and replaced by azure-init.

While most endorsed distros use cloud-init for provisioning, some—including Flatcar—still rely on WALinuxAgent PA.

Background: Provisioning on Azure

Azure attaches UDF-formatted media to the VM during provisioning. The media carries the provisioning configuration in ovf-env.xml.

Configuration: ovf-env.xml

| CRP Configuration | ovf-env.xml Schema Mapping |
| --- | --- |
| osProfile.adminPassword | ProvisioningSection.LinuxProvisioningConfigurationSet.UserPassword |
| osProfile.linuxConfiguration.ssh.publicKeys | ProvisioningSection.LinuxProvisioningConfigurationSet.SSH.PublicKeys.PublicKey |
| osProfile.computerName | ProvisioningSection.LinuxProvisioningConfigurationSet.HostName |
| osProfile.linuxConfiguration.disablePasswordAuthentication | ProvisioningSection.LinuxProvisioningConfigurationSet.DisableSshPasswordAuthentication |
| osProfile.customData | ProvisioningSection.LinuxProvisioningConfigurationSet.CustomData |
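As a rough illustration of the mapping above, the following sketch pulls the mapped fields out of an ovf-env.xml document. The sample XML, its namespaces, and the helper names (`local_name`, `parse_ovf_env`) are assumptions made for this example; only the element names come from the table. Matching on local element names keeps the sketch independent of the exact namespace URIs.

```python
import xml.etree.ElementTree as ET

# Illustrative document only: element names follow the mapping table,
# but the envelope and namespace URIs are assumptions for this sketch.
SAMPLE_OVF_ENV = """\
<Environment xmlns="http://schemas.dmtf.org/ovf/environment/1"
             xmlns:wa="http://schemas.microsoft.com/windowsazure">
  <wa:ProvisioningSection>
    <wa:LinuxProvisioningConfigurationSet>
      <wa:HostName>demo-vm</wa:HostName>
      <wa:DisableSshPasswordAuthentication>true</wa:DisableSshPasswordAuthentication>
      <wa:SSH>
        <wa:PublicKeys>
          <wa:PublicKey>
            <wa:Path>/home/azureuser/.ssh/authorized_keys</wa:Path>
            <wa:Value>ssh-rsa AAAAexample</wa:Value>
          </wa:PublicKey>
        </wa:PublicKeys>
      </wa:SSH>
    </wa:LinuxProvisioningConfigurationSet>
  </wa:ProvisioningSection>
</Environment>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(parent, name):
    """Return the stripped text of the first direct child called `name`, or None."""
    for el in parent:
        if local_name(el.tag) == name:
            return (el.text or "").strip()
    return None


def parse_ovf_env(xml_text):
    """Extract the provisioning fields listed in the mapping table."""
    root = ET.fromstring(xml_text)
    config_set = next(
        (el for el in root.iter()
         if local_name(el.tag) == "LinuxProvisioningConfigurationSet"),
        None,
    )
    if config_set is None:
        raise ValueError("no LinuxProvisioningConfigurationSet found")

    # Each PublicKey element yields one (path, key) pair.
    keys = [
        (child_text(pk, "Path"), child_text(pk, "Value"))
        for pk in config_set.iter()
        if local_name(pk.tag) == "PublicKey"
    ]
    return {
        "hostname": child_text(config_set, "HostName"),
        "password": child_text(config_set, "UserPassword"),
        "disable_password_auth":
            (child_text(config_set, "DisableSshPasswordAuthentication") or "").lower() == "true",
        "custom_data": child_text(config_set, "CustomData"),
        "public_keys": keys,
    }
```

Calling `parse_ovf_env(SAMPLE_OVF_ENV)` returns a dictionary with the hostname, the password (absent here, so `None`), the password-authentication flag, and the list of key/path pairs.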

Configuration: Azure’s Instance Metadata Service (IMDS)

| CRP Configuration | IMDS Schema Mapping |
| --- | --- |
| osProfile.adminUsername | Compute.osProfile.adminUsername |
| osProfile.computerName | Compute.osProfile.computerName |
| osProfile.linuxConfiguration.disablePasswordAuthentication | Compute.osProfile.disablePasswordAuthentication |
| osProfile.linuxConfiguration.ssh.publicKeys | Compute.publicKeys |
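To make the IMDS mapping concrete, here is a minimal sketch that prepares an IMDS request and extracts the mapped fields from a response. The `api-version` value, the lowercase `compute` key, the `keyData`/`path` field names, and the sample JSON are assumptions for this example and should be checked against the IMDS documentation. The request is only constructed, not sent, so the sketch runs outside Azure.

```python
import json
import urllib.request

# Assumption: any supported api-version would do; this one is illustrative.
IMDS_URL = "http://169.254.169.254/metadata/instance?api-version=2021-02-01"

# Illustrative response fragment, not captured from a real VM.
SAMPLE_IMDS_JSON = """
{
  "compute": {
    "osProfile": {
      "adminUsername": "azureuser",
      "computerName": "demo-vm",
      "disablePasswordAuthentication": "true"
    },
    "publicKeys": [
      {"keyData": "ssh-rsa AAAAexample", "path": "/home/azureuser/.ssh/authorized_keys"}
    ]
  }
}
"""


def build_imds_request(url=IMDS_URL):
    """Build (but do not send) a request; IMDS expects the 'Metadata: true' header."""
    return urllib.request.Request(url, headers={"Metadata": "true"})


def extract_provisioning_fields(imds_doc):
    """Pick out the fields from the mapping table; missing keys become None/empty."""
    compute = imds_doc.get("compute", {})
    os_profile = compute.get("osProfile", {})
    flag = str(os_profile.get("disablePasswordAuthentication", "")).lower()
    return {
        "admin_username": os_profile.get("adminUsername"),
        "computer_name": os_profile.get("computerName"),
        "disable_password_auth": flag == "true",
        "public_keys": [
            (key.get("path"), key.get("keyData"))
            for key in compute.get("publicKeys", [])
        ],
    }
```

On a VM, the request could be sent with `urllib.request.urlopen(build_imds_request(), timeout=5)` and the body passed through `json.loads` before extraction.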

Background: WALinuxAgent on Flatcar

Configuration is found in /usr/share/waagent/waagent.conf.

Configuration: waagent.conf

| waagent.conf Variable | Description | Default Value |
| --- | --- | --- |
| DVD.MountPoint | DVD Mount Point | /mnt/cdrom/secure/ |
| ResourceDisk.MountPoint | Resource Disk Mount Point | /mnt/resource/ |
| Lib.Dir | Library Directory | /var/lib/waagent/ |
| OS.SshDir | SSH Directory | /etc/ssh/ |
| ResourceDisk.Filesystem | Resource Disk File System | ext4 |
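For readers who want to inspect these settings programmatically, the sketch below reads `Key=Value` lines from a waagent.conf-style file. The parser (`parse_waagent_conf`) and the sample excerpt are illustrative assumptions; comment handling in particular is simplified and may differ from WALinuxAgent's own loader.

```python
# Illustrative excerpt; the real file lives at /usr/share/waagent/waagent.conf.
SAMPLE_CONF = """\
# Provisioning is handled elsewhere in this example.
Provisioning.Agent=disabled
DVD.MountPoint=/mnt/cdrom/secure
ResourceDisk.Format=y
ResourceDisk.Filesystem=ext4
ResourceDisk.MountPoint=/mnt/resource
"""


def parse_waagent_conf(text):
    """Return a dict of settings, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def resource_disk_enabled(settings):
    """True when ResourceDisk.Format is set to 'y'."""
    return settings.get("ResourceDisk.Format", "n").lower() == "y"
```

With the sample above, `resource_disk_enabled(parse_waagent_conf(SAMPLE_CONF))` is true even though `Provisioning.Agent` is `disabled`, which mirrors the independence of the two settings.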

OVF-ENV

  • Reads ovf-env.xml from DVD Mount Point.
  • Writes modified ovf-env.xml to Library Directory.
  • Unmounts and ejects CDROM.
  • Returns unmodified version to PA for configuration.

The locally stored ovf-env.xml file is then available to the GA later as a read-only file.

Hostname

  • Reads hostname from ovf-env.xml.
  • Writes to /etc/hostname.
  • Publishes to published_hostname in Library Directory.

User Creation

WALinuxAgent creates an admin user, with an optional password and SSH keys, as configured in the osProfile when the VM is created. The user is created via the useradd command, which also creates a personal group of the same name for the user.

Password

If a password is supplied, the Provisioning Agent (PA):

  • Generates a password hash from the plain‑text password, crypt id, and salt length via a local hashing function using the crypt library.
  • Sets the resulting hash as the user's password with:
    usermod -p {passwd_hash} {username}

If no password is supplied, the PA:

  • Locks the user with:
    passwd -l {username}
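The two branches above can be sketched as a small helper that chooses the command to run. The function names and the injectable `run` parameter are assumptions for illustration; hashing the plain-text password is deliberately left out, so the helper expects an already computed hash.

```python
import subprocess


def password_argv(username, password_hash=None):
    """Return the argv that matches the branch described above."""
    if password_hash:
        # A hash was derived from the supplied password: set it directly.
        return ["usermod", "-p", password_hash, username]
    # No password supplied: lock the account's password.
    return ["passwd", "-l", username]


def apply_password(username, password_hash=None, run=subprocess.run):
    """Run the chosen command; `run` is injectable so the logic can be tested."""
    argv = password_argv(username, password_hash)
    run(argv, check=True)
    return argv
```

Passing the arguments as a list avoids shell interpolation of the hash, which typically contains `$` characters.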

SSHD Configuration

WALinuxAgent configures SSHD for the most recent stable Flatcar via:

  • File name: 80-flatcar-walinuxagent.conf
  • Location: /etc/ssh/sshd_config.d/80-flatcar-walinuxagent.conf

This file contains configurations related to password authentication.

If no password is specified for the user, the PA sets:

PasswordAuthentication no
ChallengeResponseAuthentication no

Otherwise, if a password is specified, the PA sets:

PasswordAuthentication yes
ChallengeResponseAuthentication yes
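A minimal sketch of how the drop-in contents could be derived from whether a password was supplied. The function name is hypothetical; the path and directives are the ones listed above.

```python
# Path taken from the description above.
SSHD_DROPIN_PATH = "/etc/ssh/sshd_config.d/80-flatcar-walinuxagent.conf"


def render_sshd_dropin(password_supplied):
    """Return the drop-in text: both directives follow the password decision."""
    value = "yes" if password_supplied else "no"
    return (
        f"PasswordAuthentication {value}\n"
        f"ChallengeResponseAuthentication {value}\n"
    )
```

Keeping rendering separate from writing makes it easy to compare the expected contents against what is on disk at `SSHD_DROPIN_PATH`.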

SSH Keys

SSH public keys and their paths are read from ovf-env (XML). They are returned in pairs of a key and a path. The PA writes each key to its respective path.

Example OVF section:

<PublicKey>
  <Path>/home/azureuser/.ssh/authorized_keys</Path>
  <Value>ssh-rsa {key data}</Value>
</PublicKey>

The PA would then write ssh-rsa {key data} to the file:

/home/azureuser/.ssh/authorized_keys
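The key-writing step can be sketched as follows. This is an illustration under assumptions, not WALinuxAgent's actual code: the `root` parameter exists only so the function can be exercised against a scratch directory, keys are appended so that several keys sharing one path are all kept, and ownership changes (chown to the admin user) are omitted.

```python
import os
from pathlib import Path


def write_public_keys(pairs, root="/"):
    """Write each (path, key) pair to its file under `root`; return the files."""
    written = []
    for path, value in pairs:
        target = Path(root) / path.lstrip("/")
        # mode applies to the final directory only; parents use the default.
        target.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
        with open(target, "a", encoding="utf-8") as handle:
            handle.write(value.rstrip("\n") + "\n")
        os.chmod(target, 0o600)  # keep the key file private to its owner
        written.append(target)
    return written
```

For example, `write_public_keys([("/home/azureuser/.ssh/authorized_keys", "ssh-rsa {key data}")], root="/tmp/scratch")` would create the file below the scratch directory instead of the real home directory.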

SSH ClientAliveInterval:
The SSH Client Alive Interval in 80-flatcar-walinuxagent.conf is set via the OS.SshClientAliveInterval variable in the waagent.conf file.

Sudoers File

If an admin user password is supplied, the PA writes {username} ALL=(ALL) ALL to /etc/sudoers.d/waagent.

Otherwise, if no password is supplied, the PA writes {username} ALL=(ALL) NOPASSWD: ALL to /etc/sudoers.d/waagent.
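The sudoers rule selection can be sketched like this. The function name and the username check are assumptions added for the example; the check is a conservative guard against writing a malformed sudoers line and is not described in the source.

```python
import re

SUDOERS_PATH = "/etc/sudoers.d/waagent"

# Assumption: restrict usernames to a conservative pattern for this sketch.
_USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]*")


def render_sudoers_entry(username, password_supplied):
    """Return the sudoers line for the admin user."""
    if not _USERNAME_RE.fullmatch(username):
        raise ValueError(f"refusing to write sudoers entry for {username!r}")
    # With a password, sudo prompts for it; without one, NOPASSWD is required.
    rule = "ALL=(ALL) ALL" if password_supplied else "ALL=(ALL) NOPASSWD: ALL"
    return f"{username} {rule}\n"
```

A line produced this way would normally be validated with `visudo -c -f` on a temporary file before being installed at `SUDOERS_PATH`.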

Resource Disk Formatting

When ResourceDisk.Format=y is set in waagent.conf, WALinuxAgent performs:

  1. Locate the resource disk
    • Finds the SCSI resource disk (udev-mapped as /dev/disk/azure/resource) via SCSI controller IDs and LUN (logical unit number).

  2. Create/normalize the partition table
    • If there is no valid partition or if the first partition has a Windows signature, WALinuxAgent will:
      • Set the partition type to Linux 0x83 (e.g., sfdisk --part-type /dev/sdb 1 83).
      • Ensure there is a single partition on the disk.

  3. Format the partition
    • Filesystem type is specified by ResourceDisk.Filesystem (default: ext4).

  4. Mount the resource disk
    • Controlled by ResourceDisk.MountPoint in waagent.conf (default: /mnt/resource).

  5. (Optional) Swap file management
    • If enabled, WALinuxAgent creates /mnt/resource/swapfile and activates it.

Note: Disabling the WALinuxAgent for provisioning (via Provisioning.Agent=disabled) does not disable WALinuxAgent’s resource-disk logic.

Reporting Ready

WALinuxAgent reports one of three provisioning states: in progress, ready, or failure.

Report In Progress:

  • When provisioning begins, it reports a "provisioning in progress/transitioning" status to the Azure Fabric via Wireserver.

Report Ready:

  • After successful provisioning, the agent POSTs a Ready report to Wireserver that includes:
    • GoalState incarnation
    • ContainerID
    • InstanceID

Report Failure:

  • Provisioning failure is reported to Wireserver only in certain circumstances.
  • Example: OVF/DVD problems (e.g., no media on /dev/sr0) are reported back to the Azure control plane.
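To show how the three identifiers in a Ready report fit together, here is a sketch that assembles a health document. The element names and nesting are assumptions for illustration and have not been verified against the Wireserver schema; the endpoint, headers, and protocol version are likewise out of scope here.

```python
import xml.etree.ElementTree as ET


def build_health_report(incarnation, container_id, instance_id, state="Ready"):
    """Assemble a health report carrying the three identifiers listed above.

    The XML layout is hypothetical; only the presence of the GoalState
    incarnation, ContainerID, and InstanceID comes from the description.
    """
    health = ET.Element("Health")
    ET.SubElement(health, "GoalStateIncarnation").text = str(incarnation)
    container = ET.SubElement(health, "Container")
    ET.SubElement(container, "ContainerId").text = container_id
    role_list = ET.SubElement(container, "RoleInstanceList")
    role = ET.SubElement(role_list, "Role")
    ET.SubElement(role, "InstanceId").text = instance_id
    role_health = ET.SubElement(role, "Health")
    ET.SubElement(role_health, "State").text = state
    return ET.tostring(health, encoding="unicode")
```

The incarnation, container ID, and instance ID would be read from the current goal state before the report is built and posted.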

Background: azure-init

azure-init is a lightweight, Rust-based provisioning agent for Linux VMs on Azure. It is currently under active development as a minimal replacement for WALinuxAgent. The service handles initial instance configuration of:

  • Fetching metadata from IMDS
  • Setting the hostname
  • Creating an admin user
  • Setting SSH public keys
  • Reporting ready to the Wireserver
  • KVP telemetry of the early boot process

libazureinit

libazureinit is a Rust library crate that encapsulates the core provisioning logic used by azure-init and exposes it through public APIs, making it usable by third-party tools. Other Rust projects, such as Afterburn, can depend on the crate and call the provisioning logic to configure the host machine.

Benefits Over Using WALinuxAgent:

Telemetry

azure-init emits basic provisioning telemetry via KVP. When provisioning fails, for example because of DHCP or networking problems or a provisioning timeout, this telemetry lets the user diagnose the cause directly from the system. It depends on hyperv-kvp-daemon.service.

Active Development

Unlike the WALinuxAgent provisioning agent, azure-init is under active development and maintenance. Because the WALinuxAgent PA is deprecated, no new features are planned for it, and future changes to the provisioning protocol may not be supported by WALinuxAgent.

Background: Afterburn Support for Azure

Afterburn was originally named coreos-metadata.

What It Does Today

Fetch Metadata

Afterburn has the ability to fetch metadata from IMDS via the endpoint with IPv4 address 169.254.169.254. This endpoint is used to retrieve the hostname and VM size.

Hostname

Afterburn in Flatcar on Azure sets the hostname by:

  • Making a GET request to IMDS for just the hostname.
  • Writing the hostname to the file specified in the CLI argument --hostname (for Flatcar, this is --hostname=/sysroot/etc/hostname)

However, this hostname does not persist, because WALinuxAgent later overwrites it with its own.

What It Does Not Do Today (but Is Capable Of)

SSH Keys

Afterburn attempts to retrieve SSH keys via the Azure goalstate endpoint which returns certificates. These certificates can then be converted back to the OpenSSH format before writing the keys to the file:

~/.ssh/authorized_keys.d/afterburn

However, this channel only supports VMs created with x509 certificates, which is uncommon today.

Check-In

Afterburn supports a CLI flag (--check-in) that can be used to check in the instance boot with Azure. Afterburn can build an Azure Ready XML document and POST it to the Fabric health endpoint. It builds this XML by probing /?comp=versions and reading the current goalstate to pick a compatible x-ms-version.

No mechanism exists today in Afterburn for reporting InProgress status or reporting failures.

Design: Afterburn vs. Azure-init

Universal Design Steps

Both options will set Provisioning.Agent=disabled in waagent.conf to prevent WALinuxAgent from creating the admin user, installing SSH keys, and setting the hostname.

Option (1): replace WALinuxAgent’s provisioning agent with azure-init.service

This approach introduces azure-init.service to own provisioning and keeps walinuxagent.service enabled only for the GA. Only the PA portion of WALinuxAgent is disabled; the GA is left intact.

Key Steps

  • Install and enable the azure-init package and azure-init.service by default.
  • Disable walinuxagent PA.

What this means
azure-init will handle:

  • Provisioning (user creation/SSH/hostname/metadata).
  • Health reporting (report ready, report failure).

Resource-disk logic is not part of this list and will need to be delegated to Ignition or azure-vm-utils (azure-ephemeral-disk-setup.service).

Advantages Over Option 2
This removes the deprecated WALinuxAgent PA service in favor of its replacement service.

Option (2): Replace WALinuxAgent PA with Afterburn

This option proposes replacing the provisioning agent (PA) functionality of WALinuxAgent with an extended version of Afterburn that fills in any functionality gaps between itself and the WALinuxAgent.

The waagent.service remains for extensions/guest-agent features, while Afterburn assumes provisioning and health reporting responsibilities. Afterburn will be responsible for SSH key retrieval and installation (IMDS-based), admin user creation, hostname configuration, SSHD defaults, KVP telemetry, and structured success/failure reporting.

Key Steps
Extend Afterburn to handle:

  • Provisioning (SSH keys, metadata, hostname).
  • Health reporting (ready/failure posts).
  • Optional: user creation (otherwise this can be delegated to another program, potentially Ignition).

Add updated Afterburn service files to Flatcar with the new arguments.

CLI Integration
Expanding on Afterburn’s CLI-enabled modules, a suite of provisioning features would be added, with opt-out flexibility for users. All CLI-enabled modules will be enabled by default.

Proposed flags

  • --add-user: create default user accounts and configure groups.
  • --configure-sshd: set up SSH daemon configuration, including authorized keys.
  • --enable-sudoers: add a user to the sudoers file with appropriate permissions.
  • --check-in (already exists in Afterburn): continue support for readiness signaling, and expand to support failure reporting as well.

Example afterburn invocation
afterburn --arg-name=arg

Implementation Approach

  • (Recommended) Afterburn Invoking libazureinit: Afterburn calls libazureinit’s public APIs for provisioning and health reporting.

  • (Alternative) Native Afterburn Implementation: Implement the same behaviors directly in Afterburn without libazureinit.

Design: Ephemeral Disk Setup

Ephemeral Disk Setup & Resource Disk Formatting

Azure VMs typically include a temporary SCSI resource disk (/dev/disk/azure/resource) or local NVMe disks (/dev/disk/azure/local/by-name/<name>).

In Flatcar, WALinuxAgent automatically partitions, formats, and mounts the SCSI resource disk at /mnt/resource. It does nothing for NVMe local disks.

Resource disk mounting and partitioning can be handled by:

  • WALinuxAgent (via ResourceDisk.Format=y), though this is slightly counterintuitive because the resource-disk logic keeps running even when WALinuxAgent provisioning is disabled. Supports SCSI resource disks. No NVMe support.
  • Ignition-based config (tbd). This would involve a one-shot systemd unit that partitions and formats the resource disk on first boot and mounts it on every boot after that.
  • azure-vm-utils's azure-ephemeral-disk-setup.service (azure-vm-utils/ephemeral-disk-setup/azure-ephemeral-disk-setup at main). Supports SCSI and NVMe disks. Supports aggregation of multiple local NVMe disks.
