feat(kvp): add store trait and Hyper-V KVP storage crate #288

Open
peytonr18 wants to merge 3 commits into Azure:main from peytonr18:probertson/kvp-store-trait

Conversation

peytonr18 (Contributor) commented Mar 9, 2026

Summary

Introduces libazureinit-kvp, a standalone workspace crate that defines the KvpStore storage trait and provides a production implementation backed by the Hyper-V binary pool file format.

This is the foundational layer (Layer 0) for the KVP architecture rework. It establishes the storage abstraction that higher layers (diagnostics, tracing, provisioning reports) will build on in follow-up PRs.

  • Adds new workspace crate libazureinit-kvp with KvpStore trait and HyperVKvpStore implementation.
  • Defines KvpLimits as a trait-level requirement so consumers and alternative implementations can query size constraints generically.
  • Rewrites doc/libazurekvp.md as a technical reference for the new crate.
  • No consumers wired yet — libazureinit still uses the existing kvp.rs module. Integration happens in follow-up PRs.

Changes

KvpStore trait

Storage abstraction with explicit semantics:

  • limits() — returns KvpLimits for the store (required by the trait so consumers can chunk/validate generically)
  • write(key, value) — validates against limits, then stores
  • read(key) — last-write-wins for duplicate keys
  • entries() — deduplicated HashMap view
  • delete(key) — removes all records for a key
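The surface described by the bullets above could be sketched roughly as follows. This is an illustrative sketch, not the crate's actual code; the `MemStore` type is a hypothetical in-memory implementation added here only so the semantics (validation, last-write-wins, deduplication) are concrete:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::io;

/// Size constraints a store must declare (sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct KvpLimits {
    pub max_key_size: usize,
    pub max_value_size: usize,
}

/// Storage abstraction, following the bullets in the PR summary (sketch).
pub trait KvpStore {
    /// Constraints for this store, so consumers can chunk/validate generically.
    fn limits(&self) -> KvpLimits;
    /// Validate against `limits()`, then store.
    fn write(&self, key: &str, value: &str) -> io::Result<()>;
    /// Last-write-wins for duplicate keys; `None` if absent.
    fn read(&self, key: &str) -> io::Result<Option<String>>;
    /// Deduplicated view (last write wins per key).
    fn entries(&self) -> io::Result<HashMap<String, String>>;
    /// Remove all records for `key`.
    fn delete(&self, key: &str) -> io::Result<()>;
}

/// Hypothetical in-memory implementation, for illustration only.
pub struct MemStore {
    records: RefCell<Vec<(String, String)>>,
    limits: KvpLimits,
}

impl MemStore {
    pub fn new(limits: KvpLimits) -> Self {
        Self { records: RefCell::new(Vec::new()), limits }
    }
}

impl KvpStore for MemStore {
    fn limits(&self) -> KvpLimits {
        self.limits
    }

    fn write(&self, key: &str, value: &str) -> io::Result<()> {
        // Mirror the stated validation rules: no empty or oversized fields.
        if key.is_empty()
            || key.len() > self.limits.max_key_size
            || value.len() > self.limits.max_value_size
        {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "limit violation"));
        }
        self.records.borrow_mut().push((key.to_owned(), value.to_owned()));
        Ok(())
    }

    fn read(&self, key: &str) -> io::Result<Option<String>> {
        // Last-write-wins: scan from the end of the append-only log.
        Ok(self
            .records
            .borrow()
            .iter()
            .rev()
            .find(|(k, _)| k == key)
            .map(|(_, v)| v.clone()))
    }

    fn entries(&self) -> io::Result<HashMap<String, String>> {
        // Collecting in write order means later records overwrite earlier ones.
        Ok(self.records.borrow().iter().cloned().collect())
    }

    fn delete(&self, key: &str) -> io::Result<()> {
        self.records.borrow_mut().retain(|(k, _)| k != key);
        Ok(())
    }
}
```

An in-memory implementation like this is also the kind of thing a test suite could use in place of the pool-file backend.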

HyperVKvpStore

Production implementation backed by the Hyper-V binary pool file:

  • Fixed-size 2,560-byte records (512-byte key + 2,048-byte value, zero-padded)
  • Append-only writes with flock-based concurrency (exclusive for writes, shared for reads)
  • truncate_if_stale() clears records from previous boots by comparing file mtime to boot time
  • Write validation rejects empty keys, oversized keys, and oversized values (no silent truncation)
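The fixed-size record layout can be illustrated with a small encode/decode sketch (hypothetical helper names, not the crate's actual functions; the crate's real decoder and error handling differ, as the review comments below discuss):

```rust
const MAX_KEY: usize = 512;
const MAX_VALUE: usize = 2048;
const RECORD_SIZE: usize = MAX_KEY + MAX_VALUE; // 2,560 bytes per record

/// Encode one key/value pair into a fixed-size, zero-padded record.
/// Returns `None` on validation failure (empty key, oversized fields),
/// mirroring the "no silent truncation" rule.
fn encode_record(key: &str, value: &str) -> Option<[u8; RECORD_SIZE]> {
    if key.is_empty() || key.len() > MAX_KEY || value.len() > MAX_VALUE {
        return None;
    }
    let mut record = [0u8; RECORD_SIZE];
    record[..key.len()].copy_from_slice(key.as_bytes());
    record[MAX_KEY..MAX_KEY + value.len()].copy_from_slice(value.as_bytes());
    Some(record)
}

/// Decode a record, trimming the zero padding back off.
fn decode_record(data: &[u8; RECORD_SIZE]) -> (String, String) {
    let trim = |b: &[u8]| String::from_utf8_lossy(b).trim_end_matches('\0').to_string();
    (trim(&data[..MAX_KEY]), trim(&data[MAX_KEY..]))
}
```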

KvpLimits

Earlier PR feedback asked for limits to be required at the trait level rather than defined only on the concrete type. This ensures:

  • Any KvpStore implementation must declare its constraints.
  • Generic consumer code (fn foo<S: KvpStore>(store: &S)) can call store.limits().max_value_size for chunking decisions without hardcoding constants or knowing the concrete type.

Exported as part of the trait contract with two presets:

  • KvpLimits::hyperv() — 512-byte key / 2,048-byte value (raw Hyper-V wire format)
  • KvpLimits::azure() — 512-byte key / 1,022-byte value (UTF-16: 511 chars + null terminator; matches Azure host read limit)

Higher layers are responsible for chunking oversized payloads before calling the store.
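A higher layer's chunking step might look like the following sketch. `chunk_value` is a hypothetical helper, not part of this crate; it splits an oversized payload by a store's `max_value_size` without breaking UTF-8 code points (assumes `max_value_size` is at least 4 bytes, the largest UTF-8 code point):

```rust
/// Split a payload into chunks no larger than `max_value_size` bytes,
/// backing up to a char boundary so no code point is split.
fn chunk_value(value: &str, max_value_size: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut rest = value;
    while !rest.is_empty() {
        let mut end = max_value_size.min(rest.len());
        // Byte index must land on a char boundary before slicing.
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        chunks.push(rest[..end].to_string());
        rest = &rest[end..];
    }
    chunks
}
```

A diagnostics layer would call this with `store.limits().max_value_size` (1,022 for the Azure preset) and write each chunk under a derived key.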

Next Steps

Follow-up PRs will add higher layers on top of this trait (Connected to #287):

  1. Diagnostics layer (typed events, value splitting at the 1,022-byte Azure limit)
  2. Tracing layer (tracing_subscriber::Layer adapter)
  3. Provisioning report accessor
  4. Integration into libazureinit/src/logging.rs to replace the existing kvp.rs module

peytonr18 force-pushed the probertson/kvp-store-trait branch from b134a98 to f435f70 on March 9, 2026 19:19
```rust
pub const HYPERV_MAX_VALUE_BYTES: usize = 2048;
/// Azure key limit in bytes (policy/default preset).
pub const AZURE_MAX_KEY_BYTES: usize = 512;
/// Azure value limit in bytes (UTF-16: 511 characters + null terminator).
```
Contributor:
Do you have a reference for the null termination of this? I opened up that link in the previous file, but did not see a reference to null termination

Contributor Author (peytonr18):
I was basing this off of what we have in our existing kvp.rs --

```rust
const HV_KVP_EXCHANGE_MAX_KEY_SIZE: usize = 512;
const HV_KVP_EXCHANGE_MAX_VALUE_SIZE: usize = 2048;
const HV_KVP_AZURE_MAX_VALUE_SIZE: usize = 1022;
```

and my old notes from that KVP process, but I'll look for the official documentation where it says that!

cadejacobson self-requested a review March 9, 2026 20:42
Copilot AI left a comment

Pull request overview

Adds a new standalone workspace crate (libazureinit-kvp) that defines a storage abstraction (KvpStore) and a production Hyper-V pool-file implementation (HyperVKvpStore), along with updated technical reference documentation.

Changes:

  • Introduces libazureinit-kvp crate with KvpStore and KvpLimits (Hyper-V/Azure presets).
  • Implements Hyper-V binary pool-file backend with flock-based concurrency and stale-file truncation support.
  • Rewrites doc/libazurekvp.md as a technical reference for the new crate/API.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| Cargo.toml | Adds libazureinit-kvp to the workspace members. |
| libazureinit-kvp/Cargo.toml | Defines the new crate and its dependencies (fs2, sysinfo). |
| libazureinit-kvp/src/lib.rs | Defines KvpStore, KvpLimits, and exports HyperVKvpStore plus size constants. |
| libazureinit-kvp/src/hyperv.rs | Implements Hyper-V pool-file encoding/decoding and the KvpStore backend, plus unit tests. |
| doc/libazurekvp.md | Updates documentation to describe the new crate's record format, semantics, truncation behavior, and limits. |


Comment on lines +244 to +277:

```rust
    let key = String::from_utf8(data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE].to_vec())
        .unwrap_or_default()
        .trim_end_matches('\0')
        .to_string();

    let value =
        String::from_utf8(data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..].to_vec())
            .unwrap_or_default()
            .trim_end_matches('\0')
            .to_string();

    Ok((key, value))
}

/// Read all records from a file that is already open and locked.
fn read_all_records(file: &mut File) -> io::Result<Vec<(String, String)>> {
    let mut contents = Vec::new();
    file.read_to_end(&mut contents)?;

    if contents.is_empty() {
        return Ok(Vec::new());
    }

    if contents.len() % RECORD_SIZE != 0 {
        return Err(io::Error::other(format!(
            "file size ({}) is not a multiple of record size ({RECORD_SIZE})",
            contents.len()
        )));
    }

    contents
        .chunks_exact(RECORD_SIZE)
        .map(decode_record)
        .collect()
```
Copilot AI commented Mar 9, 2026:
read_all_records reads the entire pool file into memory and then allocates again per record in decode_record (via to_vec()). Since this layer explicitly has no record-count cap, this can become unbounded memory usage for read()/entries()/delete(). Consider streaming records with a fixed-size buffer (and decoding from slices without per-record allocations).

Suggested change (replacing the decode/read code quoted in this comment's context):

```rust
    let (key_bytes, value_bytes) =
        data.split_at(HV_KVP_EXCHANGE_MAX_KEY_SIZE);
    let key = std::str::from_utf8(key_bytes)
        .map(|s| s.trim_end_matches('\0').to_owned())
        .unwrap_or_default();
    let value = std::str::from_utf8(value_bytes)
        .map(|s| s.trim_end_matches('\0').to_owned())
        .unwrap_or_default();

    Ok((key, value))
}

/// Read all records from a file that is already open and locked.
fn read_all_records(file: &mut File) -> io::Result<Vec<(String, String)>> {
    let metadata = file.metadata()?;
    let len = metadata.len() as usize;

    if len == 0 {
        return Ok(Vec::new());
    }

    if len % RECORD_SIZE != 0 {
        return Err(io::Error::other(format!(
            "file size ({}) is not a multiple of record size ({RECORD_SIZE})",
            len
        )));
    }

    // Ensure we start reading from the beginning of the file.
    file.seek(std::io::SeekFrom::Start(0))?;

    let record_count = len / RECORD_SIZE;
    let mut records = Vec::with_capacity(record_count);
    let mut buf = [0u8; RECORD_SIZE];

    for _ in 0..record_count {
        file.read_exact(&mut buf)?;
        records.push(decode_record(&buf)?);
    }

    Ok(records)
```
Comment on lines +578 to +599:

```rust
    fn test_truncate_if_stale_truncates_when_file_is_older_than_boot() {
        let tmp = NamedTempFile::new().unwrap();
        let store = hyperv_store(tmp.path());

        store.write("key", "value").unwrap();
        assert!(tmp.path().metadata().unwrap().len() > 0);

        truncate_with_boot_time(tmp.path(), i64::MAX).unwrap();
        assert_eq!(tmp.path().metadata().unwrap().len(), 0);
    }

    #[test]
    fn test_truncate_if_stale_keeps_file_when_newer_than_boot() {
        let tmp = NamedTempFile::new().unwrap();
        let store = hyperv_store(tmp.path());

        store.write("key", "value").unwrap();
        let len_before = tmp.path().metadata().unwrap().len();

        // Epoch boot time ensures any current file mtime is considered fresh.
        truncate_with_boot_time(tmp.path(), 0).unwrap();
        assert_eq!(tmp.path().metadata().unwrap().len(), len_before);
```
Copilot AI commented Mar 9, 2026:
These tests are named as if they validate HyperVKvpStore::truncate_if_stale, but they call truncate_with_boot_time instead, so the public method (and its locking/mtime logic) is untested. Either update the tests to exercise store.truncate_if_stale() (ideally including the lock-skip path) or rename the tests to match what they actually cover.

Comment on lines +59 to +60:

```markdown
- If lock contention occurs (`WouldBlock`): return `Ok(())` and skip.
- Non-contention lock failures are returned as errors.
```
Copilot AI commented Mar 9, 2026:
The docs state that only WouldBlock lock contention is skipped and other lock errors are returned, but the current truncate_if_stale implementation returns Ok(()) on any try_lock_exclusive error. Either tighten the implementation to match this behavior, or adjust this section to reflect the actual error handling.

Suggested change:

```markdown
- If any lock error occurs (including `WouldBlock` contention): return `Ok(())` and skip truncation, leaving the file unchanged.
- Lock acquisition failures are treated as non-fatal and are not propagated as errors from this method.
```

Comment on lines +156 to +158:

```rust
        if FileExt::try_lock_exclusive(&file).is_err() {
            return Ok(());
        }
```
Copilot AI commented Mar 9, 2026:
truncate_if_stale currently treats any try_lock_exclusive failure as lock contention and returns Ok(()). This can silently ignore real errors (e.g., permission/IO errors). Check the error kind and only skip on WouldBlock; propagate other errors (and consider updating the doc comment accordingly).
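The behavior Copilot asks for (skip only on contention, propagate everything else) can be isolated into a small decision function. This is a sketch, not the crate's code; `handle_lock_result` is a hypothetical helper that takes the `io::Result` a `try_lock_exclusive` call would produce:

```rust
use std::io;

/// Decide what to do with a try-lock result: `Ok(true)` means the lock is
/// held and truncation may proceed, `Ok(false)` means contention (skip
/// quietly), and any other error is propagated to the caller.
fn handle_lock_result(lock: io::Result<()>) -> io::Result<bool> {
    match lock {
        Ok(()) => Ok(true),
        // Contention maps to WouldBlock; skipping here is the intended
        // non-fatal path.
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(false),
        // Real failures (e.g. permission errors) must not be swallowed.
        Err(e) => Err(e),
    }
}
```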

Comment on lines +244 to +253:

```rust
    let key = String::from_utf8(data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE].to_vec())
        .unwrap_or_default()
        .trim_end_matches('\0')
        .to_string();

    let value =
        String::from_utf8(data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..].to_vec())
            .unwrap_or_default()
            .trim_end_matches('\0')
            .to_string();
```
Copilot AI commented Mar 9, 2026:

decode_record uses String::from_utf8(...).unwrap_or_default(), which silently turns invalid UTF-8 into an empty key/value. This can cause data loss and incorrect behavior (e.g., inserting an empty key into entries() or missing matches in read()). Return an InvalidData/io::Error on UTF-8 decode failure instead of defaulting.

Suggested change:

```rust
    let key_bytes = &data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE];
    let key = String::from_utf8(key_bytes.to_vec())
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
    let key = key.trim_end_matches('\0').to_string();

    let value_bytes = &data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..];
    let value = String::from_utf8(value_bytes.to_vec())
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
    let value = value.trim_end_matches('\0').to_string();
```
```diff
@@ -1,126 +1,88 @@
-# Azure-init Tracing System
+# `libazureinit-kvp`
```
Contributor:
can we delete this file in favor of the source rustdocs for now? I'd rather focus on great docs in the source modules and keep this at really high level, or delete it. There's a lot of duplication as it stands...

```rust
pub use hyperv::HyperVKvpStore;

/// Hyper-V key limit in bytes (policy/default preset).
pub const HYPERV_MAX_KEY_BYTES: usize = 512;
```
Contributor:
shouldn't be here (dupe of hyperv)

```rust
/// Key/value byte limits for write validation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct KvpLimits {
    pub max_key_size: usize,
```
cjp256 (Contributor) commented Mar 10, 2026:
let's inline these into the trait

and have each implementation define them as their own const.
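One way to read this suggestion is limits as associated consts on the trait, with each implementation declaring its own values. A sketch (my reading of the comment, not the reviewer's exact proposal):

```rust
use std::io;

/// Trait variant with limits inlined as associated consts (sketch).
trait KvpStore {
    const MAX_KEY_SIZE: usize;
    const MAX_VALUE_SIZE: usize;

    fn write(&self, key: &str, value: &str) -> io::Result<()>;
}

struct HyperVKvpStore;

impl KvpStore for HyperVKvpStore {
    // Each implementation defines its own constraints.
    const MAX_KEY_SIZE: usize = 512;
    const MAX_VALUE_SIZE: usize = 2048;

    fn write(&self, _key: &str, _value: &str) -> io::Result<()> {
        Ok(()) // storage elided in this sketch
    }
}

/// Generic consumers can still query limits without knowing the concrete type.
fn fits<S: KvpStore>(_store: &S, value: &str) -> bool {
    value.len() <= S::MAX_VALUE_SIZE
}
```

One tradeoff worth noting: a trait with associated consts is not dyn-compatible, so `dyn KvpStore` trait objects stop working; that is fine if all consumers are generic over `S: KvpStore`.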

```rust
    /// Consumers (e.g. diagnostics, tracing layers) should call this
    /// instead of hardcoding size constants, so the limits stay correct
    /// regardless of the underlying implementation.
    fn limits(&self) -> KvpLimits;
```
Contributor:
remove in favor of two consts.

```rust
    ///
    /// Keys are deduplicated using last-write-wins semantics, matching
    /// the behavior of [`read`](KvpStore::read).
    fn entries(&self) -> io::Result<HashMap<String, String>>;
```
Contributor:
can we expose one more entries_raw(), or dump() which yields a list of key/value pairs that aren't necessarily unique?

this may be useful for testing or some dump command
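The requested addition might look like the sketch below. `entries_raw` is the hypothetical method the reviewer asks for; `FakeStore` is a toy store added only to show the two views side by side:

```rust
use std::collections::HashMap;
use std::io;

trait KvpStore {
    /// Every record in write order, duplicates included; useful for
    /// tests or a dump command (hypothetical `entries_raw`).
    fn entries_raw(&self) -> io::Result<Vec<(String, String)>>;

    /// Deduplicated view derived from the raw dump. Collecting into a
    /// map in write order keeps the last value per key (last-write-wins).
    fn entries(&self) -> io::Result<HashMap<String, String>> {
        Ok(self.entries_raw()?.into_iter().collect())
    }
}

/// Toy in-memory store for illustration.
struct FakeStore(Vec<(String, String)>);

impl KvpStore for FakeStore {
    fn entries_raw(&self) -> io::Result<Vec<(String, String)>> {
        Ok(self.0.clone())
    }
}
```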

```rust
/// Storage abstraction for KVP backends.
///
/// Semantics:
/// - `write`: stores one key/value or returns validation/I/O error.
```
Contributor:
these details should be consolidated into the function docs. This should just be a high level doc for the trait.

e.g. key-value store that supports the semantics of Hyper-V's KVP while also being generic to support non-file-backed-pool-KVP in-memory implementation, for tests etc.

```rust
    ///
    /// Returns an error if:
    /// - The key is empty.
    /// - The key exceeds the configured maximum key size.
```
Contributor:
what are allowed values for the key characters? are there restrictions?


```rust
    /// Write a key-value pair into the store.
    ///
    /// Returns an error if:
```
Contributor:
these errors should be defined here so we can be specific about what is being returned under what conditions

```rust
//! - No record-count header or explicit record cap in this layer.
//!
//! ## Behavior summary
//! - **`write`**: append-only; one record appended per call.
```
Contributor:
leave these to function docs

```rust
//! - No record-count header or explicit record cap in this layer.
//!
//! ## Behavior summary
//! - **`write`**: append-only; one record appended per call.
```
Contributor:
Keep the module docs relevant to the Hyper-V KVP pool / protocol and don't include any rust implementations details. The implementation should flow from this description and be detailed in the impl docs.

Comment on lines +98 to +99:

```rust
    pub fn new_autodetect(path: impl Into<PathBuf>) -> Self {
        let limits = if is_azure_vm(None) {
```
Contributor:
this is clever, but let's not do this. The caller should decide which implementation to work with if it's separated into two different implementations. If it's one implementation, then constrain it to a required bool in the new() call, e.g. new(azure=true)

4 participants