feat(kvp): add store trait and Hyper-V KVP storage crate #288

Open
peytonr18 wants to merge 3 commits into Azure:main from peytonr18:probertson/kvp-store-trait

Conversation

peytonr18 (Contributor) commented Mar 9, 2026

Summary

Introduces libazureinit-kvp, a standalone workspace crate that defines the KvpStore storage trait and provides a production implementation backed by the Hyper-V binary pool file format.

This is the foundational layer (Layer 0) for the KVP architecture rework. It establishes the storage abstraction that higher layers (diagnostics, tracing, provisioning reports) will build on in follow-up PRs.

  • Adds new workspace crate libazureinit-kvp with KvpStore trait and HyperVKvpStore implementation.
  • Defines KvpLimits as a trait-level requirement so consumers and alternative implementations can query size constraints generically.
  • Rewrites doc/libazurekvp.md as a technical reference for the new crate.
  • No consumers wired yet — libazureinit still uses the existing kvp.rs module. Integration happens in follow-up PRs.

Changes

KvpStore trait

Storage abstraction with explicit semantics:

  • limits() — returns KvpLimits for the store (required by the trait so consumers can chunk/validate generically)
  • write(key, value) — validates against limits, then stores
  • read(key) — last-write-wins for duplicate keys
  • entries() — deduplicated HashMap view
  • delete(key) — removes all records for a key
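The surface described by the bullets above could be sketched roughly as follows. This is an illustrative sketch, not the crate's actual code; the `MemStore` type is a hypothetical in-memory implementation added here only so the semantics (validation, last-write-wins, deduplication) are concrete:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::io;

/// Size constraints a store must declare (sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct KvpLimits {
    pub max_key_size: usize,
    pub max_value_size: usize,
}

/// Storage abstraction, following the bullets in the PR summary (sketch).
pub trait KvpStore {
    /// Constraints for this store, so consumers can chunk/validate generically.
    fn limits(&self) -> KvpLimits;
    /// Validate against `limits()`, then store.
    fn write(&self, key: &str, value: &str) -> io::Result<()>;
    /// Last-write-wins for duplicate keys; `None` if absent.
    fn read(&self, key: &str) -> io::Result<Option<String>>;
    /// Deduplicated view (last write wins per key).
    fn entries(&self) -> io::Result<HashMap<String, String>>;
    /// Remove all records for `key`.
    fn delete(&self, key: &str) -> io::Result<()>;
}

/// Hypothetical in-memory implementation, for illustration only.
pub struct MemStore {
    records: RefCell<Vec<(String, String)>>,
    limits: KvpLimits,
}

impl MemStore {
    pub fn new(limits: KvpLimits) -> Self {
        Self { records: RefCell::new(Vec::new()), limits }
    }
}

impl KvpStore for MemStore {
    fn limits(&self) -> KvpLimits {
        self.limits
    }

    fn write(&self, key: &str, value: &str) -> io::Result<()> {
        // Mirror the stated validation rules: no empty or oversized fields.
        if key.is_empty()
            || key.len() > self.limits.max_key_size
            || value.len() > self.limits.max_value_size
        {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "limit violation"));
        }
        self.records.borrow_mut().push((key.to_owned(), value.to_owned()));
        Ok(())
    }

    fn read(&self, key: &str) -> io::Result<Option<String>> {
        // Last-write-wins: scan from the end of the append-only log.
        Ok(self
            .records
            .borrow()
            .iter()
            .rev()
            .find(|(k, _)| k == key)
            .map(|(_, v)| v.clone()))
    }

    fn entries(&self) -> io::Result<HashMap<String, String>> {
        // Collecting in write order means later records overwrite earlier ones.
        Ok(self.records.borrow().iter().cloned().collect())
    }

    fn delete(&self, key: &str) -> io::Result<()> {
        self.records.borrow_mut().retain(|(k, _)| k != key);
        Ok(())
    }
}
```

An in-memory implementation like this is also the kind of thing a test suite could use in place of the pool-file backend.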

HyperVKvpStore

Production implementation backed by the Hyper-V binary pool file:

  • Fixed-size 2,560-byte records (512-byte key + 2,048-byte value, zero-padded)
  • Append-only writes with flock-based concurrency (exclusive for writes, shared for reads)
  • truncate_if_stale() clears records from previous boots by comparing file mtime to boot time
  • Write validation rejects empty keys, oversized keys, and oversized values (no silent truncation)
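The fixed-size record layout can be illustrated with a small encode/decode sketch (hypothetical helper names, not the crate's actual functions; the crate's real decoder and error handling differ, as the review comments below discuss):

```rust
const MAX_KEY: usize = 512;
const MAX_VALUE: usize = 2048;
const RECORD_SIZE: usize = MAX_KEY + MAX_VALUE; // 2,560 bytes per record

/// Encode one key/value pair into a fixed-size, zero-padded record.
/// Returns `None` on validation failure (empty key, oversized fields),
/// mirroring the "no silent truncation" rule.
fn encode_record(key: &str, value: &str) -> Option<[u8; RECORD_SIZE]> {
    if key.is_empty() || key.len() > MAX_KEY || value.len() > MAX_VALUE {
        return None;
    }
    let mut record = [0u8; RECORD_SIZE];
    record[..key.len()].copy_from_slice(key.as_bytes());
    record[MAX_KEY..MAX_KEY + value.len()].copy_from_slice(value.as_bytes());
    Some(record)
}

/// Decode a record, trimming the zero padding back off.
fn decode_record(data: &[u8; RECORD_SIZE]) -> (String, String) {
    let trim = |b: &[u8]| String::from_utf8_lossy(b).trim_end_matches('\0').to_string();
    (trim(&data[..MAX_KEY]), trim(&data[MAX_KEY..]))
}
```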

KvpLimits

Earlier PR feedback asked for limits to be required at the trait level rather than defined only on the concrete type. This ensures:

  • Any KvpStore implementation must declare its constraints.
  • Generic consumer code (fn foo<S: KvpStore>(store: &S)) can call store.limits().max_value_size for chunking decisions without hardcoding constants or knowing the concrete type.

Exported as part of the trait contract with two presets:

  • KvpLimits::hyperv() — 512-byte key / 2,048-byte value (raw Hyper-V wire format)
  • KvpLimits::azure() — 512-byte key / 1,022-byte value (UTF-16: 511 chars + null terminator; matches Azure host read limit)

Higher layers are responsible for chunking oversized payloads before calling the store.
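A higher layer's chunking step might look like the following sketch. `chunk_value` is a hypothetical helper, not part of this crate; it splits an oversized payload by a store's `max_value_size` without breaking UTF-8 code points (assumes `max_value_size` is at least 4 bytes, the largest UTF-8 code point):

```rust
/// Split a payload into chunks no larger than `max_value_size` bytes,
/// backing up to a char boundary so no code point is split.
fn chunk_value(value: &str, max_value_size: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut rest = value;
    while !rest.is_empty() {
        let mut end = max_value_size.min(rest.len());
        // Byte index must land on a char boundary before slicing.
        while !rest.is_char_boundary(end) {
            end -= 1;
        }
        chunks.push(rest[..end].to_string());
        rest = &rest[end..];
    }
    chunks
}
```

A diagnostics layer would call this with `store.limits().max_value_size` (1,022 for the Azure preset) and write each chunk under a derived key.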

Next Steps

Follow-up PRs will add higher layers on top of this trait (Connected to #287):

  1. Diagnostics layer (typed events, value splitting at the 1,022-byte Azure limit)
  2. Tracing layer (tracing_subscriber::Layer adapter)
  3. Provisioning report accessor
  4. Integration into libazureinit/src/logging.rs to replace the existing kvp.rs module

peytonr18 force-pushed the probertson/kvp-store-trait branch from b134a98 to f435f70 on March 9, 2026 19:19
```rust
pub const HYPERV_MAX_VALUE_BYTES: usize = 2048;
/// Azure key limit in bytes (policy/default preset).
pub const AZURE_MAX_KEY_BYTES: usize = 512;
/// Azure value limit in bytes (UTF-16: 511 characters + null terminator).
```
Contributor:
Do you have a reference for the null termination of this? I opened up that link in the previous file, but did not see a reference to null termination

Contributor Author (peytonr18):
I was basing this off of what we have in our existing kvp.rs --

```rust
const HV_KVP_EXCHANGE_MAX_KEY_SIZE: usize = 512;
const HV_KVP_EXCHANGE_MAX_VALUE_SIZE: usize = 2048;
const HV_KVP_AZURE_MAX_VALUE_SIZE: usize = 1022;
```

and my old notes from that KVP process, but I'll look for the official documentation where it says that!

cadejacobson self-requested a review March 9, 2026 20:42
Copilot AI left a comment

Pull request overview

Adds a new standalone workspace crate (libazureinit-kvp) that defines a storage abstraction (KvpStore) and a production Hyper-V pool-file implementation (HyperVKvpStore), along with updated technical reference documentation.

Changes:

  • Introduces libazureinit-kvp crate with KvpStore and KvpLimits (Hyper-V/Azure presets).
  • Implements Hyper-V binary pool-file backend with flock-based concurrency and stale-file truncation support.
  • Rewrites doc/libazurekvp.md as a technical reference for the new crate/API.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| Cargo.toml | Adds libazureinit-kvp to the workspace members. |
| libazureinit-kvp/Cargo.toml | Defines the new crate and its dependencies (fs2, sysinfo). |
| libazureinit-kvp/src/lib.rs | Defines KvpStore, KvpLimits, and exports HyperVKvpStore plus size constants. |
| libazureinit-kvp/src/hyperv.rs | Implements Hyper-V pool-file encoding/decoding and the KvpStore backend, plus unit tests. |
| doc/libazurekvp.md | Updates documentation to describe the new crate's record format, semantics, truncation behavior, and limits. |


Comment on lines +244 to +277:

```rust
    let key = String::from_utf8(data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE].to_vec())
        .unwrap_or_default()
        .trim_end_matches('\0')
        .to_string();

    let value =
        String::from_utf8(data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..].to_vec())
            .unwrap_or_default()
            .trim_end_matches('\0')
            .to_string();

    Ok((key, value))
}

/// Read all records from a file that is already open and locked.
fn read_all_records(file: &mut File) -> io::Result<Vec<(String, String)>> {
    let mut contents = Vec::new();
    file.read_to_end(&mut contents)?;

    if contents.is_empty() {
        return Ok(Vec::new());
    }

    if contents.len() % RECORD_SIZE != 0 {
        return Err(io::Error::other(format!(
            "file size ({}) is not a multiple of record size ({RECORD_SIZE})",
            contents.len()
        )));
    }

    contents
        .chunks_exact(RECORD_SIZE)
        .map(decode_record)
        .collect()
```
Copilot AI commented Mar 9, 2026:
read_all_records reads the entire pool file into memory and then allocates again per record in decode_record (via to_vec()). Since this layer explicitly has no record-count cap, this can become unbounded memory usage for read()/entries()/delete(). Consider streaming records with a fixed-size buffer (and decoding from slices without per-record allocations).

Suggested change (replacing the decode/read code quoted in this comment's context):

```rust
    let (key_bytes, value_bytes) =
        data.split_at(HV_KVP_EXCHANGE_MAX_KEY_SIZE);
    let key = std::str::from_utf8(key_bytes)
        .map(|s| s.trim_end_matches('\0').to_owned())
        .unwrap_or_default();
    let value = std::str::from_utf8(value_bytes)
        .map(|s| s.trim_end_matches('\0').to_owned())
        .unwrap_or_default();

    Ok((key, value))
}

/// Read all records from a file that is already open and locked.
fn read_all_records(file: &mut File) -> io::Result<Vec<(String, String)>> {
    let metadata = file.metadata()?;
    let len = metadata.len() as usize;

    if len == 0 {
        return Ok(Vec::new());
    }

    if len % RECORD_SIZE != 0 {
        return Err(io::Error::other(format!(
            "file size ({}) is not a multiple of record size ({RECORD_SIZE})",
            len
        )));
    }

    // Ensure we start reading from the beginning of the file.
    file.seek(std::io::SeekFrom::Start(0))?;

    let record_count = len / RECORD_SIZE;
    let mut records = Vec::with_capacity(record_count);
    let mut buf = [0u8; RECORD_SIZE];

    for _ in 0..record_count {
        file.read_exact(&mut buf)?;
        records.push(decode_record(&buf)?);
    }

    Ok(records)
```
Comment on lines +578 to +599:

```rust
    fn test_truncate_if_stale_truncates_when_file_is_older_than_boot() {
        let tmp = NamedTempFile::new().unwrap();
        let store = hyperv_store(tmp.path());

        store.write("key", "value").unwrap();
        assert!(tmp.path().metadata().unwrap().len() > 0);

        truncate_with_boot_time(tmp.path(), i64::MAX).unwrap();
        assert_eq!(tmp.path().metadata().unwrap().len(), 0);
    }

    #[test]
    fn test_truncate_if_stale_keeps_file_when_newer_than_boot() {
        let tmp = NamedTempFile::new().unwrap();
        let store = hyperv_store(tmp.path());

        store.write("key", "value").unwrap();
        let len_before = tmp.path().metadata().unwrap().len();

        // Epoch boot time ensures any current file mtime is considered fresh.
        truncate_with_boot_time(tmp.path(), 0).unwrap();
        assert_eq!(tmp.path().metadata().unwrap().len(), len_before);
```
Copilot AI commented Mar 9, 2026:
These tests are named as if they validate HyperVKvpStore::truncate_if_stale, but they call truncate_with_boot_time instead, so the public method (and its locking/mtime logic) is untested. Either update the tests to exercise store.truncate_if_stale() (ideally including the lock-skip path) or rename the tests to match what they actually cover.

Comment on lines +59 to +60:

```markdown
- If lock contention occurs (`WouldBlock`): return `Ok(())` and skip.
- Non-contention lock failures are returned as errors.
```
Copilot AI commented Mar 9, 2026:
The docs state that only WouldBlock lock contention is skipped and other lock errors are returned, but the current truncate_if_stale implementation returns Ok(()) on any try_lock_exclusive error. Either tighten the implementation to match this behavior, or adjust this section to reflect the actual error handling.

Suggested change:

```markdown
- If any lock error occurs (including `WouldBlock` contention): return `Ok(())` and skip truncation, leaving the file unchanged.
- Lock acquisition failures are treated as non-fatal and are not propagated as errors from this method.
```

Comment on lines +156 to +158:

```rust
        if FileExt::try_lock_exclusive(&file).is_err() {
            return Ok(());
        }
```
Copilot AI commented Mar 9, 2026:
truncate_if_stale currently treats any try_lock_exclusive failure as lock contention and returns Ok(()). This can silently ignore real errors (e.g., permission/IO errors). Check the error kind and only skip on WouldBlock; propagate other errors (and consider updating the doc comment accordingly).
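The behavior Copilot asks for (skip only on contention, propagate everything else) can be isolated into a small decision function. This is a sketch, not the crate's code; `handle_lock_result` is a hypothetical helper that takes the `io::Result` a `try_lock_exclusive` call would produce:

```rust
use std::io;

/// Decide what to do with a try-lock result: `Ok(true)` means the lock is
/// held and truncation may proceed, `Ok(false)` means contention (skip
/// quietly), and any other error is propagated to the caller.
fn handle_lock_result(lock: io::Result<()>) -> io::Result<bool> {
    match lock {
        Ok(()) => Ok(true),
        // Contention maps to WouldBlock; skipping here is the intended
        // non-fatal path.
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(false),
        // Real failures (e.g. permission errors) must not be swallowed.
        Err(e) => Err(e),
    }
}
```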

Comment on lines +244 to +253:

```rust
    let key = String::from_utf8(data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE].to_vec())
        .unwrap_or_default()
        .trim_end_matches('\0')
        .to_string();

    let value =
        String::from_utf8(data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..].to_vec())
            .unwrap_or_default()
            .trim_end_matches('\0')
            .to_string();
```
Copilot AI commented Mar 9, 2026:

decode_record uses String::from_utf8(...).unwrap_or_default(), which silently turns invalid UTF-8 into an empty key/value. This can cause data loss and incorrect behavior (e.g., inserting an empty key into entries() or missing matches in read()). Return an InvalidData/io::Error on UTF-8 decode failure instead of defaulting.

Suggested change:

```rust
    let key_bytes = &data[..HV_KVP_EXCHANGE_MAX_KEY_SIZE];
    let key = String::from_utf8(key_bytes.to_vec())
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
    let key = key.trim_end_matches('\0').to_string();

    let value_bytes = &data[HV_KVP_EXCHANGE_MAX_KEY_SIZE..];
    let value = String::from_utf8(value_bytes.to_vec())
        .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
    let value = value.trim_end_matches('\0').to_string();
```
```diff
@@ -1,126 +1,88 @@
-# Azure-init Tracing System
+# `libazureinit-kvp`
```
Contributor:
can we delete this file in favor of the source rustdocs for now? I'd rather focus on great docs in the source modules and keep this at really high level, or delete it. There's a lot of duplication as it stands...

```rust
pub use hyperv::HyperVKvpStore;

/// Hyper-V key limit in bytes (policy/default preset).
pub const HYPERV_MAX_KEY_BYTES: usize = 512;
```
Contributor:
shouldn't be here (dupe of hyperv)

```rust
/// Key/value byte limits for write validation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct KvpLimits {
    pub max_key_size: usize,
```
cjp256 (Contributor) commented Mar 10, 2026:
let's inline these into the trait

and have each implementation define them as their own const.
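One way to read this suggestion is limits as associated consts on the trait, with each implementation declaring its own values. A sketch (my reading of the comment, not the reviewer's exact proposal):

```rust
use std::io;

/// Trait variant with limits inlined as associated consts (sketch).
trait KvpStore {
    const MAX_KEY_SIZE: usize;
    const MAX_VALUE_SIZE: usize;

    fn write(&self, key: &str, value: &str) -> io::Result<()>;
}

struct HyperVKvpStore;

impl KvpStore for HyperVKvpStore {
    // Each implementation defines its own constraints.
    const MAX_KEY_SIZE: usize = 512;
    const MAX_VALUE_SIZE: usize = 2048;

    fn write(&self, _key: &str, _value: &str) -> io::Result<()> {
        Ok(()) // storage elided in this sketch
    }
}

/// Generic consumers can still query limits without knowing the concrete type.
fn fits<S: KvpStore>(_store: &S, value: &str) -> bool {
    value.len() <= S::MAX_VALUE_SIZE
}
```

One tradeoff worth noting: a trait with associated consts is not dyn-compatible, so `dyn KvpStore` trait objects stop working; that is fine if all consumers are generic over `S: KvpStore`.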

```rust
    /// Consumers (e.g. diagnostics, tracing layers) should call this
    /// instead of hardcoding size constants, so the limits stay correct
    /// regardless of the underlying implementation.
    fn limits(&self) -> KvpLimits;
```
Contributor:
remove in favor of two consts.

```rust
    ///
    /// Keys are deduplicated using last-write-wins semantics, matching
    /// the behavior of [`read`](KvpStore::read).
    fn entries(&self) -> io::Result<HashMap<String, String>>;
```
Contributor:
can we expose one more entries_raw(), or dump() which yields a list of key/value pairs that aren't necessarily unique?

this may be useful for testing or some dump command
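The requested addition might look like the sketch below. `entries_raw` is the hypothetical method the reviewer asks for; `FakeStore` is a toy store added only to show the two views side by side:

```rust
use std::collections::HashMap;
use std::io;

trait KvpStore {
    /// Every record in write order, duplicates included; useful for
    /// tests or a dump command (hypothetical `entries_raw`).
    fn entries_raw(&self) -> io::Result<Vec<(String, String)>>;

    /// Deduplicated view derived from the raw dump. Collecting into a
    /// map in write order keeps the last value per key (last-write-wins).
    fn entries(&self) -> io::Result<HashMap<String, String>> {
        Ok(self.entries_raw()?.into_iter().collect())
    }
}

/// Toy in-memory store for illustration.
struct FakeStore(Vec<(String, String)>);

impl KvpStore for FakeStore {
    fn entries_raw(&self) -> io::Result<Vec<(String, String)>> {
        Ok(self.0.clone())
    }
}
```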

```rust
/// Storage abstraction for KVP backends.
///
/// Semantics:
/// - `write`: stores one key/value or returns validation/I/O error.
```
Contributor:
these details should be consolidated into the function docs. This should just be a high level doc for the trait.

e.g. key-value store that supports the semantics of Hyper-V's KVP while also being generic to support non-file-backed-pool-KVP in-memory implementation, for tests etc.

```rust
    ///
    /// Returns an error if:
    /// - The key is empty.
    /// - The key exceeds the configured maximum key size.
```
Contributor:
what are allowed values for the key characters? are there restrictions?


```rust
    /// Write a key-value pair into the store.
    ///
    /// Returns an error if:
```
Contributor:
these errors should be defined here so we can be specific about what is being returned under what conditions

```rust
//! - No record-count header or explicit record cap in this layer.
//!
//! ## Behavior summary
//! - **`write`**: append-only; one record appended per call.
```
Contributor:
leave these to function docs

```rust
//! - No record-count header or explicit record cap in this layer.
//!
//! ## Behavior summary
//! - **`write`**: append-only; one record appended per call.
```
Contributor:
Keep the module docs relevant to the Hyper-V KVP pool / protocol and don't include any rust implementations details. The implementation should flow from this description and be detailed in the impl docs.

Comment on lines +98 to +99:

```rust
    pub fn new_autodetect(path: impl Into<PathBuf>) -> Self {
        let limits = if is_azure_vm(None) {
```
Contributor:
this is clever, but let's not do this. The caller should decide which implementation to work with if it's separated into two different implementations. If it's one implementation, then constrain it to a required bool in the new() call, e.g. new(azure=true)

4 participants