Commit 5f4371f

Author: Shane Wall

Add background scrub executor and reporting

1 parent 73b280a

6 files changed

Lines changed: 871 additions & 10 deletions

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- **Background Scrub Executor** (`crates/capsule-registry/src/scrub_executor.rs`)
+  - `ScrubExecutor<B: StorageBackend>` drives the `common::scrub` types into a running integrity check
+  - `scrub_cycle(config, kind)` — single pass returning a `ScrubReport`; respects `max_segments_per_cycle` and `inter_segment_delay` to avoid starving foreground I/O
+  - `spawn_background(backend, config, key_manager)` — continuous Tokio task alternating light/deep cycles on schedule
+  - Deep scrub applies the strongest check available per segment: BLAKE3-MAC (`encryption::verify_mac`) for encrypted segments; BLAKE3 content-hash recompute for unencrypted; segments with no stored hash/tag recorded as `Ok` (stable condition, nothing to compare)
+  - `ScrubResult::Skipped` for transient conditions (e.g. no key manager); skipped segments are not recorded in the schedule and are retried next cycle
+  - `ScrubReport` is `#[non_exhaustive]`; new counters `segments_skipped` and `segments_recorded` (both serde-defaulted for backward compat); `ScrubResult::should_record_schedule()` distinguishes definitive from transient outcomes; `segments_recorded` exposes how many schedule timestamps were actually advanced, letting external monitors detect all-transient cycles
+  - 10 executor unit tests covering: healthy pass, bitrot detection, length mismatch, schedule throttling, cycle cap, skip semantics, stable-Ok for legacy encrypted segments (+ 17 scheduler tests in `common::scrub`)
+
 - **Background Scrub Scheduler** (`crates/common/src/scrub.rs`)
   - Two-level scrubbing: light (metadata) and deep (content hash / MAC verification)
   - `ScrubConfig` with configurable intervals, per-cycle segment limits, and inter-segment delays
@@ -17,6 +26,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - `ScrubReport` aggregates cycle results with duration and error counts
   - Deep scrub implicitly records a light-scrub timestamp to avoid redundant work
 
+- **`Segment` and `SegmentId` now derive `Default`** (`crates/common/src/lib.rs`)
+  - All fields have sensible zero-values; simplifies test construction and future struct-update syntax
+
 - **Erasure Coding Trait & Types** (`crates/common/src/erasure.rs`)
   - `ErasureCode` trait with `encode`, `decode`, and `minimum_to_decode` methods
   - Supports pluggable algorithms: Reed-Solomon, ISA-L, LRC (locally repairable codes)
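
The skip semantics and cycle cap described in the Background Scrub Executor entry can be illustrated with a small standalone sketch. This is not the code from `scrub_executor.rs`, which this page does not show. Only `ScrubResult::Skipped`, `Ok`, `should_record_schedule()`, `segments_skipped`, `segments_recorded`, and `max_segments_per_cycle` are names taken from the changelog; the `Corrupted` variant, the `segments_checked` and `errors` fields, the `run_cycle` function, and the choice to treat corruption as a definitive (recorded) outcome are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Outcome of checking one segment. `Ok` and `Skipped` are named in the
/// changelog; `Corrupted` is a hypothetical stand-in for failure outcomes.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ScrubResult {
    Ok,
    Corrupted(String),
    Skipped(String),
}

impl ScrubResult {
    /// Definitive outcomes advance the schedule; transient ones do not.
    /// Assumption: only `Skipped` is transient.
    fn should_record_schedule(&self) -> bool {
        !matches!(self, ScrubResult::Skipped(_))
    }
}

/// Simplified report. `segments_checked` and `errors` are hypothetical names.
#[derive(Debug, Default)]
struct ScrubReport {
    segments_checked: usize,
    segments_skipped: usize,
    segments_recorded: usize,
    errors: usize,
}

/// Hypothetical single pass over precomputed per-segment results.
/// `last_scrubbed` maps segment id to the time of its last definitive scrub.
fn run_cycle(
    segments: &[(u64, ScrubResult)],
    max_segments_per_cycle: usize,
    now: u64,
    last_scrubbed: &mut HashMap<u64, u64>,
) -> ScrubReport {
    let mut report = ScrubReport::default();
    // The cycle cap bounds how many segments a single pass may touch.
    for (id, result) in segments.iter().take(max_segments_per_cycle) {
        report.segments_checked += 1;
        match result {
            ScrubResult::Skipped(_) => report.segments_skipped += 1,
            ScrubResult::Corrupted(_) => report.errors += 1,
            ScrubResult::Ok => {}
        }
        // Skipped segments keep their old timestamp, so they stay due
        // and are picked up again on the next cycle.
        if result.should_record_schedule() {
            last_scrubbed.insert(*id, now);
            report.segments_recorded += 1;
        }
    }
    report
}
```

Under these assumptions, a monitor could flag a cycle where `segments_recorded` is zero while `segments_checked` is non-zero as the all-transient case the changelog mentions.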

crates/capsule-registry/src/lib.rs

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ pub mod gc;
 pub mod pipeline;
 #[cfg(feature = "podms")]
 pub mod runtime;
+pub mod scrub_executor;
 
 pub use error::{CompressionError, DedupError, PipelineError};
 #[cfg(feature = "podms")]
