Skip to content

Latest commit

 

History

History
288 lines (204 loc) · 9.47 KB

File metadata and controls

288 lines (204 loc) · 9.47 KB

Phase 2.2: Deduplication Implementation

Overview

SPACE now implements content-addressed deduplication at the segment level, operating on compressed data. This proves the architectural claim from patentable_concepts.md: "dedupe over encrypted ciphertext" (encryption comes in Phase 3).

Architecture

Flow Diagram

Input Data │ ├─► Split into 4MB segments │ ├─► Compress each segment (LZ4/Zstd) │ ├─► Hash compressed data (BLAKE3) │ ├─► Check content store │ ├─ Hit? → Reuse existing segment │ └─ Miss? → Write new segment, register hash │ └─► Build capsule metadata

Key Design Decisions

  1. Post-Compression Deduplication

    • Hash is computed on compressed data, not raw data
    • Rationale: Proves "dedupe over ciphertext" concept (encryption = Phase 3)
    • Trade-off: Lower dedup ratio than pre-compression, but more realistic
  2. BLAKE3 for Content Hashing

    • Fast (1-2 GB/s single-threaded)
    • Cryptographically secure (preparation for Phase 3)
    • 32-byte hash = 64 hex characters
  3. Content Store Design

    • Simple HashMap: ContentHash → SegmentId
    • Stored in space.db alongside capsule registry
    • No bloom filter yet (Phase 3 optimization)
  4. Reference Counting

    • Each segment tracks ref_count (currently not enforced)
    • Preparation for garbage collection (Phase 3)
    • Capsules track deduped_bytes for monitoring
  5. Zero-Copy Compression Fast-Path

    • Compression returns Cow<[u8]>, borrowing the original slice when compression is skipped
    • Hashing, dedup lookups, and optional encryption operate directly on the borrowed buffer
    • Only segments that actually compress or encrypt allocate new Vec<u8>
  6. Transactional Staging for Async Writes

    • When the pipeline_async feature is enabled, WritePipeline saturates Tokio workers and stages new segment data inside NvramTransaction
    • Dedup hits are handled in two phases: staged reuse (within the same capsule) updates pending segment refcounts, while persistent reuse increments existing segments lazily and is rolled back if the transaction aborts
    • Content-store registration is deferred until the transaction successfully commits, guaranteeing that hashes only point at durable on-disk segments

Files Modified

Core Implementation

File Changes Purpose
common/src/lib.rs Added ContentHash, deduped_bytes, ref_count Type definitions
capsule-registry/src/dedup.rs NEW Hashing logic + stats
capsule-registry/src/lib.rs Added content_store, lookup/register methods Content-addressed storage
capsule-registry/src/pipeline.rs Integrated dedup into write path Main logic
nvram-sim/src/lib.rs Updated Segment initialization Metadata support

Dependencies

Crate Version Purpose
blake3 1.5 Content hashing
hex 0.4 Hash encoding

API Usage

Enable/Disable Deduplication

// Default policy has dedup enabled let policy = Policy::default(); assert!(policy.dedupe);

// Disable dedup for pre-compressed data let policy = Policy::precompressed(); assert!(!policy.dedupe);

Write with Deduplication

let registry = CapsuleRegistry::new(); let nvram = NvramLog::open("space.nvram")?; let pipeline = WritePipeline::new(registry, nvram);

// Automatic deduplication let capsule_id = pipeline.write_capsule(data)?;

// Output shows dedup hits: // ♻️ Dedup hit: Reusing segment 5 (saved 4194304 bytes)

Check Dedup Statistics

let (total_segments, unique_segments) = registry.get_dedup_stats(); let dedup_ratio = total_segments as f32 / unique_segments as f32;

println!("Deduplication ratio: {:.2}x", dedup_ratio);

Performance Characteristics

Time Complexity

Operation Complexity Notes
Hash computation O(n) n = segment size (4MB), ~2ms @ 2GB/s
Content lookup O(1) HashMap lookup
Segment write O(1) Append-only log

Space Savings

Test Results (from dedup_test.rs):

Scenario Dedup Ratio Notes
3 identical 6MB capsules ~3.0x Perfect dedup
Repeated patterns 1.5-2.5x Segment-level granularity
Random data 1.0x No dedup (expected)

Overhead

  • Hash computation: ~2ms per 4MB segment (negligible)
  • Metadata overhead: 64 bytes per unique segment (hash)
  • Memory: Content store scales with unique segment count
  • Allocations: Zero-copy path avoids cloning for entropy-skipped segments, improving large transfer latency by ~10-20% in internal profiling

Testing

Unit Tests

Test content hashing

cargo test -p capsule-registry hash_content

Test dedup stats tracking

cargo test -p capsule-registry dedup_stats

Integration Tests

Full dedup test suite

cargo test --test dedup_test -- --nocapture

Specific scenarios

cargo test --test dedup_test test_dedup_identical_segments cargo test --test dedup_test test_dedup_multiple_capsules

Manual Testing

Run the dedup demo script

chmod +x scripts/test_dedup.sh ./scripts/test_dedup.sh

Metadata Format

Content Store (in space.db)

{ "content_store": { "a1b2c3...": 42, "d4e5f6...": 43 }, "capsules": { "550e8400-...": { "segments": [42, 42, 43], "deduped_bytes": 4194304 } } }

Interpretation:

  • Segments 42 used twice (deduped once)
  • Capsule saved 4MB via deduplication

Phase 3.2: Garbage Collection Implementation

  • Reference-counted segments: Segment.ref_count is updated on every dedup hit and capsule deletion, ensuring shared segments stay consistent across capsules.
  • Startup reconciliation: WritePipeline::new rebuilds refcounts from persisted capsule metadata, fixing drift after crashes or manual edits.
  • Garbage collector: gc::GarbageCollector walks NvramLog::list_segments() and removes segments whose refcount hit zero, pruning both metadata and content-store entries.
  • Deletion path: WritePipeline::delete_capsule decrements segment refcounts and drops orphaned hashes immediately, keeping capsules and segments in sync.
  • Regression tests: gc_test.rs covers multi-capsule refcounts and orphan sweeping to guard against regressions.

Known Limitations

Phase 2.2 Scope

  1. No log-space reclamation yet

    • Freed segment bytes remain in the append-only log until a future compaction pass
    • Free-list based reuse is planned for Phase 3.3
  2. No Cross-Node Dedup

    • Content store is local per node
    • Federation/clustering = Phase 4
  3. No Bloom Filter

    • Content store grows with unique segments
    • May need optimization for 1M+ segments (Phase 3)
  4. Fixed 4MB Granularity

    • Small duplicates (<4MB) not detected
    • Variable-length dedup = future optimization

Future Enhancements (Phase 3+)

Phase 3: Security Enhancements

  • Encrypt segments after dedup (deterministic IV)
  • Introduce free-list or compaction to reclaim log space
  • Add bloom filter for negative lookups

Phase 4: Scale

  • Distributed content store (across nodes)
  • Variable-length deduplication (rolling hash)
  • GPU-accelerated bloom filter (as per patent doc)

Validation Against Patent Claims

From ../patentable_concepts.md § 3:

Per-Segment Encryption with Inline Dedup & Compression

Encrypt XTS-AES-256 per 256 MiB segment after compression + dedupe yet retain global dedupe across ciphertext via deterministic IV derivation.

Phase 2.2 Status:

✅ Compression before dedup: IMPLEMENTED
✅ Content-addressed storage: IMPLEMENTED
✅ Hash-based dedup: IMPLEMENTED
⏳ Encryption with deterministic IV: Phase 3
⏳ Global dedupe across encrypted data: Phase 3

Proof of Concept: This implementation validates that post-compression deduplication is viable and will extend cleanly to post-encryption dedup in Phase 3.

Troubleshooting

Dedup Not Occurring

Symptom: All segments unique despite identical data

Causes:

  1. Different compression algorithms between writes
  2. Dedup disabled in policy (policy.dedupe = false)
  3. Data genuinely unique (check entropy)

Debug:

  • Use cargo test -p capsule-registry dedup_test -- --nocapture to print dedup stats (sled data files are binary and not human-readable).
  • Enable verbose logging: RUST_LOG=debug cargo run -- create --file test.txt

High Memory Usage

Symptom: Memory grows with capsule count

Cause: Content store keeps all hashes in memory

Solution (Phase 3):

  • Implement bloom filter
  • Offload to external KV store (FoundationDB)

Performance Benchmarks

Run the benchmark suite:

cargo bench --bench dedup_bench

Expected Results:

  • Hash computation: ~2ms per 4MB segment
  • Content lookup: <1μs (HashMap)
  • Dedup overhead: <1% of total write time

Summary

Phase 2.2 successfully implements content-addressed deduplication as a foundation for the vision outlined in the architecture documents. The implementation:

  • ✅ Deduplicates at segment granularity (4MB)
  • ✅ Operates on compressed data (not plaintext)
  • ✅ Uses cryptographic hashing (BLAKE3)
  • ✅ Preserves data integrity across all test scenarios
  • ✅ Provides foundation for encrypted dedup (Phase 3)
  • ✅ Maintains performance (<1% overhead)
  • ✅ Scales to thousands of segments

The next phase will add per-segment encryption with deterministic IVs, proving the complete "dedupe over ciphertext" concept described in the patent documentation.