feat(dag/walker): opt-in BloomTracker to avoid duplicated walks#1124
feat(dag/walker): opt-in BloomTracker to avoid duplicated walks#1124
Conversation
c16a5b7 to
4dad6c1
Compare
VisitedTracker interface for memory-efficient DAG traversal dedup. BloomTracker uses a scalable bloom filter chain (~4 bytes/CID vs ~75 for a map), enabling dedup on repos with tens of millions of CIDs. - BloomTracker: auto-scaling chain, configurable FP rate via BloomParams, unique random SipHash keys per instance (uncorrelated FPs across nodes) - MapTracker: exact dedup for tests and small datasets - *cid.Set satisfies the interface for drop-in compatibility - go.mod: update ipfs/bbloom to master (for NewWithKeys)
4dad6c1 to
c8962fc
Compare
iterative DFS walker that integrates VisitedTracker dedup directly into the traversal loop, skipping entire subtrees in O(1). - LinksFetcherFromBlockstore: extracts links from any codec registered in the global multicodec registry (dag-pb, dag-cbor, raw, etc.) - ~2x faster than legacy go-ipld-prime selector traversal (no selector machinery, simpler decoding, fewer allocations) - WithLocality option for MFS providers to skip non-local blocks - best-effort error handling: fetch failures log and skip, do not mark the CID as visited (allows retry via another pin or next cycle) - benchmarks comparing BlockAll vs WalkDAG across dag-pb, dag-cbor, and mixed-codec DAGs
19bf557 to
224c2ae
Compare
Codecov Report❌ Patch coverage is @@ Coverage Diff @@
## main #1124 +/- ##
==========================================
+ Coverage 62.72% 62.92% +0.19%
==========================================
Files 262 266 +4
Lines 26278 26564 +286
==========================================
+ Hits 16483 16715 +232
- Misses 8099 8135 +36
- Partials 1696 1714 +18
... and 8 files with indirect coverage changes 🚀 New features to boost your workflow:
|
emits entity roots (files, directories, HAMT shards) skipping internal file chunks. core of the +entities provide strategy. - NodeFetcherFromBlockstore: detects UnixFS entity type from the ipld-prime decoded node's Data field - directories and HAMT shards: emit and recurse into children - non-UnixFS codecs (dag-cbor, dag-json): emit and follow links - same options as WalkDAG: WithVisitedTracker, WithLocality - tests: dag-pb, raw, dag-cbor, mixed codecs, HAMT, dedup, error handling, stop conditions
catch unexpected regressions in ipfs/bbloom behavior or BloomParams derivation that would silently degrade the false positive rate. - measurable rate (1/1000): 100K probes produce observable FPs, asserts rate is within 5x of target - default rate (1/4.75M): 100K probes must produce exactly 0 FPs
- NewPrioritizedProvider: stream init error no longer stops remaining streams (e.g. MFS flush error does not prevent pinned content from being provided) - NewConcatProvider: concatenates pre-deduplicated streams without its own visited set, for use with shared VisitedTracker
…vider NewUniquePinnedProvider: emits all pinned blocks with cross-pin dedup via shared VisitedTracker (bloom or map). walks recursive pin DAGs first, then direct pins. NewPinnedEntityRootsProvider: same structure but uses WalkEntityRoots, emitting only entity roots and skipping internal file chunks. existing NewPinnedProvider is unchanged.
- remove unused daggen variable in uniquepinprovider_test.go
…tency match the defensive read-side ctx.Done select pattern already used by NewPrioritizedProvider in the same file
- deduplicate LinkSystem construction used by both LinksFetcherFromBlockstore and NodeFetcherFromBlockstore - wrap blockstore with NewIdStore so identity CIDs (multihash 0x00, data inline in the CID) are decoded without a datastore lookup
identity CIDs (multihash 0x00) embed data inline, so providing them to the DHT is wasteful. the walker now traverses through identity CIDs (following their links) but never emits them. - add isIdentityCID check to WalkDAG and WalkEntityRoots - simplify WalkEntityRoots emit/descend logic - tests for identity raw leaf, identity dag-pb directory with normal children, normal directory with identity child
- inline identity CID check (c.Prefix().MhType == mh.IDENTITY) in all emit paths: WalkDAG, WalkEntityRoots, and direct pin loops in both NewUniquePinnedProvider and NewPinnedEntityRootsProvider - move all identity CID tests to dag/walker/identity_test.go - add provider-level identity tests for direct pins and recursive DAGs
the stack-based DFS was pushing children in link order, causing the last child to be popped first (right-to-left). reverse children before pushing so the first link is on top and gets visited first. this matches the legacy fetcherhelpers.BlockAll selector traversal (ipld-prime iterates list/map entries in insertion order) and the conventional DFS order described in IPIP-0412. - walker.go, entity.go: slices.Reverse(children) before stack push - walker.go: document traversal order in WalkDAG godoc - entity.go: document order parity in WalkEntityRoots godoc - walker_test.go, entity_test.go: add sibling order regression tests
a corrupted pin entry was stopping the entire provide cycle because the goroutine returned on RecursiveKeys/DirectKeys error. change to continue so remaining pins are still provided (best-effort). the error from the pinner iterator already contains context (bad CID bytes, datastore key, etc.) -- sc.Pin.Key is zero-value on error so including it in the log would be noise. matches the best-effort pattern used in WalkDAG/WalkEntityRoots where fetch errors are logged and skipped.
- collectLinks: note that map keys are not recursed (no known codec uses link-typed map keys) - detectEntityType: extract c.Prefix() once for readability - grow: document MinBloomCapacity invariant that prevents small-bitset FP rate issues in grown blooms
gammazero
left a comment
There was a problem hiding this comment.
Made a few suggestions but nothing blocking.
uniquepinprovider: use skip-early style for tracker.Visit in direct pin loops (clearer control flow) visited.go: document that VisitedTracker implementations may be probabilistic, and must keep FP rate negligible or allow callers to adjust it
log capacity, FP rate, and hash parameters on creation. log previous/new capacity and chain length on autoscale. helps operators understand bloom sizing and detect unexpected growth during reprovide cycles.
counts Visit() calls that returned false (CID already seen). callers can log this after a reprovide cycle to show how much dedup the bloom filter achieved.
guillaumemichel
left a comment
There was a problem hiding this comment.
Neat implementation! A few comments inline, but nothing major
dag/walker/walker.go
Outdated
| slices.Reverse(children) | ||
| stack = append(stack, children...) | ||
|
|
||
| // skip identity CIDs: content is inline, no need to provide |
There was a problem hiding this comment.
If the dag walker has other purpose than providing, I would suggest leaving this filtering to the provide system.
There was a problem hiding this comment.
Or instead of blocking IDENTITY allow caller to pass a blocklist as option, so that provide systems can block IDENTITY?
There was a problem hiding this comment.
Good questions. I'd keep as-is, below are my thoughts why.
On Tracker
The tracker needs to be inside the walk loop for subtree pruning to work: when Visit() returns false, the walker skips the fetch and the entire subtree. If dedup was external (wrapping the emitted CIDs), the walker would still traverse every child of every node, losing the main performance benefit (O(unique_blocks) back to O(pins * total_blocks) I/O).
I'd keep as-is. The option is opt-in via WithVisitedTracker, so the walker could be "general-purpose" when no tracker is set.
On IDENTITY CIDs
I think TLDR is that IDENTITY CIDs are a can of worms, and I believe we should keep this as-is, skip them early, don't leak, and NOT make it configurable.
Rationale:
In my mind skipping identity CIDs is a correctness thing, rather than a policy choice: the data is inline in the CID itself, so no block exists to fetch or provide. Any node can derive the content from the CID without ANY the network or disk I/O.
Making it configurable would add API surface for a case where the answer is always the same, and would imply the skip is optional, which could lead to bugs when callers forget to set it.
I may be over-correcting, but truth to be told, we did see that over the years, and these bugs are hard to debug. The safest approach seems to always filter them out as soon as possible, and avoid leaking them to layers where they may trigger unnecessary IO or make people waste resources due to naive implementations.
Note: We do still descend into identity CID children (e.g. an inlined dag-pb directory node), just skip emitting the identity CID itself, which keeps "general purpose" use cases feasible.
| stack := []cid.Cid{root} | ||
|
|
||
| for len(stack) > 0 { | ||
| if ctx.Err() != nil { |
There was a problem hiding this comment.
nit: since checking ctx.Err() implies acquiring a mutex, we could check it only every 1k operations or similar?
There was a problem hiding this comment.
Good instinct, but i recently learned that ctx.Err() is an atomic load since Go 1.20, not a mutex:
func (c *cancelCtx) Err() error {
if err := c.err.Load(); err != nil { ...The loop body does blockstore I/O per iteration, so the atomic read is negligible. Batching the check every Nth iteration would also delay cancellation response by up to N block fetches.
So.. better to keep as-is?
Addresses review feedback: #1124 (comment) #1124 (comment)
The reprovide interval is configured by the caller, not by boxo. Addresses review feedback: #1124 (comment)
Extract shared walkLoop for the common iterative DFS logic (stack management, tracker dedup, locality check, identity CID skip, emit). WalkDAG and WalkEntityRoots now differ only in their fetch callback. Addresses review feedback: #1124 (comment)
Extract shared newPinnedProvider for the common goroutine, emit, pin iteration, and direct-pin logic. NewUniquePinnedProvider and NewPinnedEntityRootsProvider now differ only in the walk callback. Addresses review feedback: #1124 (comment)
These are new functions, not fixes to existing ones. Addresses review feedback: #1124 (comment)
|
Thanks for feedback and reviews! |
Note
Summary
New
dag/walkerpackage for memory-efficient DAG traversal with bloom filter deduplication, plus new pinned-provider strategies that use it to avoid re-announcing duplicate blocks across pins.dag/walker(new package)VisitedTrackerinterface with two implementations:BloomTracker-- auto-scaling bloom filter chain (~4 bytes/CID vs ~75 for a map), with uncorrelated false positives across nodes (unique random SipHash keys per instance)MapTracker-- exact dedup for tests and small datasetsWalkDAG-- iterative DFS traversal with integrated dedup, codec-agnostic link extraction (dag-pb, dag-cbor, raw, and any registered codec). ~2x faster than the legacygo-ipld-primeselector-based walker.WalkEntityRoots-- entity-aware traversal that emits only file/directory/HAMT shard roots instead of every block, skipping internal file chunks.pinnerNewUniquePinnedProvider-- emits all blocks reachable from pins with cross-pin bloom dedup (recursive DAGs first, then direct pins).NewPinnedEntityRootsProvider-- same but emits only entity roots viaWalkEntityRoots.providerNewPrioritizedProvidernow continues to the next stream when one fails instead of stopping all streams.NewConcatProvideradded for pre-deduplicated streams that don't need thecidutil.StreamingSetoverhead.Other