feat(core): add skiplist to CheckPoint for faster traversal#2048
feat(core): add skiplist to CheckPoint for faster traversal#2048evanlinjin wants to merge 3 commits intobitcoindevkit:masterfrom
Conversation
|
Guys, this is purely done by Claude. I haven't reviewed it yet. |
153c401 to
098c076
Compare
Performance Benchmark ComparisonBenchmarks comparing the old O(n) implementation vs new skiplist O(√n) implementation for a 10,000 checkpoint chain: 🎯 Key Results
📊 Detailed BenchmarksFinding checkpoint at position 100 (from 10k chain):
Finding checkpoint at position 9000 (from 10k chain):
* Note: The linear_traversal benchmark shows the new implementation is slightly slower because it's doing the same linear traversal but with additional overhead from the skip/index fields. The real performance gains come from using the skiplist-aware methods like SummaryThe skiplist implementation provides massive performance improvements for checkpoint lookups, especially for deep searches in long chains. The O(√n) complexity is clearly demonstrated with 200x+ speedups in real-world scenarios. |
5544fee to
4b9ccd1
Compare
It's now fully reviewed by myself! Made many simplifications. Let's merge #2055 and rebase this on top of that! |
Skiplist Performance UpdateAfter the optimizations, here are the updated benchmark results:
|
| Benchmark | Time | Notes |
|---|---|---|
get_100_near_start |
475.89 ns | Get checkpoint near start of 100-item chain |
get_1000_middle |
31.07 ns | Get checkpoint in middle of 1000-item chain |
get_10000_near_end |
57.12 ns | Get checkpoint near end of 10000-item chain |
get_10000_near_start |
535.37 ns | Get checkpoint near start of 10000-item chain |
floor_at() Performance
| Benchmark | Time | Notes |
|---|---|---|
floor_at_1000 |
286.33 ns | Floor at height 750 in 1000-item chain |
floor_at_10000 |
673.27 ns | Floor at height 7500 in 10000-item chain |
range() Performance
| Benchmark | Time | Notes |
|---|---|---|
range_1000_middle_10pct |
1.67 µs | Range 450..=550 in 1000-item chain |
range_10000_large_50pct |
97.59 µs | Range 2500..=7500 in 10000-item chain |
range_10000_from_start |
3.11 µs | Range ..=100 in 10000-item chain |
range_10000_near_tip |
1.21 µs | Range 9900.. in 10000-item chain |
range_single_element |
942.21 ns | Range 5000..=5000 in 10000-item chain |
Traversal Comparison
| Benchmark | Time | Notes |
|---|---|---|
linear_traversal_10000 |
140.90 µs | Linear search to height 100 in 10000-item chain |
skiplist_get_10000 |
539.80 ns | Skip-enhanced search to height 100 in 10000-item chain |
Speedup: 261x faster with skip pointers!
Summary
The skip list implementation successfully achieves O(√n) time complexity for search operations. Key improvements from our optimizations:
- Cleaner two-phase traversal in
get()andrange() - Simplified
floor_at()from 33 lines to 1 line - Restored elegant
insert()implementation (removed 60+ lines) - Refactored
push()with clearer skip pointer logic
All tests pass and the implementation is now both performant and maintainable.
1455ce9 to
8986465
Compare
|
I think it should have more than a single level to achieve the optimal performance, but I'm not sure if that's possible without implementing a new type. |
There was a problem hiding this comment.
This will be an improvement over linked list for sure, but I would like to combine efforts with block-graph/skiplist to avoid duplicated work.
I find some components of the skip list too tied up to the underlying data, making difficult to resonate about the functionality of the skip list by itself. I would like to detach the skiplist logic from the underlying data.
There is a lot of room for improvement for skiplist applied to our particular use case, so I'm positive there will be a lot of changes to a structure like it. I would try to make smaller PRs to check the improvements, but get results early.
In that sense, the benchmark is going to be very handy to ensure we are making progress.
A one level, fixed skip interval is a great starting point for that.
86b0169 to
625b581
Compare
@ValuedMammal I think the current state of the PR is a good balance between performance and simplicity (for now).
Here is a performance summary by Claude:
|
The current state of the PR is self-contained and reviewable - the changes are all internal changes to Are you wanting to detach because you see this skiplist logic being used elsewhere? The current skiplist implementation has domain-specific simplifications which won't exist in a full skiplist implementation (i.e. append-only in our
I don't think splitting this PR into multiple PRs makes sense here. This is a single atomic change to We may need a new internal type to make multi-level skiplist possible, and thus achieve optimal performance. I'm not opposed to this idea. However, I think this PR is a simple change that achieves good enough performance. |
|
You've exposed good points and I agree with all of them. Will review again |
625b581 to
379981e
Compare
|
I was about to open a PR for this. I'm hitting this problem running an LDK Node on mutinynet with bitcoind as chain source where there's a 2-second poll that calls I took a different approach adding a |
|
@martinsaposnic would you be open to benchmark your case against this branch? |
379981e to
da63591
Compare
Adds a skip pointer (every 100 checkpoints by index) and an index field to CheckPoint to accelerate get(), floor_at(), and range(). push() and insert() maintain the index/skip invariants on the rebuilt chain. This is a ~100x constant-factor speedup on dense chains, not a true O(sqrt(n)) bound: a fixed interval k gives O(n/k + k), which is asymptotically linear. For BDK's realistic size regime (up to ~1M dense checkpoints on a server), that constant factor is ample -- a full-chain get() drops from ~1M pointer chases to ~10k (tens of microseconds). Motivated by the server-side scenario reported by @martinsaposnic in bitcoindevkit#2048 where Wallet::transactions() exceeded a 2-second poll budget on LDK-on-mutinynet with bitcoind as chain source, due to O(n) checkpoint traversal over a dense chain. See: bitcoindevkit#2048 (comment) Benchmarks show ~265x speedup for deep searches in 10k checkpoint chains (linear traversal ~108us vs skiplist get ~407ns). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
da63591 to
77801bf
Compare
|
Pushed a follow-up commit bumping RationaleTraversal cost with a fixed skiplist interval Concretely
MemoryNo meaningful difference — TradeoffChains shorter than ~1000 nodes no longer create any skip pointers. That's fine — linear traversal at Left (Edited: corrected an inaccurate claim about memory overhead — |
bfb6a64 to
09ef0b7
Compare
Adds a skip pointer (every 100 checkpoints by index) and an index field to CheckPoint to accelerate get(), floor_at(), and range(). push() and insert() maintain the index/skip invariants on the rebuilt chain. This is a ~100x constant-factor speedup on dense chains, not a true O(sqrt(n)) bound: a fixed interval k gives O(n/k + k), which is asymptotically linear. For BDK's realistic size regime (up to ~1M dense checkpoints on a server), that constant factor is ample -- a full-chain get() drops from ~1M pointer chases to ~10k (tens of microseconds). Motivated by the server-side scenario reported by @martinsaposnic in bitcoindevkit#2048 where Wallet::transactions() exceeded a 2-second poll budget on LDK-on-mutinynet with bitcoind as chain source, due to O(n) checkpoint traversal over a dense chain. See: bitcoindevkit#2048 (comment) Benchmarks show ~265x speedup for deep searches in 10k checkpoint chains (linear traversal ~108us vs skiplist get ~407ns). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
09ef0b7 to
75e6fdd
Compare
Adds a skip pointer (every 100 checkpoints by index) and an index field to CheckPoint to accelerate get(), floor_at(), and range(). push() and insert() maintain the index/skip invariants on the rebuilt chain. This is a ~100x constant-factor speedup on dense chains, not a true O(sqrt(n)) bound: a fixed interval k gives O(n/k + k), which is asymptotically linear. For BDK's realistic size regime (up to ~1M dense checkpoints on a server), that constant factor is ample -- a full-chain get() drops from ~1M pointer chases to ~10k (tens of microseconds). Motivated by the server-side scenario reported by @martinsaposnic in bitcoindevkit#2048 where Wallet::transactions() exceeded a 2-second poll budget on LDK-on-mutinynet with bitcoind as chain source, due to O(n) checkpoint traversal over a dense chain. See: bitcoindevkit#2048 (comment) Benchmarks show ~265x speedup for deep searches in 10k checkpoint chains (linear traversal ~108us vs skiplist get ~407ns). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skiplist's fixed interval k trades off as O(n/k + k), minimized when k ~ sqrt(n). The motivating workload (dense server chains near the current Bitcoin tip, n ~ 1M) sits far above the k=100 sweet spot of n ~ 10k. Bumping to k=1000 brings the interval closer to sqrt(1M) and yields ~5x better worst-case traversal for that case (roughly 2k hops instead of 10k). Memory is unchanged: Option<Arc<CPInner>> is niche-optimized to 8 bytes regardless of Some/None, so every node carries the same skip field, and skip pointers reference existing chain nodes (no new heap allocations -- just refcount bumps on already-allocated ArcInners). k only affects traversal performance, not footprint. Smaller chains (n < 1000) no longer gain anything from the skiplist, but linear traversal at that scale is already microseconds -- not a workload we need to optimize. Update test_skiplist_indices to verify skip pointer placement at the new interval using a 5000-node chain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
75e6fdd to
aeb4d00
Compare
Addresses review feedback from @nymius: existing benches use fixed targets, which can land favorably or unfavorably relative to skip pointer positions and don't reflect real query patterns. The new bench draws 256 targets from a deterministic xorshift sequence and runs both a skiplist-enhanced get() and a plain linear walk over a 100k-node chain, so the same query stream exercises both paths. 100k is large enough to show the skiplist win clearly (100× fewer hops at k=1000) without slowing harness setup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
I recommend looking at how it's done in block-graph and read up on the blog post for a conceptual overview. |
Description
Adds a
skippointer andindexfield toCheckPointto accelerate traversal on dense chains. A single skip pointer is set every 1000 checkpoints (by index), giving a several-hundred-× constant-factor speedup forget(),floor_at(), andrange()on chains near the current Bitcoin tip height.This is a constant-factor speedup, not O(√n) — a fixed interval
kgives O(n/k + k), still asymptotically linear. For BDK's realistic size regime, though, the constant factor is ample: on a ~1M-checkpoint dense chain, positional lookups drop from ~1M hops to roughly 1–2k (tens of microseconds), well under any sensible latency budget.Why it's useful
@martinsaposnic reported in this comment that
Wallet::transactions()exceeded a 2-second poll budget on LDK-on-mutinynet with bitcoind as chain source, due to O(n) checkpoint traversal over a dense chain. The skiplist pulls this scenario far under budget.Notes to the reviewers
k = 1000was chosen because the fixed-interval skiplist is minimized atk ≈ √n, and the motivating workload sits aroundn ≈ 1M(√n ≈ 1000). Smaller values bias the win toward chains ofn ≈ 10k, where linear is already fast enough.insert()rebuilds the affected portion of the chain via the existingextend/pushpaths, so index/skip invariants are maintained automatically.Changelog notice
Added
skippointer andindexfields onCheckPoint.get(),floor_at(), andrange()on dense chains.Checklists
All Submissions:
New Features:
🤖 Generated with Claude Code