
perf: eliminate redundant node clones during Trie::recalculate_root_hash()#6935

Open
cylewitruk-stacks wants to merge 3 commits into stacks-network:develop from cylewitruk-stacks:perf/marf-put-write-elision

Conversation


@cylewitruk-stacks cylewitruk-stacks commented Feb 26, 2026

Description

Background

Trie::recalculate_root_hash() is called on every MARF insert (regardless of trie hashing mode).

Currently, all ancestor nodes are pre-flushed with a zero-hash prior to being written with their final hash, which results in two write_nodetype() calls (and two full-node clones) per node.

The pre-flush exists to materialize a node's child pointers before get_trie_root_hash() performs Merkle skiplist ancestor lookups, but that lookup only happens for root nodes, and only when update_skiplist == true (i.e., non-Deferred hashing modes). Non-root ancestors never trigger ancestor lookups, and neither does the root in Deferred mode.

The Change

This PR gates the two-write pattern on is_root_node && update_skiplist, eliminating one full-node clone per ancestor per insert.
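The gated control flow can be sketched as follows. This is a simplified stand-in, not the real stackslib MARF API: `write_nodetype()`, `write_node_hash()`, and `update_skiplist` follow the PR description, but the `Storage` and `TrieHash` types here are minimal mocks that only record which kind of write occurred.

```rust
#[derive(Clone, Copy, Default, PartialEq, Debug)]
struct TrieHash([u8; 32]);

/// Records each write as (kind, hash) so the two paths can be compared.
#[derive(Default)]
struct Storage {
    writes: Vec<(&'static str, TrieHash)>,
}

impl Storage {
    /// Full-node write: in the real code this clones the whole node body.
    fn write_nodetype(&mut self, hash: TrieHash) {
        self.writes.push(("full", hash));
    }
    /// Hash-only update: patches the stored hash without cloning the node.
    fn write_node_hash(&mut self, hash: TrieHash) {
        self.writes.push(("hash_only", hash));
    }
}

/// The gated pattern: only a root node with skiplist updates enabled needs
/// the pre-flush; everything else gets a single full write.
fn write_final_hash(
    storage: &mut Storage,
    is_root_node: bool,
    update_skiplist: bool,
    final_hash: TrieHash,
) {
    if is_root_node && update_skiplist {
        // Pre-flush child pointers so skiplist ancestor lookups see them,
        // then patch in the final hash without re-cloning the node body.
        storage.write_nodetype(TrieHash::default());
        storage.write_node_hash(final_hash);
    } else {
        // Non-root ancestors, and the root in Deferred mode: one full write,
        // one clone.
        storage.write_nodetype(final_hash);
    }
}
```

Either way through the branch, at most one full-node clone happens per node per insert, versus two in the pre-change code.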

Applicable issues

  • Not tied to a specific issue; part of the 100x chain optimization effort.

Additional info (benefits, drawbacks, caveats)

Core Optimization:

  • New TrieStorageConnection::write_node_hash() function which updates only a node's stored hash without cloning and overwriting the full node body.
  • Non-root ancestor nodes: write final hash directly via write_nodetype() — one clone instead of two (both hash modes).
  • Root nodes:
    • In Deferred mode: write final hash directly via write_nodetype() — one clone instead of two (no skiplist lookup, no pre-flush needed).
    • In Immediate (and All) modes: child pointers are still pre-flushed via write_nodetype() before skiplist hash derivation, but the final write is now a hash-only update via write_node_hash() — one clone instead of two.

Minor Hygiene:

  • Minor cleanup of trace! messages and comments.

Benchmark Diff:

Using marf-bench from PR #6932: write-path allocation counts are down 3-8% and allocated bytes down 3-18% (with synthetic benchmark data), with some minor improvements in timings.

[marf-bench] Comparison summary
values: base:0317850e7f042de98e7bc6a1f26f6183e7d20f98 / target:36c78bd17a8cde53fd5f3700ad117d21dbc6ac5c / %delta
benchmark  name                                       total(ms) b/t       Δ  alloc_count b/t      Δ  alloc_bytes b/t       Δ
----------------------------------------------------------------------------------------------------------------------------
write      node256/depth=1/begin_block                  0.253/0.201  -20.6%          280/272  -2.9%    955098/929754   -2.7%
write      node256/depth=1/commit_flush                 8.959/9.705   +8.3%            12/12  +0.0%      68704/68704   +0.0%
write      node256/depth=1/fill_node16_to_capacity      0.134/0.084  -37.3%          296/272  -8.1%    415320/339288  -18.3%
write      node256/depth=1/fill_node48_to_capacity      0.362/0.229  -36.7%         1034/968  -6.4%  1346504/1141896  -15.2%
write      node256/depth=1/fill_node4_to_capacity       0.035/0.023  -34.3%            80/74  -7.5%     104028/85020  -18.3%
write      node256/depth=1/insert_first_leaf            0.023/0.015  -34.8%            56/54  -3.6%      75396/69060   -8.4%
write      node256/depth=1/promote_node16_to_node48     0.027/0.018  -33.3%            72/66  -8.3%      84288/69760  -17.2%
write      node256/depth=1/promote_node48_to_node256    0.027/0.020  -25.9%            78/72  -7.7%    125696/106688  -15.1%
write      node256/depth=1/promote_node4_to_node16      0.024/0.016  -33.3%            56/52  -7.1%      69440/56768  -18.2%
write      node256/depth=1/seal                         0.119/0.082  -31.1%          168/168  +0.0%      65700/65700   +0.0%
write      node256/depth=1/split_leaf_to_node4          0.028/0.019  -32.1%            70/66  -5.7%      69868/57196  -18.1%
write      noop/depth=1/begin_block                     0.334/0.276  -17.4%          274/266  -2.9%    945898/920554   -2.7%
write      noop/depth=1/commit_flush                    9.410/8.641   -8.2%            12/12  +0.0%      68704/68704   +0.0%
write      noop/depth=1/fill_node16_to_capacity         0.146/0.126  -13.7%          296/272  -8.1%    415320/339288  -18.3%
write      noop/depth=1/fill_node48_to_capacity         0.396/0.336  -15.2%         1034/968  -6.4%  1346504/1141896  -15.2%
write      noop/depth=1/fill_node4_to_capacity          0.039/0.033  -15.4%            80/74  -7.5%     104028/85020  -18.3%
write      noop/depth=1/insert_first_leaf               0.029/0.024  -17.2%            63/60  -4.8%      72632/63128  -13.1%
write      noop/depth=1/promote_node16_to_node48        0.030/0.025  -16.7%            72/66  -8.3%      84288/69760  -17.2%
write      noop/depth=1/promote_node48_to_node256       0.034/0.028  -17.6%            78/72  -7.7%    125696/106688  -15.1%
write      noop/depth=1/promote_node4_to_node16         0.029/0.024  -17.2%            63/59  -6.3%      69661/56989  -18.2%
write      noop/depth=1/seal                            0.133/0.114  -14.3%          168/168  +0.0%      65697/65697   +0.0%
write      noop/depth=1/split_leaf_to_node4             0.030/0.025  -16.7%            63/59  -6.3%      69654/56982  -18.2%
----------------------------------------------------------------------------------------------------------------------------

[marf-bench] Repeated comparison stats
baseline: base:0317850e7f042de98e7bc6a1f26f6183e7d20f98
comparison: target:36c78bd17a8cde53fd5f3700ad117d21dbc6ac5c
values: median/min/max %delta across 10 repeats
benchmark  name                                        total Δ med   total Δ min   total Δ max   count Δ med   bytes Δ med  repeats
-----------------------------------------------------------------------------------------------------------------------------------
write      node256/depth=1/begin_block                       -4.1%        -20.6%         +9.1%         -2.9%         -2.7%       10
write      node256/depth=1/commit_flush                      +1.8%        -21.7%        +12.5%         +0.0%         +0.0%       10
write      node256/depth=1/fill_node16_to_capacity           -3.0%        -37.3%         +6.3%         -8.1%        -18.3%       10
write      node256/depth=1/fill_node48_to_capacity           -2.3%        -36.7%         +1.8%         -6.4%        -15.2%       10
write      node256/depth=1/fill_node4_to_capacity            +0.0%        -34.3%         +6.3%         -7.5%        -18.3%       10
write      node256/depth=1/insert_first_leaf                 +0.0%        -34.8%         +9.1%         -3.6%         -8.4%       10
write      node256/depth=1/promote_node16_to_node48          +0.0%        -33.3%        +16.7%         -8.3%        -17.2%       10
write      node256/depth=1/promote_node48_to_node256         -3.6%        -25.9%        +21.4%         -7.7%        -15.1%       10
write      node256/depth=1/promote_node4_to_node16           +0.0%        -33.3%         +0.0%         -7.1%        -18.2%       10
write      node256/depth=1/seal                              +0.0%        -31.1%         +5.4%         +0.0%         +0.0%       10
write      node256/depth=1/split_leaf_to_node4               +0.0%        -32.1%         +0.0%         -5.7%        -18.1%       10
write      noop/depth=1/begin_block                         -10.5%        -21.9%        +15.7%         -2.9%         -2.7%       10
write      noop/depth=1/commit_flush                         +1.5%        -10.8%        +11.6%         +0.0%         +0.0%       10
write      noop/depth=1/fill_node16_to_capacity              -2.5%        -13.7%         +1.7%         -8.1%        -18.3%       10
write      noop/depth=1/fill_node48_to_capacity              -2.1%        -15.3%         +5.3%         -6.4%        -15.2%       10
write      noop/depth=1/fill_node4_to_capacity               -2.9%        -30.4%        +12.5%         -7.5%        -18.3%       10
write      noop/depth=1/insert_first_leaf                    -7.1%        -36.8%         +8.3%         -4.8%        -13.1%       10
write      noop/depth=1/promote_node16_to_node48             -3.6%        -16.7%        +21.4%         -8.3%        -17.2%       10
write      noop/depth=1/promote_node48_to_node256           -10.0%        -33.3%         +7.1%         -7.7%        -15.1%       10
write      noop/depth=1/promote_node4_to_node16              +0.0%        -17.2%         +8.3%         -6.3%        -18.2%       10
write      noop/depth=1/seal                                 -3.3%        -14.3%         +3.5%         +0.0%         +0.0%       10
write      noop/depth=1/split_leaf_to_node4                  +0.0%        -82.1%         +0.0%         -6.3%        -18.2%       10
-----------------------------------------------------------------------------------------------------------------------------------

[marf-bench] Repeat confidence summary
baseline: base:0317850e7f042de98e7bc6a1f26f6183e7d20f98
comparison: target:36c78bd17a8cde53fd5f3700ad117d21dbc6ac5c
values: total_ms stability across 10 repeats
rows: total=22 stable=9 high-jitter=13  (high-jitter means min<0<max and spread>=30.0%)
top high-jitter rows (by spread):
  write / node256/depth=1/promote_node16_to_node48  median=+0.0%  min=-33.3%  max=+16.7%  spread=50.0%
  write / node256/depth=1/promote_node48_to_node256  median=-3.6%  min=-25.9%  max=+21.4%  spread=47.4%
  write / noop/depth=1/insert_first_leaf  median=-7.1%  min=-36.8%  max=+8.3%  spread=45.2%
  write / node256/depth=1/insert_first_leaf  median=+0.0%  min=-34.8%  max=+9.1%  spread=43.9%
  write / node256/depth=1/fill_node16_to_capacity  median=-3.0%  min=-37.3%  max=+6.3%  spread=43.7%
  write / noop/depth=1/fill_node4_to_capacity  median=-2.9%  min=-30.4%  max=+12.5%  spread=42.9%
  write / node256/depth=1/fill_node4_to_capacity  median=+0.0%  min=-34.3%  max=+6.3%  spread=40.5%
  write / noop/depth=1/promote_node48_to_node256  median=-10.0%  min=-33.3%  max=+7.1%  spread=40.5%
  write / node256/depth=1/fill_node48_to_capacity  median=-2.3%  min=-36.7%  max=+1.8%  spread=38.5%
  write / noop/depth=1/promote_node16_to_node48  median=-3.6%  min=-16.7%  max=+21.4%  spread=38.1%

Checklist

  • Test coverage for new or modified code paths
  • Changelog is updated


Copilot AI left a comment


Pull request overview

This PR optimizes MARF trie insert performance by reducing redundant node cloning/writes during Trie::recalculate_root_hash(), especially when computing the root hash with skiplist updates enabled.

Changes:

  • Gate the “pre-flush then final write” behavior to only the root node when update_skiplist == true.
  • Add TrieStorageConnection::write_node_hash() to update only a node’s stored hash in the uncommitted trie state (avoiding a full node clone/write).
  • Minor logging/message and comment hygiene updates in trie operations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • stackslib/src/chainstate/stacks/index/trie.rs: Avoids redundant pre-flush writes for non-root ancestors; uses a hash-only update for the root's final write when updating the skiplist.
  • stackslib/src/chainstate/stacks/index/storage.rs: Introduces TrieStorageConnection::write_node_hash() to support hash-only updates in uncommitted state.



codecov bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 72.41379% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.02%. Comparing base (b61a766) to head (f6f9f82).

Files with missing lines                          Patch %   Lines
stackslib/src/chainstate/stacks/index/storage.rs  63.63%    4 Missing ⚠️
stackslib/src/chainstate/stacks/index/trie.rs     77.77%    4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #6935       +/-   ##
============================================
+ Coverage    59.40%   70.02%   +10.61%     
============================================
  Files          412      412               
  Lines       218187   218190        +3     
  Branches       338      338               
============================================
+ Hits        129610   152778    +23168     
+ Misses       88577    65412    -23165     
Files with missing lines                          Coverage Δ
stackslib/src/chainstate/stacks/index/storage.rs  80.79% <63.63%> (+0.36%) ⬆️
stackslib/src/chainstate/stacks/index/trie.rs     87.09% <77.77%> (+4.01%) ⬆️

... and 317 files with indirect coverage changes


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data


Contributor

@federico-stacks federico-stacks left a comment


The change makes sense to me. Just added a nit.

let root_ptr = storage.root_trieptr();
let is_root_node = node.is_node256() && ptr == root_ptr;

if is_root_node && update_skiplist {
Contributor


nit: I noticed this gate condition appears twice within the method. Would it make sense to extract it into a local variable (e.g., should_flush_root or a similar name) to avoid duplication and improve readability?

Author

@cylewitruk-stacks cylewitruk-stacks Feb 26, 2026


I was actually concerned that that might reduce readability instead as this already reads pretty clearly. The "cleanest" alternative to me probably would have been to let is_preflushed_root = if is_root_node && update_skiplist { ...; true } else { false };-ish and gated the below on that, which would then read more like a control flow over a previous side-effect. 🤷
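The alternative shape described above can be sketched like this. Everything here is illustrative: the function and flag names are hypothetical, and a dummy log entry stands in for the pre-flush write and the two final-write variants.

```rust
/// Demonstrates the "flag over a previous side-effect" shape: the condition
/// is tested once, the pre-flush side effect happens inside that test, and
/// later logic gates on the resulting flag rather than re-testing.
fn gate_with_flag(
    is_root_node: bool,
    update_skiplist: bool,
    log: &mut Vec<&'static str>,
) -> bool {
    let is_preflushed_root = if is_root_node && update_skiplist {
        log.push("pre-flush"); // side effect performed exactly once, here
        true
    } else {
        false
    };
    // Later code branches on the flag, which now reads as control flow
    // over a previous side-effect rather than a repeated condition.
    if is_preflushed_root {
        log.push("hash-only final write");
    } else {
        log.push("single full write");
    }
    is_preflushed_root
}
```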

Contributor


not a blocker at all! Feel free to resolve the comment if you think otherwise :)

Author

@cylewitruk-stacks cylewitruk-stacks Feb 26, 2026


I'll leave it open to see if anyone else has any preferences :) I don't have any strong opinions either way.

// Flush the root node's pointers before calculating the skiplist hash.
// Root hash derivation performs ancestor lookups that expect the current
// trie structure to be materialized. Not needed when skipping the skiplist.
storage.write_nodetype(ptr.ptr(), &node, TrieHash::default())?;
Member


Nit: it's equivalent code, but just for defensive purposes, can you continue to use TrieHash([0; 32])?

Author


How would you feel about using a TrieHash::ZERO or similar const (here and elsewhere)? I'm just a bit averse to inline "magic values" -- but I'm fine reverting it as well.
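The suggested const could look something like this. This is a hypothetical sketch of the `TrieHash::ZERO` idea floated above, not the actual stackslib definition of `TrieHash`.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct TrieHash(pub [u8; 32]);

impl TrieHash {
    /// Named constant for the all-zero placeholder hash, replacing the
    /// inline `TrieHash([0; 32])` literal at call sites.
    pub const ZERO: TrieHash = TrieHash([0u8; 32]);
}
```

An associated const keeps the zero-value idiom greppable and self-documenting while staying a compile-time constant, so it costs nothing over the inline literal.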

// however, since we're going to update the hash in the next write anyways, just write an empty buff
storage.write_nodetype(ptr.ptr(), &node, TrieHash([0; 32]))?;
let root_ptr = storage.root_trieptr();
let is_root_node = node.is_node256() && ptr == root_ptr;
Member


nit: ptr == root_ptr should be both necessary and sufficient. The code already ensures that the root is always a node256.

Author


You're right, I must've overlooked the explicit check above 👍 Will fix when I'm in a place to switch back to this branch.

Member

@jcnelson jcnelson left a comment


Couple nits but LGTM. Will approve once resolved. Thanks @cylewitruk-stacks!
