Skip to content

refactor: msg pool to make more structured part 2#7006

Open
akaladarshi wants to merge 6 commits intomainfrom
akaladarshi/msgpool-refactor-phase2
Open

refactor: msg pool to make more structured part 2#7006
akaladarshi wants to merge 6 commits intomainfrom
akaladarshi/msgpool-refactor-phase2

Conversation

@akaladarshi
Copy link
Copy Markdown
Collaborator

@akaladarshi akaladarshi commented May 6, 2026

Summary of changes

Changes introduced in this pull request:

  • This PR is part 2 of restructuring of msg pool, it contains:

    • Local store which holds all the data related to the local state in the msgpool
    • Republish state holds structures for handling the republishing of the messages in the mempool
    • Created a cache struct which holds all the caches related to msgpool
  • This change should be applied on top of the refactor: msg pool to make more structured #6965

  • Next part will have major changes:

    • Use Arc on MessagePool itself rather than each individual field, this will allow us to:
      • Pass the message pool directly into the headchange trigger which will become part of the MessagePool, instead of being a free function with unlimited params
      • Convert republish_cycle part of the MessagePool, instead of being a free function with unlimited params

Reference issue to close (if applicable)

Part of #7010

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • Refactor
    • Consolidated multiple internal caches into a single cache container and simplified republishing coordination across the message pool.
  • New Features
    • Added a dedicated republish state and trigger mechanism to track and drive pending-message republishing more reliably.
  • Tests
    • Updated tests to align with the new cache and republish state behavior.

@akaladarshi akaladarshi requested a review from a team as a code owner May 6, 2026 07:19
@akaladarshi akaladarshi requested review from hanabi1224 and sudo-shashank and removed request for a team May 6, 2026 07:19
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 6, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4eae1e94-53b9-4c0c-9b54-0495e20f82af

📥 Commits

Reviewing files that changed from the base of the PR and between e90ee4c and 92e806c.

📒 Files selected for processing (1)
  • src/message_pool/msgpool/republish.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/message_pool/msgpool/republish.rs

Walkthrough

Introduces RepublishState for republish coordination and Caches to centralize LRU caches; refactors MessagePool to use these, makes head_change synchronous and consult RepublishState, updates background tasks and tests to the new APIs.

Changes

Message Pool State Consolidation

Layer / File(s) Summary
New Data Structures
src/message_pool/msgpool/republish.rs, src/message_pool/msgpool/msg_pool.rs
Adds RepublishState (CID tracking + bounded trigger) and Caches struct bundling bls_sig, sig_val, key, state_nonce caches with constructor.
Module Declarations
src/message_pool/msgpool/mod.rs
Adds republish submodule and updates imports to expose StateNonceCacheKey and RepublishState.
MessagePool Struct
src/message_pool/msgpool/msg_pool.rs
Replaces standalone cache fields and ad-hoc republish/local-message fields with caches: Caches, republish: Arc<RepublishState>, and local_addrs: Arc<SyncRwLock<HashSet<Address>>>; removes load_local.
Constructor and Task Spawning
src/message_pool/msgpool/msg_pool.rs
MessagePool::new initializes Caches and RepublishState; spawned head-change and republish tasks capture and use the new shared state fields.
Method Integration
src/message_pool/msgpool/*.rs
All internal methods updated to use self.caches.* and new republish parameter(s): resolve_to_key, verify_msg_sig, add_helper, get_state_sequence, messages_for_blocks, apply_head_change, get_pending_messages.
Republish Coordination Logic
src/message_pool/msgpool/mod.rs
republish_pending_messages iterates local_addrs to build candidate set and calls republish.replace_with(...); head_change switched to sync, checks republish.was_republished() and calls republish.trigger()? when needed.
Tests and Helpers
src/message_pool/msgpool/republish.rs, src/message_pool/msgpool/mod.rs
Unit tests added for RepublishState; test helper create_fake_smsg updated to push into pool.caches.sig_val.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • LesnyRumcajs
  • sudo-shashank
  • hanabi1224
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'refactor: msg pool to make more structured part 2' partially describes the changeset. While the PR does involve message pool refactoring, the title is vague and generic—it uses the non-descriptive phrase 'make more structured' without specifying what actual structural changes were made. Replace the vague phrase with specific structural changes: e.g., 'refactor: consolidate msgpool caches and republish state' or 'refactor: introduce Caches and RepublishState structs in msgpool'.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch akaladarshi/msgpool-refactor-phase2
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch akaladarshi/msgpool-refactor-phase2

Comment @coderabbitai help to get the list of available commands and usage tips.

@akaladarshi akaladarshi requested a review from LesnyRumcajs May 6, 2026 07:19
Base automatically changed from akaladarshi/msgpool-refactor to main May 6, 2026 15:47
@akaladarshi akaladarshi force-pushed the akaladarshi/msgpool-refactor-phase2 branch from dd74a63 to a86d8c4 Compare May 7, 2026 07:15
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/message_pool/msgpool/local_store.rs`:
- Around line 26-28: The add() method currently unconditionally appends
resolved_from to self.local_addrs causing duplicates; change add() to avoid
duplicates by inserting only if the address is not already present (e.g., check
self.local_addrs.read().contains(&resolved_from) or convert local_addrs to a
HashSet and insert), so known_local_addrs() no longer grows per-message and
republish_pending_messages() won't re-resolve the same sender repeatedly; update
any code that assumes a Vec to handle the new container if you switch to
HashSet.

In `@src/message_pool/msgpool/mod.rs`:
- Around line 275-284: The republish trigger is using
RepublishState::mark_republished (which inserts) causing the logic to wake on
new CIDs instead of on CIDs already republished; change the check to a read-only
membership test by calling a new or existing RepublishState::was_republished
(implement it to return republished.contains(cid) without mutating state) and
use that in both loops (the branches around mpool_ctx.remove_from_selected_msgs
and the repub flag) so you only set repub = true when the CID was already in the
republished set.

In `@src/message_pool/msgpool/msg_pool.rs`:
- Around line 485-493: The load_local() implementation iterates
LocalStore::snapshot_msgs() (a HashSet) in non-deterministic order which causes
add() to fail with sequencing errors (SequenceTooLow, NonceGap,
DuplicateSequence) and may silently drop messages; fix by collecting
snapshot_msgs() into a vector, sort it deterministically by sender and
message().sequence before iterating, then call self.add(...) for each; update
the add() error handling in the closure used in load_local() so SequenceTooLow
still triggers local.remove_msg(&k) but other errors are either logged/warned
(including error kind) and left in local_msgs (or retried) rather than silently
ignored, referencing load_local, LocalStore::snapshot_msgs, add,
local.remove_msg, and the Error variants
SequenceTooLow/NonceGap/DuplicateSequence.

In `@src/message_pool/msgpool/republish.rs`:
- Around line 39-44: The trigger() method currently uses
self.trigger.send_async(()).await which can await and block head_change() when
the 4-slot wakeup buffer is full; replace the await send with a non-blocking
self.trigger.try_send(()) and treat a Full error as a no-op (return Ok(()))
because a full buffer already indicates a pending wake, while mapping other
errors into Error::Other with the error details. Keep the function signature and
callers (head_change(), republish_pending_messages()) unchanged; only change
send_async() -> try_send() and handle TrySendError::Full by dropping the signal
and returning Ok(()) while converting other TrySendError variants into the
existing Error::Other format.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 36a9a2fb-5de4-4a0e-85d3-0e9d2a8e6759

📥 Commits

Reviewing files that changed from the base of the PR and between 868b7c2 and a86d8c4.

📒 Files selected for processing (5)
  • src/message_pool/msgpool/local_store.rs
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/republish.rs
  • src/message_pool/msgpool/selection.rs

Comment thread src/message_pool/msgpool/local_store.rs Outdated
Comment thread src/message_pool/msgpool/mod.rs
Comment thread src/message_pool/msgpool/msg_pool.rs Outdated
Comment thread src/message_pool/msgpool/republish.rs Outdated
@akaladarshi akaladarshi marked this pull request as draft May 7, 2026 07:29
@akaladarshi akaladarshi added the RPC requires calibnet RPC checks to run on CI label May 7, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 94.59459% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.24%. Comparing base (868b7c2) to head (92e806c).
⚠️ Report is 6 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/message_pool/msgpool/mod.rs 78.57% 2 Missing and 1 partial ⚠️
src/message_pool/msgpool/msg_pool.rs 94.87% 1 Missing and 1 partial ⚠️
src/message_pool/msgpool/republish.rs 98.21% 0 Missing and 1 partial ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
src/message_pool/msgpool/selection.rs 86.82% <100.00%> (ø)
src/message_pool/msgpool/republish.rs 98.21% <98.21%> (ø)
src/message_pool/msgpool/msg_pool.rs 88.36% <94.87%> (+0.72%) ⬆️
src/message_pool/msgpool/mod.rs 91.17% <78.57%> (-0.13%) ⬇️

... and 25 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 868b7c2...92e806c. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@akaladarshi akaladarshi marked this pull request as ready for review May 8, 2026 15:36
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/message_pool/msgpool/republish.rs (1)

38-42: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

trigger() should treat a full channel as success, not an error.

The current implementation maps all try_send errors (including Full) to an error. However, a full buffer simply means the republish task is already scheduled to wake—the signal should be dropped silently rather than failing head_change().

Proposed fix
     pub(in crate::message_pool) fn trigger(&self) -> Result<(), Error> {
-        self.trigger
-            .try_send(())
-            .map_err(|e| Error::Other(format!("Republish receiver dropped: {e}")))
+        match self.trigger.try_send(()) {
+            Ok(()) | Err(flume::TrySendError::Full(())) => Ok(()),
+            Err(flume::TrySendError::Disconnected(())) => {
+                Err(Error::Other("Republish receiver dropped".to_owned()))
+            }
+        }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/message_pool/msgpool/republish.rs` around lines 38 - 42, The trigger()
method currently maps all try_send errors to Error::Other; change it to treat a
Full error as success (drop the signal silently) and only return an Err for
Disconnected (or other non-Full) failures. Locate the pub(in
crate::message_pool) fn trigger(&self) and adjust the try_send error handling so
that match/if distinguishes tokio::sync::mpsc::error::TrySendError::Full =>
Ok(()) and returns Error::Other(...) for the disconnected case, ensuring callers
like head_change() no longer fail when the channel buffer is full.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/message_pool/msgpool/republish.rs`:
- Around line 38-42: The trigger() method currently maps all try_send errors to
Error::Other; change it to treat a Full error as success (drop the signal
silently) and only return an Err for Disconnected (or other non-Full) failures.
Locate the pub(in crate::message_pool) fn trigger(&self) and adjust the try_send
error handling so that match/if distinguishes
tokio::sync::mpsc::error::TrySendError::Full => Ok(()) and returns
Error::Other(...) for the disconnected case, ensuring callers like head_change()
no longer fail when the channel buffer is full.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6e5f7773-6a80-40ca-857e-c3ffe9e85a90

📥 Commits

Reviewing files that changed from the base of the PR and between a86d8c4 and e90ee4c.

📒 Files selected for processing (3)
  • src/message_pool/msgpool/mod.rs
  • src/message_pool/msgpool/msg_pool.rs
  • src/message_pool/msgpool/republish.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

RPC requires calibnet RPC checks to run on CI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant