fix(l1): retry GetPooledTransactions on alternate peer when blob sidecar mismatches#6691
fix(l1): retry GetPooledTransactions on alternate peer when blob sidecar mismatches#6691edg-l wants to merge 1 commit into
Conversation
|
Lines of code reportTotal lines added: Detailed view |
🤖 Codex Code Review
Static review only; I couldn’t run the targeted tests because Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
🤖 Claude Code ReviewNow I have the full picture. Let me write the review. PR Review: fix(l1): retry GetPooledTransactions on alternate peer when blob sidecar mismatchesSummaryThis PR correctly addresses a real gap: when peer A's blob tx response fails validation, there was no mechanism to retry against peer B who had announced the same hash. The approach — a per-hash alternates queue in the mempool plus retry plumbing in the connection server — is well-scoped and the implementation is generally sound. CorrectnessTiming of The most important invariant to verify: For teardown and Alternates registration skips pool hits, not just in-flight
if unknown_set.contains(hash) || inner.transaction_pool.contains_key(hash) {
continue;
}Hashes already accepted into the pool are correctly skipped — no alternate is registered for a transaction we already have. Good.
Correctly drops the alternates entry the moment a transaction lands in the pool, preventing stale retries. One edge case: partial batch acceptance — if a
The borrow of Potential Issues1. let to_request: Vec<H256> = match state
.blockchain
.mempool
.reserve_unknown_hashes(&msg.hashes, state.node.node_id())When a 3rd peer later announces the same hashes while peer B has them in-flight, that 3rd peer is correctly registered as an alternate for peer B's request. This produces a natural retry chain up to depth 8. No issue — just confirming the cascade is intentional. 2. In let trimmed = announcement.filter_to(&alt_hashes);
conn.enqueue_tx_requests(trimmed, alt_hashes);Peer A's 3. Duplicated constant in tests ( /// Mirrors the private cap inside the mempool. Kept in sync manually.
const MAX_ALTERNATES_PER_HASH: usize = 8;A manual-sync comment on a constant that gates a correctness test is fragile. The
4. inner.alternates.retain(|_, (_, last_seen)| now.duration_since(*last_seen) < ttl);
Minor Observations
Test CoverageThe 7 unit tests are thorough and cover all the interesting cases: primary exclusion, FIFO ordering, capacity enforcement, deduplication, clearing, and TTL pruning. The only gap is the integration-level test for the Hive SummaryThe core logic is correct. The two actionable items before merging:
The Automated review by Claude (Anthropic) · sonnet · custom prompt |
Greptile SummaryThis PR fixes the Hive devp2p
Confidence Score: 3/5The primary bad-blob disconnect path is correctly ordered and fixes the Hive test; the teardown, sweep, and flush-failure branches have a call-order issue that can cause a retry to be silently lost in multi-threaded environments. The fix path that matters for the Hive test correctly clears in-flight before handing off to the alternate peer. However, the three secondary retry paths (teardown x2, stale-sweep, send-failure) call The call sites in
|
| Filename | Overview |
|---|---|
| crates/networking/p2p/rlpx/connection/server.rs | Adds retry_on_alternates helper and wires it into every failure path; the disconnect paths correctly call clear_in_flight_txs before the retry, but the teardown, sweep, and flush-failure paths call it after, creating a multi-threaded race where the retry can silently no-op while consuming the alternates entry. |
| crates/blockchain/mempool.rs | Adds alternates map with bounded per-hash queue (cap 8, FIFO eviction, dedup, timestamp-based pruning); logic is correct under the write lock and the fast path is unchanged for all-new-hash announcements. |
| crates/networking/p2p/peer_table.rs | Adds GetPeerConnection request handler — straightforward delegation to the existing peers map, only exercised on failure paths. |
| crates/networking/p2p/rlpx/eth/transactions.rs | Threads the announcer: H256 parameter through to reserve_unknown_hashes; mechanical change with no logic impact beyond the new parameter. |
| crates/networking/p2p/tx_broadcaster.rs | Piggybacks prune_alternates on the existing 6-minute prune tick; straightforward addition with no correctness concerns. |
| test/tests/blockchain/mempool_tests.rs | 7 new unit tests cover the key alternates properties (primary-vs-alternate, FIFO, cap, dedup, clear, prune); the only concern is a manually-mirrored MAX_ALTERNATES_PER_HASH constant. |
Sequence Diagram
sequenceDiagram
participant PeerA
participant ServerA as ConnectionServer(A)
participant Mempool
participant PeerTable
participant ServerB as ConnectionServer(B)
participant PeerB
PeerA->>ServerA: NewPooledTransactionHashes [h1,h2]
ServerA->>Mempool: reserve_unknown_hashes([h1,h2], A)
Mempool-->>ServerA: "[h1,h2] - A is primary"
PeerB->>ServerB: NewPooledTransactionHashes [h1,h2]
ServerB->>Mempool: reserve_unknown_hashes([h1,h2], B)
Mempool-->>ServerB: "[] - B recorded as alternate"
ServerA->>PeerA: GetPooledTransactions [h1,h2]
PeerA-->>ServerA: PooledTransactions (bad blob)
ServerA->>Mempool: clear_in_flight_txs([h1,h2])
ServerA->>Mempool: pop_alternates([h1,h2])
Mempool-->>ServerA: "peer B -> [h1,h2]"
ServerA->>PeerTable: get_peer_connection(B)
PeerTable-->>ServerA: PeerConnection(B)
ServerA->>ServerB: enqueue_tx_requests([h1,h2])
ServerA->>PeerA: Disconnect
ServerB->>Mempool: reserve_unknown_hashes([h1,h2], B)
Mempool-->>ServerB: "[h1,h2] re-marked in-flight under B"
ServerB->>PeerB: GetPooledTransactions [h1,h2]
PeerB-->>ServerB: PooledTransactions (valid)
ServerB->>Mempool: add_transaction + clear_alternates
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
crates/networking/p2p/rlpx/connection/server.rs:271-295
**`retry_on_alternates` called before `clear_in_flight_txs` in teardown (and sweep/flush paths)**
In all three non-disconnect paths — `teardown` (×2), `sweep_inflight_txs`, and the send-failure branch of `flush_pending_tx_requests` — `retry_on_alternates` is called before `clear_in_flight_txs`. In a multi-threaded tokio runtime the alternate actor's `handle_enqueue_tx_requests` can be scheduled between the mailbox enqueue (inside `retry_on_alternates`) and the subsequent `clear_in_flight_txs` call. When that race fires, the alternate's `reserve_unknown_hashes` finds the hashes still marked in-flight and returns empty, so the retry silently does nothing — but `pop_alternates` already consumed that alternates entry. The hashes end up not in-flight, not in the pool, and with depleted alternates, so no further retry is ever attempted.
The disconnect-on-bad-blob path (the primary fix) already has the correct order: `clear_in_flight_txs` is called at line ~1441 before `retry_on_alternates` at lines ~1460, ~1484, ~1513. Applying the same ordering here removes the race.
### Issue 2 of 3
crates/networking/p2p/rlpx/connection/server.rs:1719-1730
**Disconnected alternate peer silently wastes the alternates queue slot**
`pop_alternates` removes the first alternate per hash from the queue before we verify the peer is still reachable. When `peer_table.get_peer_connection(alt_peer_id)` returns `None` (peer disconnected between announcing and now), we `continue` to the next `alt_peer_id` in the outer `alts` map — but the hashes that were grouped under the now-missing `alt_peer_id` are simply dropped. Any remaining alternates still in the deque for those hashes exist but will never be tried in this call. The next retry attempt (if any even occurs) will start from the second remaining alternate, meaning one fallback slot is wasted per disconnected alternate peer.
### Issue 3 of 3
test/tests/blockchain/mempool_tests.rs:556-557
This constant duplicates the private `MAX_ALTERNATES_PER_HASH` in `mempool.rs` and is kept in sync manually, so a future change to the implementation cap won't be caught by the compiler. Consider exposing the constant from the mempool crate (e.g. `pub(crate)`) so the test can import it directly.
```suggestion
// Import the real constant so the test stays in sync automatically.
use ethrex_blockchain::mempool::MAX_ALTERNATES_PER_HASH;
```
Reviews (1): Last reviewed commit: "test(l1): move mempool alternates tests ..." | Re-trigger Greptile
| // Clear any in-flight transaction hashes so other connections can re-request them. | ||
| for (_, (_announced, requested_hashes, _)) in self.requested_pooled_txs.drain() { | ||
| // Try to re-issue each pending request to an alternate announcer before clearing. | ||
| for (_, (announced, requested_hashes, _)) in self.requested_pooled_txs.drain() { | ||
| retry_on_alternates( | ||
| &self.blockchain, | ||
| &self.peer_table, | ||
| &announced, | ||
| &requested_hashes, | ||
| ) | ||
| .await; | ||
| let _ = self | ||
| .blockchain | ||
| .mempool | ||
| .clear_in_flight_txs(&requested_hashes); | ||
| } | ||
| // Also clear hashes that were buffered but not yet sent. | ||
| for (_announced, pending_hashes) in self.pending_tx_requests.drain(..) { | ||
| for (announced, pending_hashes) in self.pending_tx_requests.drain(..) { | ||
| retry_on_alternates( | ||
| &self.blockchain, | ||
| &self.peer_table, | ||
| &announced, | ||
| &pending_hashes, | ||
| ) | ||
| .await; | ||
| let _ = self.blockchain.mempool.clear_in_flight_txs(&pending_hashes); |
There was a problem hiding this comment.
retry_on_alternates called before clear_in_flight_txs in teardown (and sweep/flush paths)
In all three non-disconnect paths — teardown (×2), sweep_inflight_txs, and the send-failure branch of flush_pending_tx_requests — retry_on_alternates is called before clear_in_flight_txs. In a multi-threaded tokio runtime the alternate actor's handle_enqueue_tx_requests can be scheduled between the mailbox enqueue (inside retry_on_alternates) and the subsequent clear_in_flight_txs call. When that race fires, the alternate's reserve_unknown_hashes finds the hashes still marked in-flight and returns empty, so the retry silently does nothing — but pop_alternates already consumed that alternates entry. The hashes end up not in-flight, not in the pool, and with depleted alternates, so no further retry is ever attempted.
The disconnect-on-bad-blob path (the primary fix) already has the correct order: clear_in_flight_txs is called at line ~1441 before retry_on_alternates at lines ~1460, ~1484, ~1513. Applying the same ordering here removes the race.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/rlpx/connection/server.rs
Line: 271-295
Comment:
**`retry_on_alternates` called before `clear_in_flight_txs` in teardown (and sweep/flush paths)**
In all three non-disconnect paths — `teardown` (×2), `sweep_inflight_txs`, and the send-failure branch of `flush_pending_tx_requests` — `retry_on_alternates` is called before `clear_in_flight_txs`. In a multi-threaded tokio runtime the alternate actor's `handle_enqueue_tx_requests` can be scheduled between the mailbox enqueue (inside `retry_on_alternates`) and the subsequent `clear_in_flight_txs` call. When that race fires, the alternate's `reserve_unknown_hashes` finds the hashes still marked in-flight and returns empty, so the retry silently does nothing — but `pop_alternates` already consumed that alternates entry. The hashes end up not in-flight, not in the pool, and with depleted alternates, so no further retry is ever attempted.
The disconnect-on-bad-blob path (the primary fix) already has the correct order: `clear_in_flight_txs` is called at line ~1441 before `retry_on_alternates` at lines ~1460, ~1484, ~1513. Applying the same ordering here removes the race.
How can I resolve this? If you propose a fix, please make it concise.| let alts = match blockchain.mempool.pop_alternates(hashes) { | ||
| Ok(a) if !a.is_empty() => a, | ||
| _ => return, | ||
| }; | ||
| for (alt_peer_id, alt_hashes) in alts { | ||
| let trimmed = announcement.filter_to(&alt_hashes); | ||
| match peer_table.get_peer_connection(alt_peer_id).await { | ||
| Ok(Some(conn)) => { | ||
| let _ = conn.enqueue_tx_requests(trimmed, alt_hashes); | ||
| } | ||
| _ => continue, | ||
| } |
There was a problem hiding this comment.
Disconnected alternate peer silently wastes the alternates queue slot
pop_alternates removes the first alternate per hash from the queue before we verify the peer is still reachable. When peer_table.get_peer_connection(alt_peer_id) returns None (peer disconnected between announcing and now), we continue to the next alt_peer_id in the outer alts map — but the hashes that were grouped under the now-missing alt_peer_id are simply dropped. Any remaining alternates still in the deque for those hashes exist but will never be tried in this call. The next retry attempt (if any even occurs) will start from the second remaining alternate, meaning one fallback slot is wasted per disconnected alternate peer.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/rlpx/connection/server.rs
Line: 1719-1730
Comment:
**Disconnected alternate peer silently wastes the alternates queue slot**
`pop_alternates` removes the first alternate per hash from the queue before we verify the peer is still reachable. When `peer_table.get_peer_connection(alt_peer_id)` returns `None` (peer disconnected between announcing and now), we `continue` to the next `alt_peer_id` in the outer `alts` map — but the hashes that were grouped under the now-missing `alt_peer_id` are simply dropped. Any remaining alternates still in the deque for those hashes exist but will never be tried in this call. The next retry attempt (if any even occurs) will start from the second remaining alternate, meaning one fallback slot is wasted per disconnected alternate peer.
How can I resolve this? If you propose a fix, please make it concise.| /// Mirrors the private cap inside the mempool. Kept in sync manually. | ||
| const MAX_ALTERNATES_PER_HASH: usize = 8; |
There was a problem hiding this comment.
This constant duplicates the private
MAX_ALTERNATES_PER_HASH in mempool.rs and is kept in sync manually, so a future change to the implementation cap won't be caught by the compiler. Consider exposing the constant from the mempool crate (e.g. pub(crate)) so the test can import it directly.
| /// Mirrors the private cap inside the mempool. Kept in sync manually. | |
| const MAX_ALTERNATES_PER_HASH: usize = 8; | |
| // Import the real constant so the test stays in sync automatically. | |
| use ethrex_blockchain::mempool::MAX_ALTERNATES_PER_HASH; |
Prompt To Fix With AI
This is a comment left during a code review.
Path: test/tests/blockchain/mempool_tests.rs
Line: 556-557
Comment:
This constant duplicates the private `MAX_ALTERNATES_PER_HASH` in `mempool.rs` and is kept in sync manually, so a future change to the implementation cap won't be caught by the compiler. Consider exposing the constant from the mempool crate (e.g. `pub(crate)`) so the test can import it directly.
```suggestion
// Import the real constant so the test stays in sync automatically.
use ethrex_blockchain::mempool::MAX_ALTERNATES_PER_HASH;
```
How can I resolve this? If you propose a fix, please make it concise.| /// Drop the entire alternates list for the given hashes. Called when a | ||
| /// transaction has been successfully accepted into the pool or when we | ||
| /// otherwise no longer need to retry. | ||
| pub fn clear_alternates(&self, hashes: &[H256]) -> Result<(), StoreError> { |
There was a problem hiding this comment.
clear_alternates is pub but has zero production callers. I grep'd crates/ and test/ for clear_alternates and the only hit outside this file is the unit test clear_alternates_drops_entries. Nothing calls it from the success path.
Consequence: on a healthy fetch (primary peer returns valid data, tx lands in the pool), every alternate announcement for that hash stays in the map until prune_alternates sweeps it via the 10-min TTL. For a busy node receiving the same blob tx announcement from many peers — exactly the workload this PR is targeting — that's potentially MAX_ALTERNATES_PER_HASH * <fan-in> entries lingering across the prune window.
The map self-bounds at 8 alternates × ~41 bytes per entry, so it's not a runaway leak (cap ≈ 3MB for 10k hashes). And reserve_unknown_hashes correctly avoids registering NEW alternates once the tx is in the pool (line 389: inner.transaction_pool.contains_key(hash)). But the existing alternates from before the pool insertion don't get dropped.
The natural wire-up is in the GetPooledTransactions response handler: after a successful add, call clear_alternates(&accepted_hashes). That keeps the alternates map tight on the happy path; the TTL stays as a safety net for hashes that never resolve.
If the intent is "rely on TTL only, the success-path cleanup isn't worth the lock cost", make clear_alternates pub(crate) and add a comment noting that the TTL is the only reaper. Either way the current state — a public unused method — is a fence we don't know which side of.
| break; | ||
| } | ||
| }; | ||
| match peer_table.get_peer_connection(alt.peer_id).await { |
There was a problem hiding this comment.
Double peer_table.get_peer_connection lookup with a race window. The popping loop here calls get_peer_connection(alt.peer_id) to verify the peer is alive (and skips dead ones), then ~30 lines down (line 1764) get_peer_connection(alt_peer_id) runs again to actually enqueue. Between the two lookups, the connection can close — the second lookup falls into Ok(None) and the enqueue_tx_requests is silently skipped. The alternate slot is already popped, so the hash retry just fizzles.
Not a correctness bug — the hash is still in flight nowhere, and the next failure path / teardown would pop the next alternate. But it burns a fallback slot for nothing.
Cleaner: pop+lookup in one shot and stash the PeerConnection handle in the by_peer map alongside the metadata. Then the enqueue site just iterates the handles. Re-grouping by alt_peer_id after the fact still needs a HashMap, but the value can be (PeerConnection, Vec<(H256, u8, usize)>) instead of re-looking-up. One round-trip per peer, no race.
Low priority; current code degrades gracefully.
|
|
||
| /// Maximum number of alternate announcers tracked per hash. Bounds the memory | ||
| /// used by the alternates map and prevents pathological peers from filling it. | ||
| pub const MAX_ALTERNATES_PER_HASH: usize = 8; |
There was a problem hiding this comment.
nit: MAX_ALTERNATES_PER_HASH = 8 is a hardcoded pub const (not configurable via BlockchainOptions or CLI). For most realistic deployments 8 is plenty — but it's the kind of value that's nice to bump from config when running Hive's adversarial-mempool scenarios or a high-fan-in benchmark.
Not urgent — the cap is conservative and the FIFO eviction is sound. Flagging only because the rest of the mempool's bounds are configurable (max_mempool_size, mempool_max_size, etc.) and this one stands out as the only hardcoded ceiling.
| reserve(&mp, &[tx], h(1)); | ||
| reserve(&mp, &[tx], h(2)); | ||
| // Sleep a tick so prune_alternates sees these entries as stale. | ||
| std::thread::sleep(Duration::from_millis(2)); |
There was a problem hiding this comment.
nit: std::thread::sleep(Duration::from_millis(2)) + prune_alternates(Duration::from_millis(1)) is a very tight margin (1ms TTL, 2ms sleep). On a busy CI runner where the scheduler hands a 2ms sleep back as 0.7ms, the test would flake by reporting entries-not-stale and the prune leaves them. Same shape as several CI flake reports we've eaten on other tests.
Bump to something like sleep(20ms) + ttl(5ms); the test still finishes in <50ms but tolerates ±10ms of CI jitter.
…car mismatches ## Bug When two peers announce the same blob tx hash, only the first becomes the in-flight requester; the second's announcement is dropped. If the first peer responds with bad data, ethrex disconnects it but has no fallback, and the upstream Hive test (`TestBlobTxWithMismatchedSidecar` / `TestBlobTxWithoutSidecar`) times out waiting for a re-request. A subtler variant: the two announcements can carry different (type, size) metadata. Peer A advertises a bare blob tx (small size), peer B advertises the full sidecar (large size). Even with a fallback in place, validating B's response against A's announced size would size-mismatch and disconnect the good peer. ## Fix Track *alternate* announcers per hash in the mempool, each carrying that peer's own (type, size). When a hash is already in-flight from peer A and peer B announces it, store the alternate keyed on B's metadata. On any failure path (bad blob, validation error, timeout, teardown, send error), pop the next live alternate and re-enqueue the request on that peer's connection, building the retry announcement from the alternate's own metadata so validation lines up. Bounded by MAX_ALTERNATES_PER_HASH = 8, FIFO eviction, stale entries pruned every 6 min with a 10 min TTL. ## Race / dead-peer hardening - `clear_in_flight_txs` runs *before* `retry_on_alternates` in every non-disconnect path (teardown, sweep, send-fail). Without this order the alternate's `reserve_unknown_hashes` finds the hash still in-flight and silently no-ops the retry, consuming a fallback slot for nothing. - `retry_on_alternates` walks per-hash and keeps popping when the chosen alternate has disconnected, so one dead peer doesn't burn a hash's only fallback slot. - Send-failure path rebuilds the announcement to cover every unsent hash (not just the current 256-chunk) so `filter_to` can recover type/size for hashes in later chunks. ## Test plan - 8 unit tests for the mempool alternates logic. - `cargo clippy -p ethrex-blockchain -p ethrex-p2p --all-targets -- -D warnings` - Hive devp2p: `TestBlobTxWithMismatchedSidecar` and `TestBlobTxWithoutSidecar` pass across 4 consecutive runs. ## Review feedback addressed - Race ordering across all four non-disconnect paths (greptile P1, Codex, ElFantasma). - Dead-peer slot waste in `pop_alternates` (greptile P2, Codex, ElFantasma). - Send-failure metadata loss for later chunks (Codex Medium). - Per-alternate (type, size) so retry against peer B isn't validated with peer A's metadata. - MAX_ALTERNATES_PER_HASH exposed for the test (greptile P2). - prune_alternates_drops_stale_entries timing loosened to tolerate CI jitter (ElFantasma). - `clear_alternates` kept `pub` with a comment noting it has no production caller yet — `add_transaction`'s inline per-hash cleanup + 10-min TTL prune is the current reaper. Wiring batch cleanup into the success path is a follow-up (ElFantasma). - TODO on MAX_ALTERNATES_PER_HASH = 8 noting follow-up for configurability (ilitteri +1 on ElFantasma).
c60d6b5 to
2b9f575
Compare
Bug
When two peers announce the same blob tx hash, only the first becomes the in-flight requester; the second's announcement is dropped. If the first peer responds with bad data, ethrex disconnects it but has no fallback, and the upstream Hive test (
TestBlobTxWithMismatchedSidecar/TestBlobTxWithoutSidecar) times out waiting for a re-request.A subtler variant of the same bug: the two announcements can carry different
(type, size)metadata. Peer A advertises a bare blob tx (small size), peer B advertises the full sidecar (large size). Even with a fallback in place, validating B's response against A's announced size would size-mismatch and disconnect the good peer.Fix
Track alternate announcers per hash in the mempool. When a hash is already in-flight from peer A and peer B announces it, store
Alternate { peer_id: B, tx_type, tx_size }keyed on B's own announcement. On any failure path (bad blob, validation error, timeout, teardown, send error), pop the next live alternate and re-enqueue the request on that peer's connection, building the retry announcement from the alternate's own metadata so validation lines up.Bounded by
MAX_ALTERNATES_PER_HASH = 8, FIFO eviction, stale entries pruned every 6 min with a 10 min TTL.Test plan
cargo clippy -p ethrex-blockchain -p ethrex-p2p --all-targets -- -D warningsTestBlobTxWithMismatchedSidecarandTestBlobTxWithoutSidecarpass across 4 consecutive runs.