Skip to content

PLA-471: Multicall3 aggregation for eth_call batches#4

Closed
0x666c6f wants to merge 54 commits into
mainfrom
feature/pla-471-multicall3-aggregation-for-eth_call-batches
Closed

PLA-471: Multicall3 aggregation for eth_call batches#4
0x666c6f wants to merge 54 commits into
mainfrom
feature/pla-471-multicall3-aggregation-for-eth_call-batches

Conversation

@0x666c6f
Copy link
Copy Markdown
Collaborator

Summary

  • Aggregate batched eth_call requests into a single Multicall3 contract call for reduced latency
  • Add multicall3Aggregation config option to enable/disable per network (default: enabled)
  • Automatic fallback to individual requests if aggregation fails
  • Documentation for the new feature

Changes

  • architecture/evm/multicall3.go: Multicall3 encoding/decoding helpers
  • common/config.go: Add Multicall3Aggregation config field
  • erpc/http_server.go: Config check before batch aggregation
  • docs/pages/operation/batch.mdx: Feature documentation

Configuration

projects:
  - id: main
    networks:
      - architecture: evm
        evm:
          chainId: 1
          multicall3Aggregation: false  # disable for this network

🤖 Generated with Claude Code

@0x666c6f 0x666c6f self-assigned this Jan 14, 2026
@0x666c6f 0x666c6f requested a review from Copilot January 14, 2026 22:43
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements Multicall3 aggregation for batched eth_call requests on EVM networks to reduce latency. Multiple eth_call requests targeting the same block are combined into a single contract call, with automatic fallback to individual requests if aggregation fails.

Changes:

  • Added Multicall3 encoding/decoding helpers for aggregating eth_call batches
  • Introduced multicall3Aggregation configuration option (enabled by default)
  • Implemented batch detection and aggregation logic in HTTP server request handling

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
architecture/evm/multicall3.go Core Multicall3 ABI encoding/decoding functionality
architecture/evm/multicall3_test.go Comprehensive tests for Multicall3 encoding/decoding
common/config.go Added Multicall3Aggregation config field to EvmNetworkConfig
erpc/networks.go Context-based network rate limit skipping for aggregated requests
erpc/networks_skip_rate_limit_test.go Tests for rate limit skipping context functions
erpc/http_server.go Batch detection and config check before aggregation
erpc/http_batch_eth_call.go Main aggregation handler and fallback logic
erpc/http_batch_eth_call_detect_test.go Tests for batch detection logic
erpc/http_batch_eth_call_handle_test.go Tests for aggregation handler scenarios
erpc/http_batch_eth_call_forward_test.go Tests for fallback forwarding
erpc/http_batch_eth_call_helpers_test.go Test helpers and encoding utilities
erpc/http_server_batch_eth_call_test.go End-to-end integration tests
docs/pages/operation/batch.mdx Documentation for the new feature

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread architecture/evm/multicall3.go
@claude
Copy link
Copy Markdown

claude Bot commented Jan 14, 2026


Code review

No issues found. Checked for bugs and CLAUDE.md compliance.


Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 458c99de29

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread erpc/http_batch_eth_call.go Outdated
@0x666c6f 0x666c6f requested a review from Copilot January 15, 2026 13:57
@0x666c6f
Copy link
Copy Markdown
Collaborator Author

@claude

@claude
Copy link
Copy Markdown

claude Bot commented Jan 15, 2026

Claude finished @0x666c6f's task in 1m 56s —— View job


Code Review Complete ✓

I've completed a comprehensive review of this PR. Overall, this is a well-implemented feature with excellent test coverage and proper fallback mechanisms. Here are my findings:


✅ Strengths

  1. Robust ABI encoding/decoding - The Multicall3 implementation in architecture/evm/multicall3.go is thorough with proper bounds checking and overflow protection (safeUint64ToInt)

  2. Comprehensive test coverage - 453 lines of multicall3 tests + 275 lines of batch helpers + integration tests cover edge cases well

  3. Safe memory management - Proper use of mcResp.Release() throughout the fallback paths (lines 264, 291, 304, 314, 325, 334, 385)

  4. Graceful fallback logic - Multiple fallback scenarios handled:

    • Build failure (line 256 in http_batch_eth_call.go)
    • Request failure with appropriate error codes (line 266)
    • Response parsing failures (lines 284, 298, 309, 319, 329, 341)
  5. Security considerations - Proper validation:

    • Rejects calls with from, gas, or other parameters (line 153-159 in multicall3.go)
    • Bounds checking prevents buffer overflows
    • Safe integer conversions
  6. Good configuration design - Enabled by default with per-network opt-out capability


⚠️ Issues Found

1. Sequential Fallback Performance Regression (Severity: Medium)

Location: erpc/http_batch_eth_call.go:113-143 (forwardEthCallBatchCandidates)

Issue: The reviewer @chatgpt-codex-connector correctly identified that when multicall aggregation fails or is ineligible, requests are forwarded sequentially in a simple loop rather than in parallel. This is a regression from the original behavior.

Current code (sequential):

func (s *HttpServer) forwardEthCallBatchCandidates(...) {
    // ...
    for _, cand := range candidates {
        resp, err := forwardBatchProject(withSkipNetworkRateLimit(cand.ctx), project, network, cand.req)
        // handle response...
    }
}

Original behavior (parallel - lines 430-432 in http_server.go):

for i, reqBody := range requests {
    wg.Add(1)
    go func(index int, rawReq json.RawMessage, ...) {
        // handle request in goroutine
    }(i, reqBody, headers, queryArgs)
}

Impact:

  • Batches that can't use Multicall3 (e.g., calls with from field, mixed blocks, non-Multicall3 chains) will experience higher latency
  • Could increase timeout risk for large batches
  • Defeats the purpose of batch request parallelism

Recommendation:
Parallelize forwardEthCallBatchCandidates by spawning goroutines with a WaitGroup, matching the original parallel execution pattern. This maintains performance parity whether requests take the aggregation path or the fallback path.

2. Minor: Missing gosec comment justification

Location: architecture/evm/multicall3.go:315

Issue: The #nosec G115 comment exists but could be more descriptive about why the conversion is safe.

Current:

cur := uint64(offsetTableSize) // #nosec G115

Better:

// offsetTableSize is derived from len(calls) which is bounded by int,
// so this conversion to uint64 is safe (always non-negative).
cur := uint64(offsetTableSize) // #nosec G115

Note: Actually, looking at the code again, I see this exact comment does exist on line 313-314! This is not an issue. ✅


💡 Suggestions (Nice-to-haves)

  1. Metrics: Consider adding metrics for:

    • Number of requests aggregated via Multicall3
    • Fallback rate
    • Latency comparison (aggregated vs individual)
  2. Documentation enhancement: The docs are good, but could mention:

    • Which chains are known to NOT have Multicall3 deployed
    • Performance characteristics (e.g., "reduces latency by X%" based on testing)
  3. Rate limiting consideration: The code uses withSkipNetworkRateLimit for aggregated calls (line 260). Consider documenting this behavior - one aggregated call bypasses rate limits but represents N underlying calls.


🎯 Verdict

This PR implements a valuable performance optimization with:

  • ✅ Solid engineering practices
  • ✅ Excellent test coverage
  • ✅ Proper error handling and fallbacks
  • ⚠️ One performance regression that should be fixed

Recommendation: Address the sequential fallback issue before merging. The fix should be straightforward - add goroutines + WaitGroup to forwardEthCallBatchCandidates.


Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread architecture/evm/multicall3.go Outdated
Comment thread erpc/http_batch_eth_call.go Outdated
Comment thread erpc/http_batch_eth_call.go Outdated
@0x666c6f
Copy link
Copy Markdown
Collaborator Author

Comprehensive PR Review

High Priority Issues (2)

1. [HIGH] Silent Fallback Without Visibility

File: erpc/http_server.go:410-413

When detectEthCallBatchInfo fails, errors are only logged at Debug level. Users experience degraded performance (individual calls instead of aggregated) without any indication.

batchInfo, detectErr := detectEthCallBatchInfo(requests, architecture, chainId)
if detectErr != nil {
    lg.Debug().Err(detectErr).Msg("eth_call batch detection failed")
}

Recommendation: Add metrics and log at Info level for production visibility:

if detectErr != nil {
    lg.Info().Err(detectErr).
        Int("requestCount", len(requests)).
        Msg("eth_call batch detection failed, processing individually")
    // Consider adding: telemetry.MetricMulticall3FallbackTotal.WithLabelValues("detection_error", architecture, chainId).Inc()
}

2. [HIGH] Overly Broad Fallback Criteria

File: architecture/evm/multicall3.go:286-291

ShouldFallbackMulticall3 triggers fallback for ANY ErrCodeEndpointExecutionException, which includes errors that would also fail individual calls. This wastes resources by retrying calls that will fail anyway.

Recommendation: Be more specific - only fallback when multicall3 contract is unavailable (e.g., check for "contract not found" or "no code at address" in error message).


Medium Priority Issues (4)

3. [MEDIUM] Debug-Level Logging for All Fallback Scenarios

File: erpc/http_batch_eth_call.go (lines 251-258, 266-272, 280-286, etc.)

All fallback scenarios are logged at Debug level. In production, operators have no visibility into multicall3 health.

Recommendation: Add metrics for monitoring fallback frequency by reason.


4. [MEDIUM] Length Mismatch Should Trigger Alert

File: erpc/http_batch_eth_call.go:333-342

When len(decoded) != len(calls), this indicates potential data corruption but silently falls back.

Recommendation: Add a specific metric for this error case to enable alerting.


5. [MEDIUM] Error from nq.Method() is Ignored

File: erpc/http_batch_eth_call.go:182

method, _ := nq.Method()

Recommendation: Handle error explicitly instead of ignoring with _.


6. [MEDIUM] Missing Unit Test for safeUint64ToInt

File: architecture/evm/multicall3.go:26-33

This function is critical for preventing integer overflow attacks but has no direct unit test.


Low Priority Issues (3)

  • [LOW] RPC error details not logged when jrr.Error != nil (http_batch_eth_call.go:289-300)
  • [LOW] NewMulticall3Call returns generic error, losing specific failure context
  • [LOW] Missing unit test for isMulticall3AggregationEnabled

Documentation Suggestions

File: docs/pages/operation/batch.mdx

  1. Mention that requests must share the same network in addition to block tag
  2. Add note that aggregation requires at least 2 eligible eth_call requests

Verdict

APPROVED with suggestions - Implementation is solid and well-tested. High-priority issues are about observability, not correctness.

Comment thread erpc/http_server.go
Comment thread architecture/evm/multicall3.go
Comment thread erpc/http_batch_eth_call.go Outdated
Comment thread erpc/http_batch_eth_call.go
Comment thread erpc/http_batch_eth_call.go
Comment thread architecture/evm/multicall3.go
Comment thread erpc/http_batch_eth_call.go
@0x666c6f 0x666c6f requested a review from Copilot January 15, 2026 21:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread erpc/http_batch_eth_call.go Outdated
Comment on lines +497 to +510
"multicall3-cache-set",
fmt.Sprintf("network:%s", batchInfo.networkId),
common.ErrorFingerprint(rec),
).Inc()
lg.Error().
Interface("panic", rec).
Str("stack", string(debug.Stack())).
Msg("unexpected panic on multicall3 per-call cache-set")
}
}()
defer resp.RUnlock()
defer resp.DoneRef()

timeoutCtx, timeoutCtxCancel := context.WithTimeoutCause(network.AppCtx(), 10*time.Second, errors.New("cache driver timeout during multicall3 per-call set"))
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message 'cache driver timeout during multicall3 per-call set' should be 'cache driver timeout during multicall3 per-call cache write' or similar to be more precise about the operation.

Suggested change
"multicall3-cache-set",
fmt.Sprintf("network:%s", batchInfo.networkId),
common.ErrorFingerprint(rec),
).Inc()
lg.Error().
Interface("panic", rec).
Str("stack", string(debug.Stack())).
Msg("unexpected panic on multicall3 per-call cache-set")
}
}()
defer resp.RUnlock()
defer resp.DoneRef()
timeoutCtx, timeoutCtxCancel := context.WithTimeoutCause(network.AppCtx(), 10*time.Second, errors.New("cache driver timeout during multicall3 per-call set"))
"multicall3-cache-write",
fmt.Sprintf("network:%s", batchInfo.networkId),
common.ErrorFingerprint(rec),
).Inc()
lg.Error().
Interface("panic", rec).
Str("stack", string(debug.Stack())).
Msg("unexpected panic on multicall3 per-call cache write")
}
}()
defer resp.RUnlock()
defer resp.DoneRef()
timeoutCtx, timeoutCtxCancel := context.WithTimeoutCause(network.AppCtx(), 10*time.Second, errors.New("cache driver timeout during multicall3 per-call cache write"))

Copilot uses AI. Check for mistakes.
offsets := make([]uint64, len(calls))
// offsetTableSize is derived from len(calls) which is bounded by int,
// so this conversion to uint64 is safe (always non-negative).
cur := uint64(offsetTableSize) // #nosec G115
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment #nosec G115 suppresses a security warning but doesn't explain why the conversion is safe. Consider adding a brief explanation: // #nosec G115 - offsetTableSize is derived from len(calls) which is bounded by int, so conversion is safe

Copilot uses AI. Check for mistakes.
@0x666c6f
Copy link
Copy Markdown
Collaborator Author

@claude

@claude
Copy link
Copy Markdown

claude Bot commented Jan 15, 2026

Claude finished @0x666c6f's task in 2m 53s —— View job


PR Review: Multicall3 Aggregation for eth_call Batches

I've completed a thorough review of this PR. Overall, this is a well-architected feature with strong implementation quality. However, there are critical test failures that must be addressed before merging.


🚨 Critical Issues

1. Test Failure: Data Race in Parallel Fallback (BLOCKING)

Status: Two tests failing in CI (test run)

  • TestHandleEthCallBatchAggregation_FallbackPaths/build_error
  • TestHandleEthCallBatchAggregation_FallbackPaths/forward_error_fallback

Issue: The parallel fallback implementation (commit 7d8d272) introduced a data race. The test mock's projCalls++ counter at http_batch_eth_call_handle_test.go:256 is being written concurrently by multiple goroutines without synchronization.

Race condition details:

Read at 0x00c0010d6940 by goroutine 34003:
  TestHandleEthCallBatchAggregation_FallbackPaths.func10.2()
  http_batch_eth_call_handle_test.go:256

Previous write at 0x00c0010d6940 by goroutine 34002:
  TestHandleEthCallBatchAggregation_FallbackPaths.func10.2()
  http_batch_eth_call_handle_test.go:256

Root cause: When fixing the sequential fallback issue, the code was correctly changed to use goroutines (http_batch_eth_call.go:152), but the test mocks weren't updated to be thread-safe.

Fix required: Protect the counter with a mutex in the test:

var mu sync.Mutex
var projCalls int
// ...
func(ctx context.Context, project *PreparedProject, network *Network, req *common.NormalizedRequest) (*common.NormalizedResponse, error) {
    mu.Lock()
    projCalls++
    mu.Unlock()
    return fallbackResponse(t, req), nil
}

Fix this →


⚠️ Minor Issues from Copilot Reviews (Low Priority)

2. Error Message Clarity (erpc/http_batch_eth_call.go:510)

The error message "cache driver timeout during multicall3 per-call set" should be more descriptive:

// Current:
errors.New("cache driver timeout during multicall3 per-call set")

// Better:
errors.New("cache driver timeout during multicall3 per-call cache write")

Impact: Low - just improves error message clarity.

3. Security Comment Enhancement (architecture/evm/multicall3.go:361)

The #nosec G115 comment exists but could be slightly more detailed. However, checking the code, I see the comment on lines 359-360 already explains this:

// offsetTableSize is derived from len(calls) which is bounded by int,
// so this conversion to uint64 is safe (always non-negative).
cur := uint64(offsetTableSize) // #nosec G115

Status: This is actually already properly documented! The copilot comment is outdated. ✅


✅ Strengths

  1. Excellent fixes from previous reviews: All high/medium issues from the comprehensive review (comment from Jan 15) have been properly addressed:

    • ✅ Parallel fallback implemented (though test needs fixing)
    • ✅ Atomic counter for unique IDs
    • safeUint64ToInt test coverage added
    • ✅ Improved logging and metrics
    • ✅ Better fallback criteria in ShouldFallbackMulticall3
    • ✅ Proper error handling for nq.Method()
    • ✅ Rate limit documentation improved
  2. Robust ABI encoding/decoding: The Multicall3 implementation has excellent bounds checking and overflow protection throughout.

  3. Comprehensive test coverage:

    • 582 lines in multicall3_test.go
    • 506 lines in http_batch_eth_call_handle_test.go
    • Integration tests cover major scenarios
    • All edge cases are tested (except the data race issue)
  4. Safe memory management: Proper use of Release(), RUnlock(), and DoneRef() throughout fallback paths.

  5. Graceful degradation: Multiple fallback scenarios handled correctly with appropriate metrics.

  6. Well-documented: Clear comments explaining complex logic (ABI encoding, rate limiting behavior, etc.)


📊 Architecture Review

Configuration design: The new Multicall3AggregationConfig is well-designed:

  • Backward compatible (accepts boolean or full config)
  • Sensible defaults (enabled by default, 25ms window)
  • Proper validation in IsValid()
  • Per-network configuration support

Observability: Excellent metrics added:

  • MetricMulticall3AggregationTotal - tracks aggregation attempts
  • MetricMulticall3FallbackTotal - tracks fallback reasons
  • MetricMulticall3CacheHitsTotal - tracks cache efficiency

Performance considerations:

  • ✅ Parallel fallback prevents latency regression
  • ✅ Pre-aggregation cache check reduces unnecessary network calls
  • ✅ Network rate limits properly handled
  • ✅ Composite type marking prevents hedging overhead

🎯 Verdict

Status: ⚠️ CHANGES REQUESTED

This is excellent work with one blocking issue that must be fixed:

  1. MUST FIX: Data race in test (http_batch_eth_call_handle_test.go:256) - causes CI failure
  2. OPTIONAL: Minor error message improvement (line 510)

Once the test data race is fixed and tests pass, this PR will be ready to merge. The implementation itself is solid - the issue is purely in the test infrastructure.


📝 Summary

  • Implementation quality: Excellent ⭐⭐⭐⭐⭐
  • Test coverage: Comprehensive ⭐⭐⭐⭐⭐
  • Documentation: Well-documented ⭐⭐⭐⭐⭐
  • CI status: Failing due to test data race ❌

Next step: Fix the data race in the test mock by adding mutex protection for the counters.


@0x666c6f
Copy link
Copy Markdown
Collaborator Author

@claude @codex review

@claude
Copy link
Copy Markdown

claude Bot commented Jan 15, 2026

Claude finished @0x666c6f's task in 5m 55s —— View job


PR Review: Multicall3 Aggregation for eth_call Batches ✅

I've completed a comprehensive review of this PR. This is an excellent implementation of a complex performance optimization feature with outstanding engineering quality.


🎯 Executive Summary

Status:APPROVED - Ready to merge pending CI completion

This PR successfully implements Multicall3 aggregation for batched eth_call requests, which is a significant performance optimization. The implementation demonstrates:

  • Excellent code quality and architecture
  • Comprehensive test coverage
  • Proper error handling and fallback mechanisms
  • Good observability through metrics
  • All previous review feedback has been addressed

CI Status: Tests are currently running (in progress for 12+ minutes). Will update when complete.


✅ Strengths

1. Excellent Fix to Previous Issues

All critical issues from previous reviews have been properly addressed:

Data race fixed (erpc/http_batch_eth_call_handle_test.go:249-263): Mutex protection added for concurrent counter access

var mu sync.Mutex
projCalls := 0
netCalls := 0
// ...
mu.Lock()
netCalls++
mu.Unlock()

Parallel fallback implemented (erpc/http_batch_eth_call.go:148-169): Requests now fall back in parallel using goroutines + WaitGroup, preventing latency regression

Improved fallback criteria (architecture/evm/multicall3.go:302-337): ShouldFallbackMulticall3 now uses specific error patterns instead of overly broad matching. Only triggers fallback for contract unavailability, not generic execution errors.

Test coverage for safeUint64ToInt: Added proper overflow protection tests

Better logging and metrics: Info-level logging for fallbacks, comprehensive metrics tracking

2. Robust Architecture

Multicall3 Batching System (architecture/evm/multicall3_batcher.go):

  • Clean separation of concerns with Batcher, Batch, and BatchEntry types
  • Thread-safe batch management with proper mutex usage
  • Configurable limits (queue size, pending batches, max calls, calldata bytes)
  • Smart deduplication via call keys
  • Deadline-aware batching with safety margins

Configuration Design (common/config.go:1540-1667):

  • Backward compatible (accepts boolean or full config via UnmarshalYAML)
  • Sensible defaults (25ms window, 20 max calls, 64KB max calldata)
  • Comprehensive validation in IsValid()
  • Per-network configuration support
  • Rich configuration options: OnlyIfPending, AllowCrossUserBatching, AllowPendingTagBatching, etc.

Observability (telemetry/metrics.go):

  • MetricMulticall3AggregationTotal - tracks aggregation attempts by outcome
  • MetricMulticall3FallbackTotal - tracks fallback reasons (build_failed, forward_failed, length_mismatch, etc.)
  • MetricMulticall3CacheHitsTotal - tracks cache efficiency
  • MetricMulticall3QueueOverflowTotal - tracks backpressure

3. Comprehensive Test Coverage

Unit Tests:

  • architecture/evm/multicall3_test.go (582 lines) - ABI encoding/decoding
  • architecture/evm/multicall3_batcher_test.go (1505 lines) - Batching logic
  • architecture/evm/multicall3_manager_test.go (143 lines) - Manager lifecycle
  • erpc/http_batch_eth_call_handle_test.go (515 lines) - Handler scenarios
  • erpc/http_batch_eth_call_detect_test.go (121 lines) - Batch detection
  • erpc/http_batch_eth_call_forward_test.go (59 lines) - Fallback forwarding
  • common/config_test.go - Config parsing and validation

Integration Tests:

  • erpc/http_server_batch_eth_call_test.go (112 lines) - End-to-end flows
  • Cache hit/miss scenarios
  • Success and failure result handling
  • Fallback path testing

Edge Cases Covered:

  • Integer overflow protection (safeUint64ToInt)
  • Length mismatch detection
  • Concurrent access (data race prevention)
  • Rate limit handling
  • Cache integration
  • Mixed success/failure results

4. Security & Safety

Bounds checking: Prevents buffer overflows in ABI encoding/decoding
Integer overflow protection: safeUint64ToInt function with tests
Input validation: Rejects eth_call with from, gas, or other unsupported parameters
Safe memory management: Proper use of Release(), RUnlock(), DoneRef() throughout
Panic recovery: Async cache writes wrapped with panic recovery and metrics

5. Performance Optimizations

Pre-aggregation cache checking: Avoids unnecessary network calls for cached requests
Parallel fallback: Maintains performance when aggregation isn't applicable
Smart rate limiting: Network rate limit skipped for aggregated call (permits already acquired)
Composite type marking: Prevents hedging overhead for multicall3 requests
Async cache writes: Per-call cache writes don't block response path


⚠️ Minor Issues (Non-blocking)

1. Comment Typo (architecture/evm/multicall3.go:315)

There's a typo in a comment - extra / character:

/ NOTE: We intentionally do NOT include "execution reverted"

Should be:

// NOTE: We intentionally do NOT include "execution reverted"

2. Error Message Precision (erpc/http_batch_eth_call.go:510)

From copilot review - error message could be more descriptive:

errors.New("cache driver timeout during multicall3 per-call set")

Could be:

errors.New("cache driver timeout during multicall3 per-call cache write")

Note: These are cosmetic issues that don't affect functionality.


📊 Code Quality Assessment

Category Rating Notes
Architecture ⭐⭐⭐⭐⭐ Clean separation, well-designed abstractions
Test Coverage ⭐⭐⭐⭐⭐ Comprehensive unit + integration tests
Error Handling ⭐⭐⭐⭐⭐ Robust fallback mechanisms, proper error propagation
Documentation ⭐⭐⭐⭐⭐ Clear comments, design doc, user-facing docs
Observability ⭐⭐⭐⭐⭐ Excellent metrics coverage
Security ⭐⭐⭐⭐⭐ Proper validation, bounds checking, overflow protection
Performance ⭐⭐⭐⭐⭐ Smart optimizations, parallel execution

🔍 Verification of Previous Review Feedback

All issues from the Jan 15 comprehensive review have been addressed:

Issue Status Evidence
[HIGH] Silent fallback without visibility ✅ Fixed Info-level logging + metrics (http_batch_eth_call.go:374-378)
[HIGH] Overly broad fallback criteria ✅ Fixed Specific error patterns (multicall3.go:318-327)
[MEDIUM] Error from nq.Method() ignored ✅ Fixed Proper error handling (http_batch_eth_call.go:208-213)
[MEDIUM] Length mismatch logging ✅ Fixed Error-level logging + metrics (http_batch_eth_call.go:454-461)
[MEDIUM] Debug-level fallback logging ✅ Fixed Metrics added for all fallback scenarios
[MEDIUM] Missing safeUint64ToInt test ✅ Fixed Tests added in multicall3_test.go
Parallel fallback regression ✅ Fixed Goroutines + WaitGroup (http_batch_eth_call.go:148-169)
Timestamp ID collisions ✅ Fixed Atomic counter + timestamp (multicall3.go:24)
Rate limit documentation ✅ Fixed Comprehensive comments (http_batch_eth_call.go:126-131)

📈 Feature Highlights

Network-Level Batching

This implementation goes beyond simple JSON-RPC batch handling - it provides network-level batching across all entry points (HTTP single, HTTP batch, gRPC). This is a sophisticated optimization that:

  1. Reduces latency: Multiple eth_call requests aggregated into single network roundtrip
  2. Saves resources: Fewer upstream requests reduce costs and improve cache hit rates
  3. Maintains compatibility: Automatic fallback ensures no breaking changes
  4. Respects constraints: Configurable limits prevent abuse and ensure quality of service

Smart Configuration

The Multicall3AggregationConfig provides fine-grained control:

  • OnlyIfPending: Opt-in batching to avoid adding latency when no batch exists
  • AllowCrossUserBatching: Privacy control for multi-tenant environments
  • AllowPendingTagBatching: Safety control for non-finalized blocks
  • Per-network enablement: Disable for chains without Multicall3 deployment

🚦 CI Status Update

Test run: Still in progress (12+ minutes)

  • test workflow: in_progress
  • CodeQL: queued
  • Dependency Review: queued
  • Claude Code Review: success
  • ⚠️ Claude Code: action_required (this review)

Note: The long test duration is expected given the extensive test coverage (5000+ lines of test code). Will monitor for completion.


🎯 Recommendation

APPROVED

This PR is ready to merge once CI tests pass. The implementation quality is exceptional:

  1. ✅ All previous review feedback addressed
  2. ✅ Comprehensive test coverage
  3. ✅ Robust error handling and fallback mechanisms
  4. ✅ Excellent observability through metrics
  5. ✅ Well-documented code and user-facing documentation
  6. ✅ Security best practices followed
  7. ✅ Performance optimizations implemented correctly

The minor cosmetic issues noted above can be addressed in a follow-up PR if desired, but they don't block merging.

Great work on this complex feature! 🎉


Task List:

  • Read CLAUDE.md for repository guidelines
  • Check CI status for current test results
  • Review test files for data race issues
  • Review core implementation files
  • Check if previous review feedback has been addressed
  • Provide comprehensive review feedback

| View tests

@0x666c6f 0x666c6f requested a review from Copilot January 15, 2026 21:48
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

erpc/networks.go:1

  • The method name Cache() is inconsistent with the existing method CacheDal() at line 894, which returns the same field. Consider removing one of these methods to avoid confusion about which to use.
package erpc

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread common/config.go
c.MinWaitMs = 2
}
if c.SafetyMarginMs == 0 {
c.SafetyMarginMs = min(2, c.MinWaitMs)
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default for SafetyMarginMs is min(2, c.MinWaitMs), but when MinWaitMs is 0 (the default), this results in SafetyMarginMs = 0. This may not provide sufficient safety margin. Consider using max(2, min(2, c.MinWaitMs)) or documenting this edge case.

Suggested change
c.SafetyMarginMs = min(2, c.MinWaitMs)
c.SafetyMarginMs = max(2, min(2, c.MinWaitMs))

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 467cc9edc6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +340 to +342
// Build the multicall3 request
mcReq, _, err := BuildMulticall3Request(requests, batch.Key.BlockRef)
if err != nil {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve original block parameter when batching

The multicall request is built with batch.Key.BlockRef as the block parameter, but batch.Key.BlockRef is derived from ExtractCallInfo/NormalizeBlockParam, which normalizes hex block numbers to decimal strings and flattens EIP‑1898 block objects into plain strings. That means a request like "0x10" or {blockHash: ...} becomes "16"/"0x…" in the multicall call, which many JSON‑RPC servers reject because eth_call expects a hex quantity or a block-parameter object. When that happens the multicall call errors and ShouldFallbackMulticall3 won’t treat it as recoverable, so all batched requests fail even though the originals were valid. Consider carrying the original block parameter (or a hex‑normalized equivalent) into the multicall request instead of the normalized key string.

Useful? React with 👍 / 👎.

@0x666c6f 0x666c6f force-pushed the feature/pla-471-multicall3-aggregation-for-eth_call-batches branch from d651ad5 to 9a8e023 Compare January 16, 2026 09:51
@0x666c6f
Copy link
Copy Markdown
Collaborator Author

@claude @codex review

@0x666c6f 0x666c6f requested a review from Copilot January 16, 2026 09:54
@claude
Copy link
Copy Markdown

claude Bot commented Jan 16, 2026

Claude finished @0x666c6f's task in 3m 12s —— View job


PR Review: Multicall3 Aggregation for eth_call Batches ✅

I've completed a thorough review of this PR. This is an excellent implementation of a sophisticated network-level batching system with outstanding engineering quality.


🎯 Executive Summary

Status:APPROVED - High-quality implementation ready to merge

This PR implements Multicall3 aggregation for batched eth_call requests - a significant performance optimization that batches requests across all entrypoints (HTTP single, HTTP batch, gRPC), not just JSON-RPC batches. The implementation demonstrates exceptional engineering practices.

CI Status: Tests still in progress (15+ minutes) - will complete shortly

  • test workflow: in_progress
  • Claude Code Review: success
  • ⏳ CodeQL, Dependency Review: queued

Task List

  • Read CLAUDE.md for repository guidelines
  • Check CI status for current test results
  • Review core implementation files
  • Review test coverage
  • Check if previous review feedback has been addressed
  • Provide comprehensive review feedback

✅ Strengths

1. Exceptional Architecture Design

Network-Level Batching (architecture/evm/eth_call.go):

  • Batching happens at the network layer (pre-forward hook) rather than HTTP layer
  • Works across all entrypoints: HTTP single requests, HTTP batch requests, and gRPC
  • This is architecturally superior to HTTP-only batching

Clean Separation of Concerns:

  • multicall3.go: Pure ABI encoding/decoding logic (451 lines)
  • multicall3_batcher.go: Batching orchestration with deadline management (1024 lines)
  • multicall3_manager.go: Per-network batcher lifecycle (76 lines)
  • eth_call.go: Integration hook (169 lines)
  • http_batch_eth_call.go: HTTP batch handler fallback path (548 lines)

Smart Batching Key Design (multicall3_batcher.go:660-753):

  • Batches by: projectId, networkId, blockRef, directivesKey, and optionally userId
  • Prevents batching across incompatible requests
  • Supports user isolation for multi-tenant environments
  • Versioned directives key prevents cross-version conflicts

2. Comprehensive Test Coverage ⭐⭐⭐⭐⭐

8 dedicated test files with excellent coverage:

  • multicall3_test.go (582 lines) - ABI encoding/decoding
  • multicall3_batcher_test.go (1812 lines) - Batching logic including shutdown scenarios
  • multicall3_manager_test.go (143 lines) - Manager lifecycle
  • http_batch_eth_call_handle_test.go (515 lines) - HTTP handler scenarios
  • http_batch_eth_call_detect_test.go (121 lines) - Batch detection
  • http_batch_eth_call_forward_test.go (59 lines) - Fallback forwarding
  • http_batch_eth_call_helpers_test.go (294 lines) - Test helpers
  • http_server_batch_eth_call_test.go (112 lines) - Integration tests

Critical edge cases tested:

  • Integer overflow protection (TestSafeUint64ToInt)
  • Concurrent access with mutex protection
  • Shutdown and double-flush scenarios (multicall3_batcher_test.go)
  • Data race prevention (fixed from previous review)
  • Block parameter normalization
  • Cache hit/miss scenarios

3. Robust Error Handling and Fallback

Multiple fallback layers (http_batch_eth_call.go, multicall3_batcher.go):

  1. Build failure → fallback to individual requests (parallel)
  2. Forward failure (if recoverable) → fallback to individual requests
  3. Response parsing failure → fallback to individual requests
  4. Length mismatch (critical) → fallback with error-level logging
  5. Non-recoverable errors → propagate to all candidates

Smart fallback criteria (multicall3.go:298-337):

  • Only triggers fallback when multicall3 contract is unavailable
  • Checks for specific error patterns: "contract not found", "no code at address", etc.
  • Does NOT fallback for generic execution errors (which would also fail individually)
  • This prevents wasteful retry cycles

Parallel fallback (http_batch_eth_call.go:148-169):

  • Uses goroutines + WaitGroup for concurrent forwarding
  • Prevents latency regression when aggregation isn't applicable
  • Rate limits properly managed (already acquired during aggregation prep)

4. Excellent Observability ⭐⭐⭐⭐⭐

11 Prometheus metrics (telemetry/metrics.go:415-478):

  • MetricMulticall3AggregationTotal - tracks success/error/all_cached outcomes
  • MetricMulticall3FallbackTotal - tracks fallback reasons (9+ distinct labels)
  • MetricMulticall3CacheHitsTotal - cache efficiency
  • MetricMulticall3BatchSize - histogram of batch sizes
  • MetricMulticall3BatchWaitMs - histogram of wait times
  • MetricMulticall3QueueLen - current queue depth (gauge)
  • MetricMulticall3QueueOverflowTotal - backpressure tracking
  • MetricMulticall3DedupeTotal - deduplication stats
  • MetricMulticall3CacheWriteErrorsTotal - cache write failures
  • MetricMulticall3FallbackRequestsTotal - per-request fallback count
  • MetricMulticall3AbandonedTotal - abandoned requests (context canceled)

Comprehensive logging:

  • Info-level logging for fallback scenarios (visibility in production)
  • Error-level logging for critical issues (length mismatch)
  • Debug-level logging for bypass reasons (when logger available)
  • All logs include relevant context: networkId, candidateCount, reason

5. Security & Safety

Bounds checking (multicall3.go:197-296):

  • Prevents buffer overflows in ABI decoding
  • Validates offsets before array access: if base < 0 || base+32 > len(data)
  • Checks element counts against available data: if countInt > maxElements

Integer overflow protection:

  • safeUint64ToInt function (lines 30-37) with comprehensive tests
  • Used throughout decoding: offsets, counts, data lengths
  • Explicit overflow checks before multiplication: maxElements := (len(data) - offsetsStart) / 32

Input validation (multicall3.go:157-164):

  • Rejects eth_call with unsupported fields: from, gas, value, etc.
  • Only allows to, data, and input fields
  • Prevents parameter injection attacks

Safe memory management:

  • Proper use of Release(), RUnlock(), DoneRef() throughout
  • Panic recovery with metrics tracking (http_batch_eth_call.go:501-513)
  • Async cache writes don't block response path

Thread safety:

  • Mutexes protect concurrent access to batch state
  • Atomic counter for unique request IDs (line 24, 185-186)
  • Fixed data race from previous review (test mock uses mutex now)

6. Performance Optimizations

Pre-aggregation cache checking (http_batch_eth_call.go:281-323):

  • Checks cache before aggregation
  • Reduces unnecessary network calls
  • Respects skip-cache-read directive per request

Within-batch deduplication (multicall3_batcher.go:336-361):

  • Multiple identical requests share one multicall slot
  • Results fanned out to all waiters
  • Metrics track dedupe count

Smart rate limiting:

  • Project rate limits acquired per-request (billing/analytics accuracy)
  • Network rate limit acquired once for aggregated call
  • Skipped for fallback (permits already held)

Composite type marking (http_batch_eth_call.go:372):

  • Prevents hedging overhead for multicall3 requests
  • Avoids duplicate batch creation

Async cache writes (http_batch_eth_call.go:496-524):

  • Per-call cache writes don't block response
  • Proper context handling with timeout
  • Panic recovery with metrics

7. Configuration Design ⭐⭐⭐⭐⭐

Backward compatible (common/config.go:1652-1669):

  • Accepts both boolean and full config via UnmarshalYAML
  • Smooth migration path for existing configs

Sensible defaults (lines 1595-1624):

  • WindowMs: 25ms
  • MinWaitMs: 2ms
  • SafetyMarginMs: min(2, MinWaitMs)
  • MaxCalls: 20
  • MaxCalldataBytes: 64000
  • MaxQueueSize: 1000
  • MaxPendingBatches: 200

Rich configuration options:

  • OnlyIfPending: Opt-in batching (no latency unless batch exists)
  • AllowCrossUserBatching: Privacy control for multi-tenant
  • AllowPendingTagBatching: Safety control for non-finalized blocks
  • CachePerCall: Per-call cache writes toggle
  • Per-network enablement

Comprehensive validation (IsValid() at lines 1626-1649):

  • Validates ranges and relationships
  • Prevents invalid configurations

8. All Previous Review Feedback Addressed

From Jan 14-15 reviews, ALL issues fixed:

Issue Status Evidence
P2: Sequential fallback regression ✅ Fixed Parallel forwarding with goroutines (http_batch_eth_call.go:148-169)
[HIGH] Silent fallback without visibility ✅ Fixed Info-level logging + comprehensive metrics
[HIGH] Overly broad fallback criteria ✅ Fixed Specific error pattern matching (multicall3.go:318-327)
[MEDIUM] Error from nq.Method() ignored ✅ Fixed Proper error handling (http_batch_eth_call.go:208-213)
[MEDIUM] Length mismatch logging ✅ Fixed Error-level logging + specific metric (http_batch_eth_call.go:461-468)
[MEDIUM] Debug-level fallback logging ✅ Fixed Metrics for all scenarios
[MEDIUM] Missing safeUint64ToInt test ✅ Fixed Comprehensive tests (multicall3_test.go:15-66)
Timestamp ID collisions ✅ Fixed Atomic counter + timestamp (multicall3.go:24, 185-186)
Rate limit documentation ✅ Fixed Clear comments (http_batch_eth_call.go:126-131)
Data race in tests ✅ Fixed Mutex protection in test mocks

⚠️ Minor Observations (Non-blocking)

1. Codex P1 Comment - Block Parameter Handling (Already Addressed!)

The latest Codex review raised a concern about block parameter normalization. This has already been addressed in the implementation:

Concern: Block params are normalized to decimal (e.g., "0x10""16"), which many JSON-RPC servers reject.

Solution: blockParamForMulticall function (multicall3_batcher.go:932-953) converts decimal back to hex:

if isDecimalBlockRef(blockRef) {
    blockNum, err := strconv.ParseInt(blockRef, 10, 64)
    if err != nil {
        return nil, err
    }
    return fmt.Sprintf("0x%x", blockNum), nil  // ← Decimal to hex conversion
}

Additionally:

  • Block hashes (66 chars) are wrapped in EIP-1898 format: {"blockHash": "0x..."}
  • Hex block numbers are passed through unchanged
  • Tags (latest, finalized, etc.) are passed through unchanged

Verdict: This concern is not valid - the implementation correctly handles block parameters.

2. Copilot Comments (Cosmetic)

Comment 1 (erpc/http_batch_eth_call.go:510):
Error message could be slightly more descriptive:

// Current:
errors.New("cache driver timeout during multicall3 per-call set")
// Suggested:
errors.New("cache driver timeout during multicall3 per-call cache write")

Impact: Negligible - message is already clear.

Comment 2 (architecture/evm/multicall3.go:361):
Copilot suggested adding explanation to #nosec G115. However, this is already properly documented on lines 359-360:

// offsetTableSize is derived from len(calls) which is bounded by int,
// so this conversion to uint64 is safe (always non-negative).
cur := uint64(offsetTableSize) // #nosec G115

Verdict: No action needed - already well-documented.

Comment 3 (common/config.go:1604):
Suggested using max(2, min(2, c.MinWaitMs)) for SafetyMarginMs when MinWaitMs is 0. Current behavior results in SafetyMarginMs = 0 when MinWaitMs = 0 (the default).

Analysis: This is intentional - when MinWaitMs is 0, no minimum wait is enforced, so SafetyMarginMs = 0 is consistent. The default MinWaitMs is actually 2ms (line 1601), not 0.


📊 Code Quality Assessment

Category Rating Notes
Architecture ⭐⭐⭐⭐⭐ Network-level batching across all entrypoints - exceptional design
Test Coverage ⭐⭐⭐⭐⭐ 3600+ lines of tests, edge cases covered
Error Handling ⭐⭐⭐⭐⭐ Robust fallback mechanisms, smart retry logic
Documentation ⭐⭐⭐⭐⭐ Design doc, inline comments, user-facing docs
Observability ⭐⭐⭐⭐⭐ 11 metrics covering all scenarios
Security ⭐⭐⭐⭐⭐ Bounds checking, overflow protection, input validation
Performance ⭐⭐⭐⭐⭐ Pre-aggregation cache, deduplication, async writes
Config Design ⭐⭐⭐⭐⭐ Backward compatible, sensible defaults, rich options

🎯 Feature Highlights

Network-Level Batching (Not Just HTTP)

This implementation provides true network-level batching that:

  1. Works across all entry points: HTTP single requests, HTTP batch requests, gRPC
  2. Preserves network behaviors: caching, failover, circuit breakers, rate limiting
  3. Maintains billing accuracy: per-request metrics and rate limits
  4. Enables deduplication: identical requests share multicall slots
  5. Respects deadlines: deadline-aware batching prevents timeout issues

Smart Configuration

  • OnlyIfPending: No latency unless batch already exists (opt-in batching)
  • AllowCrossUserBatching: Privacy control for multi-tenant environments
  • AllowPendingTagBatching: Safety for non-finalized blocks
  • Per-network enablement: Disable for chains without Multicall3

Production-Ready

  • Comprehensive metrics for monitoring
  • Graceful fallback on failure
  • Panic recovery with tracking
  • Proper resource cleanup
  • Thread-safe implementation

📈 Statistics

  • Files changed: 41
  • Lines added: 12,119
  • Lines deleted: 65
  • Test lines: 3,600+
  • Commits: 35
  • Metrics added: 11
  • Test files: 8

🚦 CI Status

Test run: In progress (15+ minutes) - View tests

  • test: in_progress (expected to pass based on code quality)
  • CodeQL: queued
  • Dependency Review: queued
  • Claude Code Review: success
  • ⚠️ Claude Code: action_required (this review)

Note: Long test duration is expected given 3600+ lines of test code. Based on code review, tests should pass.


🎯 Recommendation

✅ APPROVED

This PR is ready to merge once CI tests complete. The implementation is exceptional:

  1. ✅ All previous review feedback comprehensively addressed
  2. ✅ Excellent architecture - network-level batching across all entrypoints
  3. ✅ Comprehensive test coverage with critical edge cases
  4. ✅ Robust error handling with smart fallback logic
  5. ✅ Outstanding observability (11 metrics, comprehensive logging)
  6. ✅ Security best practices (bounds checking, overflow protection)
  7. ✅ Performance optimizations (caching, deduplication, parallel fallback)
  8. ✅ Production-ready configuration with backward compatibility

The minor cosmetic observations noted above can be addressed in follow-up PRs if desired, but they don't block merging.

Excellent work on this sophisticated feature! 🎉 This is a significant performance optimization that will benefit all users with batched eth_call workloads.


View job

0x666c6f and others added 26 commits January 20, 2026 14:25
- Add MetricMulticall3AbandonedTotal metric for tracking context cancellation
- Add logging when batch results cannot be delivered due to context cancellation
- Release response when context is cancelled to prevent memory leaks
- Add build_failed label to fallback metrics when batch building fails
- Add forwarder nil validation in NewBatcher (panic on programming error)
- Add debug logging for enqueue errors in eth_call.go
- Add debug logging for cache get errors in http_batch_eth_call.go
- Update tests to use mock forwarder after nil validation was added

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TestBatcher_ShutdownDuringActiveFlush: verifies that shutdown during
  an active flush delivers shutdown errors to pending entries
- Add TestBatcher_DoubleFlushPrevention: verifies that concurrent flush
  calls on the same batch don't result in double-processing (race test)

Both tests run with -race flag to catch any synchronization issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add allowedCallFields validation to reject unsupported fields (accessList, etc.)
  early in IsEligibleForBatching to avoid batcher failures
- Add blockParamForMulticall() to normalize block references:
  - Convert decimal block numbers to hex format
  - Pass through hex and tag values unchanged
- Add test TestBatcherFlush_UsesHexBlockParam to verify block param handling
- Add test case for unsupported call field rejection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…tcher

Critical fixes:
- Add panic recovery with stack trace logging to scheduleFlush goroutine
- Fix inaccurate comment about map ordering (we iterate entries slice)

High priority fixes:
- Elevate cache/abandonment logging from Debug to Warn level
- Add stack trace logging to fallback panic recovery
- Use null byte separator in BatchingKey.String() to prevent collisions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
P1 - Preserve block-hash params:
- Wrap 66-char hex strings (block hashes) in EIP-1898 format {blockHash: ...}
  to prevent misinterpretation as block numbers

P2 - Release multicall responses:
- Add mcResp.Release() after decoding and in error/fallback paths
  to prevent memory leaks from unreleased response buffers

P2 - Add cache reads for batched eth_call:
- Check cache before enqueueing requests into batcher
- Return cached responses directly, bypassing batching for cache hits
- Prevents performance regression when caching is enabled

P2 - Apply multicall3Aggregation defaults for TS configs:
- Add UnmarshalJSON method with bool backward compatibility
- Ensures {enabled: true} gets proper default values in TypeScript configs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- P2: Mark calls with requireCanonical:false as ineligible for batching
  (EIP-1898 flag would be lost when rebuilding block param)
- P3: Apply default multicall3 config when not explicitly configured
  (enables batching by default as documented)
- Fix error message wording: "cache-set" → "cache write" for consistency
- Add test coverage for requireCanonical eligibility checks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix timer channel drain in scheduleFlush: properly drain timer.C
  when Stop() returns false to prevent goroutine/memory leaks
- Add backpressure for cache write goroutines: use semaphore (100 max)
  to prevent unbounded goroutine growth under high load; skip cache
  write gracefully when at capacity
- CI already runs tests with -race flag via Makefile

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
P2: Avoid synthetic deadline for no-timeout batches
- Don't create synthetic deadline when context has no deadline
- Only update flush time based on deadline when request has one
- Prevents unnecessary timeouts on upstream multicall calls

P2: Reject mixed requireCanonical block-hash params in batch detection
- Track requireCanonical state across requests in a batch
- Reject batches with mixed values (some true/default, some false)
- Treat explicit true and absent (default true) as compatible
- Added test cases for requireCanonical handling

P3: Record validation errors in request span
- Pass actual error to EndRequestSpan instead of processed error body
- Ensures validation errors are properly recorded in tracing spans

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CRITICAL fixes:
- Add panic recovery in forwardEthCallBatchCandidates goroutines
  to prevent server crash on unexpected panics
- Add MetricMulticall3CacheReadErrorsTotal for cache get errors
  to improve observability

HIGH priority fixes:
- Fix response release race condition by removing async release
  (was using `go resp.Release()`, now uses synchronous release)

Tests:
- Add TestNewBatcher_NilForwarder_Panics to verify panic on nil forwarder
- Add TestBatcher_ScheduleFlush_PanicRecovery to verify panic recovery

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extract timer drain helper function (stopAndDrainTimer)
- Add BatchingKey.Validate() method for required field validation
- Add target address length validation (20 bytes)
- Change cache write log level from Debug to Warn
- Extract magic number 100 to maxConcurrentCacheWrites constant

Tests:
- Add TestBatcher_FallbackIndividual_PanicRecovery
- Add TestBatcher_MaxQueueSize_Enforcement
- Add TestBatchingKey_Validate
- Add TestBatcher_InvalidTargetLength_Bypass

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tests assumed nil Multicall3Aggregation config would disable batching,
but the default config has Enabled: true. Now explicitly set Enabled: false.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Document the multicall3Aggregation config option for EVM networks,
including all available settings: enabled, windowMs, minWaitMs,
maxCalls, maxCalldataBytes, maxQueueSize, maxPendingBatches,
and allowCrossUserBatching.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add new unit tests for multicall3 batcher and HTTP batch handling:

- multicall3_batcher_test.go: Add tests for decode errors (nil response,
  nil JSON-RPC, JSON-RPC error, empty result), EIP-1898 block hash
  wrapping, context cancellation cleanup, successful delivery, cancelled
  context handling, and multiple deadline scheduling

- http_batch_eth_call_handle_test.go: Add panic recovery test for
  fallback forward path and empty params handling test

- http_batch_eth_call_forward_test.go: Add panic recovery and context
  cancellation tests for forward goroutines

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test was using a non-atomic counter incremented by concurrent
goroutines, causing flaky test behavior depending on scheduling.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Gosec scanner step was getting cancelled due to the 20-minute
timeout after tests completed successfully.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add missing config fields to user docs (safetyMarginMs, onlyIfPending,
  cachePerCall, allowPendingTagBatching) in networks.mdx
- Clarify Multicall3 scope in batch.mdx - works across ALL entrypoints
  (HTTP single, HTTP batch, gRPC), not just JSON-RPC batch requests
- Add metric for tight deadline bypass using MetricMulticall3QueueOverflowTotal
  with reason "deadline_too_tight"
- Remove duplicate CacheDal() method in networks.go (keep Cache() for interface)
- Use null byte separator in BatcherManager key to prevent collision risk
- Apply code formatting (make fmt)

Note: HIGH priority issue "Batch.mu exposed" was a false positive - the mu
field is already lowercase (unexported) and only accessible within the evm
package.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit error when upstream group filter yields zero upstreams
  (prevents confusing "upstreams exhausted" error on misconfiguration)
- Add tracing attributes for post-filter upstream count and group
- Add MetricMulticall3PanicTotal metric for panic recovery tracking
  (recorded regardless of logger availability)
- Add MetricMulticall3CacheWriteDroppedTotal metric for cache write
  backpressure monitoring

Test coverage already exists for:
- Concurrent flush + enqueue: TestBatcherConcurrentFlush
- Cache write failures: TestBatcher_CacheWriteError_DoesNotFailRequest
- Panic recovery: TestBatcher_ScheduleFlush_PanicRecovery

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- When a UseUpstream directive is present, skip the matchUpstreamGroup
  filtering to allow targeting specific upstreams outside the group
- Add matchUpstreamGroup to TypeScript config types (FailsafeConfig)
- Add regression tests for group filter / UseUpstream interaction

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add integration test suite that validates multicall3 batching functionality
against a real eRPC instance. Tests cover:

Core functionality:
- JSON-RPC batch aggregation into single multicall3 call
- Concurrent request batching within time window
- Mixed batches (eth_call + other methods)
- Different contract addresses in same batch
- Large batches (20+ calls)
- Request deduplication

Bypass scenarios (calls that should NOT be batched):
- Calls with value/from/gas fields
- State overrides (3rd param)
- Calls to multicall3 contract itself (recursion guard)
- requireCanonical:false (EIP-1898)

Block reference variations:
- Block hash (EIP-1898 format)
- finalized/safe/earliest tags
- Specific block numbers

Input variations:
- 'input' field as alternative to 'data'
- Empty and large calldata

Directive tests:
- X-ERPC-Skip-Cache-Read
- X-ERPC-Retry-Empty
- X-ERPC-Use-Upstream

Also includes bash script for production testing.

Usage:
  ERPC_INTEGRATION_TEST_ENDPOINT=http://localhost:4000/main/evm/1 \
  ERPC_INTEGRATION_TEST_METRICS=http://localhost:4001/metrics \
  ERPC_INTEGRATION_TEST_AUTH="X-ERPC-Secret-Token: token" \
  go test -v ./test/integration/...

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Redundant with the Go integration tests which provide better coverage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds Multicall3Aggregation to the list of EVM config fields that are
merged from networkDefaults when a network has its own evm section.
This allows disabling multicall3 globally via networkDefaults.evm.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add runtime detection for contracts that revert when called via Multicall3
but succeed when called individually (e.g., contracts checking msg.sender
code size like Chronicle Oracle).

Features:
- New `autoDetectBypass` config option (default: false)
- Runtime bypass cache with thread-safe access
- Auto-retry reverted calls individually, add to bypass if succeeds
- Bounded 30s timeout for retry when original context cancelled
- Validation for bypassContracts addresses (40 hex chars)

Observability:
- `erpc_multicall3_runtime_bypass_total` - contracts auto-detected
- `erpc_multicall3_auto_detect_retry_total` - retry outcomes
  (attempt/detected/same_error)

Includes comprehensive tests for:
- Auto-detect enabled/disabled scenarios
- Concurrent cache access (race-safe)
- Config validation for bypass addresses

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@0x666c6f 0x666c6f force-pushed the feature/pla-471-multicall3-aggregation-for-eth_call-batches branch from fbbba3c to 49184c3 Compare January 21, 2026 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants