Skip to content

fix: snapshot middleware callbacks before execution#283

Open
fallintoplace wants to merge 1 commit into
NVIDIA:mainfrom
fallintoplace:fix/middleware-callback-locks
Open

fix: snapshot middleware callbacks before execution#283
fallintoplace wants to merge 1 commit into
NVIDIA:mainfrom
fallintoplace:fix/middleware-callback-locks

Conversation

@fallintoplace

@fallintoplace fallintoplace commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Overview

Middleware callbacks now run from per-call snapshots instead of borrowed registry entries, so callbacks can safely re-enter registry or scope APIs without holding the global registry or active scope stack locks.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Added snapshot helpers for tool and LLM sanitize guardrails and request intercepts.
  • Updated tool, non-streaming LLM, streaming LLM, and shared codec paths so they clone visible middleware while locked, then run callbacks after the locks are released.
  • Kept per-call snapshot semantics: callback registry mutations apply to later calls, not the in-progress chain.
  • Added lock-regression coverage for global and scope-local middleware across tool, LLM, and stream paths, including assertions that every registered callback executes.

Validation:

  • cargo test -p nemo-relay --test middleware_integration middleware_callbacks_run_without -- --nocapture
  • cargo test -p nemo-relay --test middleware_integration
  • cargo test -p nemo-relay
  • just test-rust
  • cargo fmt --all --check
  • git diff --check

Where should the reviewer start?

Start with crates/core/src/api/runtime/state.rs for the snapshot helpers, then crates/core/tests/integration/middleware_tests.rs for the lock-regression coverage.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • N/A

Summary by CodeRabbit

  • Refactor
    • Updated middleware guardrails and intercept handling for both tool and LLM flows to use a snapshot-based pipeline, ensuring consistent sanitization/interception and improving runtime efficiency.
    • Improved streaming END-event handling by separating snapshot preparation from final event construction and sanitization.
  • Tests
    • Added regression coverage to verify middleware lock-safety (no registry/scope write-lock held during callbacks) and to validate expected middleware callback execution order across tool and LLM pipelines (streaming and non-streaming).

@fallintoplace fallintoplace requested a review from a team as a code owner June 19, 2026 17:52
@copy-pr-bot

copy-pr-bot Bot commented Jun 19, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions Bot added size:L PR is large Bug issue describes bug; PR fixes bug lang:rust PR changes/introduces Rust code labels Jun 19, 2026
@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Refactors all tool and LLM middleware pipelines by splitting each combined *_chain method into two phases: a *_entries function that snapshots ordered registry entries (releasing the lock), and a static *_snapshot_chain evaluator that applies them. All call sites in tool.rs, llm.rs, shared.rs, and stream.rs are updated accordingly. Lock-regression integration tests verify the invariant.

Changes

Snapshot-chain middleware refactor

Layer / File(s) Summary
New snapshot+chain APIs in runtime state
crates/core/src/api/runtime/state.rs
Removes tool_sanitize_request_chain, tool_sanitize_response_chain, tool_request_intercepts_chain, llm_sanitize_request_chain, llm_sanitize_response_chain, and llm_request_intercepts_chain. Replaces each with a *_entries function that merges global + scope-local registries into an owned Vec snapshot, and a static *_snapshot_chain evaluator that iterates the snapshot, invokes each middleware payload, propagates errors, and honors break_chain.
Tool call sites adopt snapshot-chain APIs
crates/core/src/api/tool.rs
tool_call, tool_call_end, tool_call_execute, and tool_request_intercepts restructured: entries and subscribers are snapshotted first (releasing the registry lock), then tool_sanitize_request_snapshot_chain, tool_sanitize_response_snapshot_chain, and tool_request_intercepts_snapshot_chain are called outside any lock. Return signatures updated to carry snapshotted subscribers separately.
LLM call sites adopt snapshot-chain APIs
crates/core/src/api/llm.rs, crates/core/src/api/shared.rs, crates/core/src/stream.rs
emit_llm_start, llm_call_end_with_behavior, llm_request_intercepts, run_request_intercepts_with_codec, and LlmStreamWrapper::emit_end_event each separated into an entries-snapshot phase and a snapshot-chain evaluation phase; event construction, annotated-request/response decoding, and estimated-cost attachment behavior preserved.
Lock-regression tests and unit test update
crates/core/tests/integration/middleware_tests.rs, crates/core/tests/unit/context_tests.rs
Adds assert_no_lock helper verifying neither the global registry write-lock nor the scope-stack write-lock is held during callbacks. Two new #[tokio::test] lock regression tests cover the full tool and LLM middleware stacks (global + scope-local guardrails, intercepts, execution, streaming). Unit test updated to use llm_sanitize_request_entries + llm_sanitize_request_snapshot_chain.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title follows Conventional Commits format with 'fix' type, concise imperative summary, no scope (optional), no breaking change indicator needed, 51 characters under the 72-character limit, and no trailing period.
Description check ✅ Passed Description includes all required template sections: Overview (with legal confirmation and duplication check), Details (explaining changes and validation approach), Where should the reviewer start (specific file recommendations), and Related Issues (marked N/A).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/tests/integration/middleware_tests.rs`:
- Around line 1614-1725: Add tracking to verify that all registered callbacks
actually execute during the tool_call_execute test. Create a shared tracking
mechanism (such as a RefCell<Vec<String>> or similar) to record callback
execution labels. In each registered callback closure including
register_tool_conditional_execution_guardrail,
scope_register_tool_conditional_execution_guardrail,
register_tool_request_intercept, scope_register_tool_request_intercept,
register_tool_sanitize_request_guardrail,
scope_register_tool_sanitize_request_guardrail,
register_tool_execution_intercept, scope_register_tool_execution_intercept,
register_tool_sanitize_response_guardrail,
scope_register_tool_sanitize_response_guardrail, and the ToolExecutionNextFn
func closure, add code to record that the callback was invoked. After the
tool_call_execute completes, assert that the tracking list contains all expected
callback labels to ensure that no registered callbacks were dropped or skipped.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 42f49729-3d03-444b-9b5f-1500966629fb

📥 Commits

Reviewing files that changed from the base of the PR and between d5c2407 and d6550ad.

📒 Files selected for processing (7)
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/tests/unit/context_tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (17)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Implement behavior first in Rust core API modules: crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs

Files:

  • crates/core/src/api/shared.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/{tool,llm}.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

Wire the new middleware chain into the execute path in crates/core/src/api/tool.rs or crates/core/src/api/llm.rs at the appropriate pipeline stage

Files:

  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
crates/core/src/api/runtime/state.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

crates/core/src/api/runtime/state.rs: Add registry fields as SortedRegistry<GuardrailEntry<T>> or SortedRegistry<Intercept<T>> to NemoRelayContextState in crates/core/src/api/runtime/state.rs
Add chain execution helpers to NemoRelayContextState following the pattern of existing methods like tool_sanitize_request_chain or tool_request_intercepts_chain

Files:

  • crates/core/src/api/runtime/state.rs
🔇 Additional comments (8)
crates/core/src/api/runtime/state.rs (2)

595-604: Confirm the required Rust/core validation matrix.

This touches crates/core, so please record the required validation in the PR: cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, and the full affected binding matrix or a clear rationale for any skipped binding target.

As per coding guidelines, “Any Rust change must run just test-rust”, “Run Rust formatting with cargo fmt --all”, “Run cargo clippy --workspace --all-targets -- -D warnings”, and “Changes to crates/core or crates/adaptive must run the full language matrix”.

Source: Coding guidelines


604-633: LGTM!

Also applies to: 635-673, 763-811, 842-916, 1003-1058

crates/core/src/api/tool.rs (1)

217-256: LGTM!

Also applies to: 287-327, 434-449, 515-526

crates/core/src/api/llm.rs (1)

282-317: LGTM!

Also applies to: 426-484, 842-855

crates/core/src/api/shared.rs (1)

84-100: LGTM!

crates/core/src/stream.rs (1)

154-200: LGTM!

crates/core/tests/integration/middleware_tests.rs (1)

25-47: LGTM!

Also applies to: 95-107, 1727-1737, 1909-1920

crates/core/tests/unit/context_tests.rs (1)

282-284: LGTM!

Comment thread crates/core/tests/integration/middleware_tests.rs
@fallintoplace fallintoplace force-pushed the fix/middleware-callback-locks branch 3 times, most recently from da638c2 to 3b77121 Compare June 19, 2026 18:33

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/core/src/stream.rs (1)

154-174: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Snapshot stream response sanitizers when the stream is created.

emit_end_event waits until finalization to snapshot llm_sanitize_response_entries, so stream execution intercepts, the collector, or the finalizer can register a response sanitizer that applies to the same in-flight stream. Capture the response-sanitizer entries/subscribers in LlmStreamWrapper::new or llm_stream_call_execute and store the per-stream snapshot on the wrapper.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/stream.rs` around lines 154 - 174, The response sanitizer
entries and subscribers are currently being snapshotted too late in the
emit_end_event method, allowing stream execution, interceptors, collectors, or
finalizers to register sanitizers that incorrectly apply to the same in-flight
stream. Move the snapshot logic that currently appears in this block (which
captures llm_sanitize_response_entries and event subscribers) to either
LlmStreamWrapper::new or llm_stream_call_execute where the stream is initially
created, and store the resulting snapshot (entries and subscribers tuple) as a
field on the wrapper struct so it can be reused in emit_end_event instead of
recapturing it at finalization time.
crates/core/src/api/llm.rs (1)

282-315: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Capture LLM sanitizer snapshots before request/execution callbacks.

emit_llm_start snapshots request sanitizers after request-intercept callbacks, and llm_call_end_with_behavior snapshots response sanitizers after provider/execution callbacks. Those callbacks can re-enter and register sanitizers that affect the same call, violating per-call snapshot semantics. Pass precomputed sanitizer entries from the top-level LLM execution snapshot into these helpers.

Also applies to: 426-442

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/llm.rs` around lines 282 - 315, The `emit_llm_start`
function is capturing LLM sanitizer snapshots after request-intercept callbacks
have already executed, which allows those callbacks to register new sanitizers
that affect the snapshot, violating per-call snapshot semantics. Refactor this
by capturing the sanitizer entries earlier before any callbacks execute (at the
top-level LLM execution point), and then pass the precomputed entries as a
parameter to `emit_llm_start` instead of having the function compute them
internally using `scope_guard.collect_scope_local_registries`. Apply the same
fix to `llm_call_end_with_behavior` to capture response sanitizer snapshots
before provider/execution callbacks rather than after.
crates/core/src/api/tool.rs (1)

389-449: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Freeze all tool middleware snapshots before the first callback.

tool_conditional_execution_snapshot_chain runs before request-intercept entries are captured, and tool_call/execution snapshots happen after request-intercept callbacks. A callback can re-enter and register a request intercept, sanitizer, or execution intercept that affects this same tool_call_execute, contradicting the stated per-call snapshot contract. Snapshot the visible entries/subscribers for every tool phase up front, or thread a per-call snapshot through tool_call/tool_call_end.

Also applies to: 451-472

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/tool.rs` around lines 389 - 449, The issue is that tool
middleware snapshots are being captured at different points in time, allowing
re-entrant callbacks to register new middleware that affects the same tool call.
Move the snapshot capture for intercept_entries (currently captured via
collect_scope_local_registries for tool_request_intercepts and
tool_request_intercept_entries) to occur before any snapshot chain methods are
called, specifically before tool_conditional_execution_snapshot_chain is
invoked. This ensures all entries, subscribers, and intercepts are captured
atomically upfront, preventing callbacks from modifying the middleware visible
to the current tool call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/tests/integration/middleware_tests.rs`:
- Around line 1978-1980: The stream finalizer closure currently only records the
middleware callback via record_middleware_callback but is missing the lock-free
assertion that should be part of this tested code path. Add the same lock
assertion (that is likely being used elsewhere in the test) inside the finalizer
closure before returning the json response with stream true. This ensures the
test properly verifies that the stream finalizer callback operates without
acquiring locks, consistent with the expected behavior in the test labels.

---

Outside diff comments:
In `@crates/core/src/api/llm.rs`:
- Around line 282-315: The `emit_llm_start` function is capturing LLM sanitizer
snapshots after request-intercept callbacks have already executed, which allows
those callbacks to register new sanitizers that affect the snapshot, violating
per-call snapshot semantics. Refactor this by capturing the sanitizer entries
earlier before any callbacks execute (at the top-level LLM execution point), and
then pass the precomputed entries as a parameter to `emit_llm_start` instead of
having the function compute them internally using
`scope_guard.collect_scope_local_registries`. Apply the same fix to
`llm_call_end_with_behavior` to capture response sanitizer snapshots before
provider/execution callbacks rather than after.

In `@crates/core/src/api/tool.rs`:
- Around line 389-449: The issue is that tool middleware snapshots are being
captured at different points in time, allowing re-entrant callbacks to register
new middleware that affects the same tool call. Move the snapshot capture for
intercept_entries (currently captured via collect_scope_local_registries for
tool_request_intercepts and tool_request_intercept_entries) to occur before any
snapshot chain methods are called, specifically before
tool_conditional_execution_snapshot_chain is invoked. This ensures all entries,
subscribers, and intercepts are captured atomically upfront, preventing
callbacks from modifying the middleware visible to the current tool call.

In `@crates/core/src/stream.rs`:
- Around line 154-174: The response sanitizer entries and subscribers are
currently being snapshotted too late in the emit_end_event method, allowing
stream execution, interceptors, collectors, or finalizers to register sanitizers
that incorrectly apply to the same in-flight stream. Move the snapshot logic
that currently appears in this block (which captures
llm_sanitize_response_entries and event subscribers) to either
LlmStreamWrapper::new or llm_stream_call_execute where the stream is initially
created, and store the resulting snapshot (entries and subscribers tuple) as a
field on the wrapper struct so it can be reused in emit_end_event instead of
recapturing it at finalization time.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: b5b8b9f4-a865-4d6c-a841-9d4c2ea15e3c

📥 Commits

Reviewing files that changed from the base of the PR and between 3d24273 and da638c2.

📒 Files selected for processing (9)
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (21)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/runtime/state.rs
crates/ffi/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

crates/ffi/src/api/**/*.rs: Add or update FFI wrappers in relevant crates/ffi/src/api/*.rs modules, re-export through crates/ffi/src/api/mod.rs, and ensure generated crates/ffi/nemo_relay.h stays correct
Use nemo_relay_ prefix for C FFI function names (e.g., nemo_relay_tool_call)

Files:

  • crates/ffi/src/api/mod.rs
crates/ffi/**

📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)

Rebuild the FFI crate in release mode so the shared library and header stay in sync when making changes to crates/ffi

Files:

  • crates/ffi/src/api/mod.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
crates/ffi/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/ffi, also use test-ffi-surface for validation

Files:

  • crates/ffi/src/api/mod.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
crates/{python,ffi,node,wasm}/**/*

⚙️ CodeRabbit configuration file

crates/{python,ffi,node,wasm}/**/*: Treat binding changes as public API changes. Check for parity with the other language bindings, FFI ownership/lifetime safety,
callback error propagation, stable type conversion, and consistent async/stream semantics.
Flag changes that update one binding without corresponding tests or documentation for the same surface elsewhere.

Files:

  • crates/ffi/src/api/mod.rs
  • crates/ffi/tests/unit/api/registry_tests.rs
crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Implement behavior first in Rust core API modules: crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs

Files:

  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/{tool,llm}.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

Wire the new middleware chain into the execute path in crates/core/src/api/tool.rs or crates/core/src/api/llm.rs at the appropriate pipeline stage

Files:

  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
crates/core/src/api/runtime/state.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

crates/core/src/api/runtime/state.rs: Add registry fields as SortedRegistry<GuardrailEntry<T>> or SortedRegistry<Intercept<T>> to NemoRelayContextState in crates/core/src/api/runtime/state.rs
Add chain execution helpers to NemoRelayContextState following the pattern of existing methods like tool_sanitize_request_chain or tool_request_intercepts_chain

Files:

  • crates/core/src/api/runtime/state.rs
🔇 Additional comments (10)
crates/ffi/src/api/mod.rs (1)

126-131: LGTM!

Also applies to: 210-215

crates/ffi/tests/unit/api/registry_tests.rs (1)

324-348: LGTM!

Also applies to: 386-403

crates/core/src/api/runtime/state.rs (2)

595-604: Confirm the required Rust/core validation matrix ran.

This changes crates/core, so please attach results for cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, uv run pre-commit run --all-files, and the affected binding matrix, or note any intentionally skipped checks.

As per coding guidelines, “Any Rust change must run just test-rust”, “Any Rust change must run cargo fmt --all”, “Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings”, “Validate Rust code with uv run pre-commit run --all-files”, and “Changes to crates/core or crates/adaptive must run the full language matrix.”

Source: Coding guidelines


604-633: LGTM!

Also applies to: 644-673, 772-811, 851-878, 889-916, 1012-1058

crates/core/src/api/tool.rs (1)

217-237: LGTM!

Also applies to: 287-312, 515-527

crates/core/src/api/llm.rs (1)

840-857: LGTM!

crates/core/src/api/shared.rs (1)

84-100: LGTM!

crates/core/tests/integration/middleware_tests.rs (2)

25-47: LGTM!

Also applies to: 95-122


1625-1762: The code already has #![allow(clippy::await_holding_lock)] at the module level.

The module-level allow directive at line 11 of crates/core/tests/integration/middleware_tests.rs suppresses the clippy lint across all tests in the file, including both flagged tests. This pattern is consistent across all core test files and is the standard approach for test serialization in async tests that require blocking mutex guards to prevent concurrent test interference. No additional allow or changes are needed.

			> Likely an incorrect or invalid review comment.
crates/core/tests/unit/context_tests.rs (1)

282-284: LGTM!

Comment thread crates/core/tests/integration/middleware_tests.rs
@fallintoplace fallintoplace force-pushed the fix/middleware-callback-locks branch from 3b77121 to e2c6f39 Compare June 19, 2026 18:40

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/core/src/api/llm.rs (1)

282-299: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Take LLM call snapshots before request/execution callbacks can mutate registries.

emit_llm_start snapshots request sanitizers after request intercept callbacks have already run, and llm_call_end_with_behavior snapshots response sanitizers after provider/execution callbacks. Those callbacks can now mutate later-stage middleware and affect the same LLM call. Capture the LLM call’s sanitizer/intercept/execution/response snapshots in the managed call entrypoint before running callbacks, then pass them into start/end emission.

Also applies to: 426-442

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/llm.rs` around lines 282 - 299, The snapshot logic shown
for request sanitizers in the `emit_llm_start` function (and similar logic in
`llm_call_end_with_behavior` for response sanitizers) is being captured after
callbacks have already run, allowing those callbacks to mutate registries and
affect the snapshots. Move all snapshot collection (sanitizers, interceptors,
execution, and response handlers) to the managed call entrypoint before any
callbacks are executed, then pass these pre-captured snapshots as parameters
into `emit_llm_start` and `llm_call_end_with_behavior` instead of capturing them
internally within those functions.
crates/core/src/stream.rs (1)

149-165: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Snapshot stream response middleware before stream callbacks run.

The finalizer runs before lines 154-165 collect llm_sanitize_response entries, and collectors can run earlier during the same stream. Either callback can mutate response sanitizer registrations and change the current stream END event. Capture the response sanitizer/subscriber snapshot when the stream wrapper is created, or before any stream callback is invoked, and reuse it here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/stream.rs` around lines 149 - 165, The snapshot of response
sanitizer registrations and subscribers is being captured after the finalizer
has already executed, but the finalizer and other stream callbacks can mutate
the response sanitizer registrations before this snapshot is taken. Instead of
capturing the snapshot at lines 154-165 where collect_scope_local_registries and
collect_scope_local_subscribers are called, capture this snapshot earlier when
the stream wrapper is first created or before any stream callbacks are invoked,
store it as an instance variable, and then reuse that stored snapshot at this
location rather than recapturing it.
crates/core/src/api/tool.rs (1)

389-451: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Snapshot every tool pipeline stage before the first callback runs.

tool_call_execute runs conditional and request-intercept callbacks before later snapshots are taken by tool_call, execution-chain construction, and tool_call_end. A conditional/request/sanitize/execution callback can now register or deregister later-stage middleware and affect the same in-flight tool call, instead of only subsequent calls. Capture the tool call’s conditional, request-intercept, sanitize-request, execution, sanitize-response, and subscriber snapshots up front, then pass those snapshots through private start/end helpers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/tool.rs` around lines 389 - 451, Move all snapshot
collection to the very beginning of the tool_call_execute function before any
callbacks execute. Currently, snapshots for conditional execution and request
intercepts are taken at different points in the function. Consolidate the
snapshot collection to capture all stages upfront: conditional execution
guardrails, request-intercept entries, sanitize-request entries, execution
entries, sanitize-response entries, and subscribers. Store these snapshots in a
single data structure at the start of the function, then pass them through the
execution pipeline instead of collecting entries lazily during the
tool_conditional_execution_snapshot_chain and
tool_request_intercepts_snapshot_chain calls. This ensures callbacks cannot
register or deregister middleware that affects the same in-flight tool call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/core/src/api/llm.rs`:
- Around line 282-299: The snapshot logic shown for request sanitizers in the
`emit_llm_start` function (and similar logic in `llm_call_end_with_behavior` for
response sanitizers) is being captured after callbacks have already run,
allowing those callbacks to mutate registries and affect the snapshots. Move all
snapshot collection (sanitizers, interceptors, execution, and response handlers)
to the managed call entrypoint before any callbacks are executed, then pass
these pre-captured snapshots as parameters into `emit_llm_start` and
`llm_call_end_with_behavior` instead of capturing them internally within those
functions.

In `@crates/core/src/api/tool.rs`:
- Around line 389-451: Move all snapshot collection to the very beginning of the
tool_call_execute function before any callbacks execute. Currently, snapshots
for conditional execution and request intercepts are taken at different points
in the function. Consolidate the snapshot collection to capture all stages
upfront: conditional execution guardrails, request-intercept entries,
sanitize-request entries, execution entries, sanitize-response entries, and
subscribers. Store these snapshots in a single data structure at the start of
the function, then pass them through the execution pipeline instead of
collecting entries lazily during the tool_conditional_execution_snapshot_chain
and tool_request_intercepts_snapshot_chain calls. This ensures callbacks cannot
register or deregister middleware that affects the same in-flight tool call.

In `@crates/core/src/stream.rs`:
- Around line 149-165: The snapshot of response sanitizer registrations and
subscribers is being captured after the finalizer has already executed, but the
finalizer and other stream callbacks can mutate the response sanitizer
registrations before this snapshot is taken. Instead of capturing the snapshot
at lines 154-165 where collect_scope_local_registries and
collect_scope_local_subscribers are called, capture this snapshot earlier when
the stream wrapper is first created or before any stream callbacks are invoked,
store it as an instance variable, and then reuse that stored snapshot at this
location rather than recapturing it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 96eb9a52-8e38-4a9f-b4c3-7ef7e2535e28

📥 Commits

Reviewing files that changed from the base of the PR and between da638c2 and 3b77121.

📒 Files selected for processing (7)
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/tests/unit/context_tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (17)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Implement behavior first in Rust core API modules: crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs

Files:

  • crates/core/src/api/shared.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/{tool,llm}.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

Wire the new middleware chain into the execute path in crates/core/src/api/tool.rs or crates/core/src/api/llm.rs at the appropriate pipeline stage

Files:

  • crates/core/src/api/llm.rs
  • crates/core/src/api/tool.rs
crates/core/src/api/runtime/state.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

crates/core/src/api/runtime/state.rs: Add registry fields as SortedRegistry<GuardrailEntry<T>> or SortedRegistry<Intercept<T>> to NemoRelayContextState in crates/core/src/api/runtime/state.rs
Add chain execution helpers to NemoRelayContextState following the pattern of existing methods like tool_sanitize_request_chain or tool_request_intercepts_chain

Files:

  • crates/core/src/api/runtime/state.rs
🔇 Additional comments (4)
crates/core/tests/integration/middleware_tests.rs (1)

1978-1980: Duplicate: assert the stream finalizer is lock-free too.

The finalizer is in the expected callback set but still does not call assert_middleware_callback_locks_are_free() before returning.

crates/core/src/api/runtime/state.rs (1)

595-1058: Confirm the required Rust/core validation matrix.

This file changes crates/core runtime middleware semantics; please attach the exact validation results for Rust formatting, clippy, Rust tests, pre-commit, and the affected binding matrix.

As per coding guidelines, “Any Rust change must run just test-rust”, “Any Rust change must run cargo fmt --all”, “Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings”, “Validate Rust code with uv run pre-commit run --all-files”, and “Changes to crates/core or crates/adaptive must run the full language matrix.”

Source: Coding guidelines

crates/core/src/api/shared.rs (1)

84-100: LGTM!

crates/core/tests/unit/context_tests.rs (1)

282-284: LGTM!

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/core/src/api/tool.rs (1)

434-449: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Snapshot the whole tool pipeline before running callbacks.

tool_call_execute still snapshots later phases after earlier callbacks have executed. A conditional guardrail can register a request intercept that runs in this same call, a request intercept can register a sanitize-request guardrail observed by tool_call, and the tool callback/execution intercepts can register sanitize-response guardrails observed by tool_call_end. Capture the tool middleware snapshots needed for the managed call before invoking the first callback, then thread those snapshots through the start/end helpers or internal variants. The PR objective promises per-call snapshots where callback registry mutations affect subsequent calls only.

Also applies to: 451-460, 474-484

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/tool.rs` around lines 434 - 449, Move the snapshot
collection logic for tool request intercepts that is currently in the code block
(which reads the scope_stack, collects scope_local_registries for
tool_request_intercepts, gets the global_context, and calls
tool_request_intercept_entries) to execute before any callbacks are invoked in
the tool pipeline. Additionally, capture snapshots for all other tool middleware
phases (sanitize-request guardrails, sanitize-response guardrails, and tool
callback/execution intercepts) at the same early point. Thread these captured
snapshots through the relevant helper methods or internal variants so that
callback registry mutations during execution only affect subsequent calls, not
the current call being processed.
crates/core/src/stream.rs (1)

149-180: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Capture stream END middleware before stream callbacks can mutate it.

emit_end_event snapshots llm_sanitize_response_guardrails after stream execution, collector, and finalizer callbacks have already run. Since this PR makes those callbacks reentrant, they can register a response sanitizer that affects the same stream END event. Capture the END-event entries/subscribers in llm_stream_call_execute before invoking the stream callback path and store them on LlmStreamWrapper for finalization. The PR objective promises per-call snapshots where callback registry mutations affect subsequent calls only.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/stream.rs` around lines 149 - 180, The snapshot of
llm_sanitize_response_guardrails is currently captured in emit_end_event after
stream callbacks have already executed, allowing those callbacks to mutate the
guardrails and affect the same stream END event. Move the snapshot capture logic
that collects scope_local_registries with llm_sanitize_response_guardrails and
calls collect_event_subscribers and llm_sanitize_response_entries to
llm_stream_call_execute before invoking stream callbacks. Store the captured
entries and subscribers on the LlmStreamWrapper struct, then in the current
emit_end_event code, replace the snapshot collection block with code that
retrieves the pre-captured snapshot from LlmStreamWrapper instead of calling
collect_scope_local_registries and global_context again.
crates/core/src/api/llm.rs (1)

282-299: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Keep LLM sanitize snapshots isolated from earlier callbacks.

These snapshots are still taken at start/end emission time, after request intercepts and execution callbacks have had a chance to re-enter registry APIs. That lets an in-flight LLM callback register a sanitize-request or sanitize-response guardrail and have it affect the same call. Capture the LLM middleware snapshots for the managed call before any middleware/provider callback runs, then pass the snapshots into start/end emission. The PR objective promises per-call snapshots where callback registry mutations affect subsequent calls only.

Also applies to: 426-447

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/core/src/api/llm.rs` around lines 282 - 299, The LLM sanitize
snapshots are currently being captured after request intercepts and execution
callbacks have run, which allows in-flight callbacks to register guardrails that
affect the current call. Move the snapshot capture block that collects
scope_stack, scope_locals, and subscribers (the entire block starting with
current_scope_stack() through llm_sanitize_request_entries) to execute before
any middleware or provider callbacks run. Then pass these pre-captured snapshots
into the NemoRelayContextState::llm_sanitize_request_snapshot_chain call. Apply
the same fix to the corresponding code location around lines 426-447 that also
captures LLM middleware snapshots. This ensures callback registry mutations only
affect subsequent calls, not the current one being processed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/tests/integration/middleware_tests.rs`:
- Around line 1623-2042: The tests
test_tool_middleware_callbacks_run_without_registry_or_scope_locks and
test_llm_middleware_callbacks_run_without_registry_or_scope_locks verify that
callbacks are lock-free, but they do not test mutation-isolation or snapshot
semantics (i.e., that registering a new guardrail/intercept from within a
callback does not affect the current execution). Add a callback that registers a
new guardrail/intercept mid-execution, verify that the newly registered
guardrail/intercept does not appear in the current execution's callback sequence
by checking the assertions in assert_middleware_callback_labels, then execute
the tool or llm call again and verify the newly registered guardrail/intercept
now appears in the second execution's callback labels to confirm per-call
snapshot isolation behavior.

---

Outside diff comments:
In `@crates/core/src/api/llm.rs`:
- Around line 282-299: The LLM sanitize snapshots are currently being captured
after request intercepts and execution callbacks have run, which allows
in-flight callbacks to register guardrails that affect the current call. Move
the snapshot capture block that collects scope_stack, scope_locals, and
subscribers (the entire block starting with current_scope_stack() through
llm_sanitize_request_entries) to execute before any middleware or provider
callbacks run. Then pass these pre-captured snapshots into the
NemoRelayContextState::llm_sanitize_request_snapshot_chain call. Apply the same
fix to the corresponding code location around lines 426-447 that also captures
LLM middleware snapshots. This ensures callback registry mutations only affect
subsequent calls, not the current one being processed.

In `@crates/core/src/api/tool.rs`:
- Around line 434-449: Move the snapshot collection logic for tool request
intercepts that is currently in the code block (which reads the scope_stack,
collects scope_local_registries for tool_request_intercepts, gets the
global_context, and calls tool_request_intercept_entries) to execute before any
callbacks are invoked in the tool pipeline. Additionally, capture snapshots for
all other tool middleware phases (sanitize-request guardrails, sanitize-response
guardrails, and tool callback/execution intercepts) at the same early point.
Thread these captured snapshots through the relevant helper methods or internal
variants so that callback registry mutations during execution only affect
subsequent calls, not the current call being processed.

In `@crates/core/src/stream.rs`:
- Around line 149-180: The snapshot of llm_sanitize_response_guardrails is
currently captured in emit_end_event after stream callbacks have already
executed, allowing those callbacks to mutate the guardrails and affect the same
stream END event. Move the snapshot capture logic that collects
scope_local_registries with llm_sanitize_response_guardrails and calls
collect_event_subscribers and llm_sanitize_response_entries to
llm_stream_call_execute before invoking stream callbacks. Store the captured
entries and subscribers on the LlmStreamWrapper struct, then in the current
emit_end_event code, replace the snapshot collection block with code that
retrieves the pre-captured snapshot from LlmStreamWrapper instead of calling
collect_scope_local_registries and global_context again.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 8eb4f540-9b7a-4eb2-abb6-82f7e2f691a2

📥 Commits

Reviewing files that changed from the base of the PR and between 3b77121 and e2c6f39.

📒 Files selected for processing (7)
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/stream.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/tests/unit/context_tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (17)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/tests/integration/middleware_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/core/tests/unit/context_tests.rs
  • crates/core/src/stream.rs
  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/tests/integration/middleware_tests.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Implement behavior first in Rust core API modules: crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs

Files:

  • crates/core/src/api/shared.rs
  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
  • crates/core/src/api/runtime/state.rs
crates/core/src/api/{tool,llm}.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

Wire the new middleware chain into the execute path in crates/core/src/api/tool.rs or crates/core/src/api/llm.rs at the appropriate pipeline stage

Files:

  • crates/core/src/api/tool.rs
  • crates/core/src/api/llm.rs
crates/core/src/api/runtime/state.rs

📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)

crates/core/src/api/runtime/state.rs: Add registry fields as SortedRegistry<GuardrailEntry<T>> or SortedRegistry<Intercept<T>> to NemoRelayContextState in crates/core/src/api/runtime/state.rs
Add chain execution helpers to NemoRelayContextState following the pattern of existing methods like tool_sanitize_request_chain or tool_request_intercepts_chain

Files:

  • crates/core/src/api/runtime/state.rs
🔇 Additional comments (5)
crates/core/src/api/runtime/state.rs (2)

595-633: LGTM!

Also applies to: 635-673, 763-811, 842-878, 880-916, 1003-1058


595-1058: Run and record the required Rust/core validation.

This core runtime refactor changes shared middleware semantics; please confirm cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, validate-change, and the affected/full language matrix passed. As per coding guidelines, "**/*.rs: Any Rust change must run just test-rust", "Run Rust formatting with cargo fmt --all", "Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings", "crates/core/**/*.rs: If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation", and "Changes to crates/core or crates/adaptive must run the full language matrix".

Source: Coding guidelines

crates/core/src/api/tool.rs (1)

214-527: Confirm the required Rust validation matrix ran.

Please make sure the PR evidence includes cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, just test-rust, and the required binding matrix for crates/core runtime changes. As per coding guidelines, “Any Rust change must run just test-rust,” “Run cargo fmt --all,” “Run cargo clippy --workspace --all-targets -- -D warnings,” and “Changes to crates/core or crates/adaptive must run the full language matrix.”

Source: Coding guidelines

crates/core/src/api/shared.rs (1)

84-100: LGTM!

crates/core/tests/unit/context_tests.rs (1)

282-284: LGTM!

Comment thread crates/core/tests/integration/middleware_tests.rs
Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
@fallintoplace fallintoplace force-pushed the fix/middleware-callback-locks branch from e2c6f39 to cea35c0 Compare June 19, 2026 19:32
@github-actions github-actions Bot added size:XL PR is extra large and removed size:L PR is large labels Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug issue describes bug; PR fixes bug lang:rust PR changes/introduces Rust code size:XL PR is extra large

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant