feat: add topology-aware adaptive controls by teerthsharma · Pull Request #282 · NVIDIA/NeMo-Relay

teerthsharma · 2026-06-18T05:55:43Z

Overview

This pull request introduces POC/aether: a topology-aware adaptive control layer for NeMo Relay's adaptive runtime. The contribution treats an agent run as a sequence of observable runtime states and uses lightweight topological summaries to decide when convergence, drift, and hint-governor decisions should change execution behavior.

At epoch t, let the runtime observation be

x_t = (scope_t, tool_t, latency_t, error_t, retry_t, branch_t, metadata_t)

and let S_t be the bounded sketch of recent observations retained by the adaptive runtime. The topology crate maps S_t to a compact signature

beta_t = (beta_0(t), beta_1(t))

where beta_0 approximates the number of connected components in the local observation cloud and beta_1 approximates the number of loop-like structures. This PR deliberately keeps that contract as a deterministic, bounded runtime approximation rather than claiming exact persistent homology.
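To make the (beta_0, beta_1) contract concrete, here is a minimal sketch of one way such a signature could be computed. It is not the crate's implementation: the `signature` function, the 2-D point type, and the `radius` parameter are all hypothetical, and unlike the allocation-free primitives in this PR it uses a `Vec`. It links points closer than `radius`, counts components with union-find, and uses the graph cycle rank E - V + C as a stand-in for beta_1 (which counts every triangle as a loop, so it is only an approximation).

```rust
/// Betti-style summary of a neighbourhood graph: (components, independent cycles).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BettiSignature {
    pub beta_0: u32,
    pub beta_1: u32,
}

/// Union-find lookup with path halving.
fn find(parent: &mut [usize], mut i: usize) -> usize {
    while parent[i] != i {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    i
}

/// Hypothetical helper: connect points within `radius` and summarise the graph.
pub fn signature(points: &[[f64; 2]], radius: f64) -> BettiSignature {
    let n = points.len();
    let mut parent: Vec<usize> = (0..n).collect();
    let mut components = n;
    let mut edges = 0usize;
    for i in 0..n {
        for j in (i + 1)..n {
            let dx = points[i][0] - points[j][0];
            let dy = points[i][1] - points[j][1];
            if dx * dx + dy * dy <= radius * radius {
                edges += 1;
                let (a, b) = (find(&mut parent, i), find(&mut parent, j));
                if a != b {
                    parent[a] = b;
                    components -= 1;
                }
            }
        }
    }
    // Cycle rank of a graph is E - V + C, which is never negative.
    let cycles = edges + components - n;
    BettiSignature { beta_0: components as u32, beta_1: cycles as u32 }
}
```

For four points on the corners of a unit square with `radius = 1.0`, the four sides become edges and the diagonals do not, so the sketch reports one component and one loop.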

The adaptive layer then uses three scalar decisions:

d_beta(t) = || beta_t - beta_{t-1} ||_1
D_t       = normalized_drift(S_t, S_{t-1})
E_t       = observed_error_or_loss_t

A convergence window of size W is accepted when

forall i in [t-W+1, t]: d_beta(i) = 0
and max(D_i) <= epsilon_d
and max(E_i) <= epsilon_e
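
The acceptance rule above can be sketched as follows. This is an illustration under assumptions, not the `ConvergenceDetector` from this PR: the `Betti`, `Epoch`, `betti_distance`, and `window_converged` names are hypothetical, and the history is a plain slice rather than a bounded ring buffer. Note that W distances need W + 1 retained epochs, because d_beta(i) compares epoch i with epoch i - 1.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Betti {
    pub b0: u32,
    pub b1: u32,
}

/// d_beta: L1 distance between two signatures, saturating instead of overflowing.
pub fn betti_distance(a: Betti, b: Betti) -> u32 {
    a.b0.abs_diff(b.b0).saturating_add(a.b1.abs_diff(b.b1))
}

#[derive(Debug, Clone, Copy)]
pub struct Epoch {
    pub betti: Betti,
    pub drift: f64,
    pub error: f64,
}

/// True when each of the last `window` epochs has d_beta = 0,
/// drift <= eps_d, and error <= eps_e. Non-finite values never converge.
pub fn window_converged(history: &[Epoch], window: usize, eps_d: f64, eps_e: f64) -> bool {
    // `window` distances require one extra predecessor epoch.
    if window == 0 || history.len() < window + 1 {
        return false;
    }
    let start = history.len() - window;
    (start..history.len()).all(|i| {
        let e = history[i];
        betti_distance(e.betti, history[i - 1].betti) == 0
            && e.drift.is_finite()
            && e.drift <= eps_d
            && e.error.is_finite()
            && e.error <= eps_e
    })
}
```

With W = 3, four identical epochs below both thresholds are accepted, while three epochs are not, since the first distance in the window cannot be formed.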

A drift-aware tool-parallelism plan is invalidated when

D_t >= tau_d

and the hint-governor threshold evolves by the finite-state update

epsilon_{t+1} = clamp(epsilon_t + alpha * deviation_delta_t + beta * derivative_t,
                      epsilon_min,
                      epsilon_max)

with explicit guards for NaN, infinity, zero/negative time deltas, and saturating Betti-distance arithmetic. The practical result is a reviewable adaptive surface that can react to runtime shape changes without rewriting agent frameworks or changing the scope/event model.
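
A minimal sketch of the clamped update with the guards listed above, assuming a hypothetical `Governor` struct rather than the PR's `GeometricGovernor`. The field names are illustrative, the gain `beta` here is the derivative gain (unrelated to the Betti signature), and the sketch assumes `eps_min <= eps_max`, since `f64::clamp` panics otherwise.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Governor {
    pub epsilon: f64,
    pub alpha: f64,      // proportional gain
    pub beta: f64,       // derivative gain
    pub eps_min: f64,    // assumed finite and <= eps_max
    pub eps_max: f64,
    pub last_error: f64,
}

impl Governor {
    /// Apply one update and return the current epsilon.
    /// NaN, infinite, or non-positive-dt input leaves the state unchanged.
    pub fn adapt(&mut self, error: f64, dt: f64) -> f64 {
        if !error.is_finite() || !dt.is_finite() || dt <= 0.0 {
            return self.epsilon;
        }
        let derivative = (error - self.last_error) / dt;
        let next = self.epsilon + self.alpha * error + self.beta * derivative;
        // Large gains can still overflow to infinity; skip the update if so.
        if next.is_finite() {
            self.epsilon = next.clamp(self.eps_min, self.eps_max);
            self.last_error = error;
        }
        self.epsilon
    }
}
```

Ignoring a bad sample instead of resetting the threshold is a design choice of this sketch; a real implementation might prefer to fall back to a default epsilon.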

I confirm this contribution is my own work, or I have the right to submit it under this project's license.
I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

This change adds topology-aware controls across the Rust source of truth and the primary bindings:

Rust adaptive runtime: adds convergence, drift, and governor config to the adaptive plugin contract, rejects non-finite and otherwise invalid values during validation, and connects topology-derived decisions to adaptive hints, tool parallelism, ACG stability, and runtime feature reporting.
Rust topology primitives: hardens ConvergenceDetector and GeometricGovernor against non-finite inputs at the public API and invalid windows/thresholds, and uses saturating arithmetic for Betti distances.
Python binding: exposes GovernorConfig, DriftConfig, and ConvergenceConfig through python/nemo_relay/adaptive.py and the .pyi stubs.
Node binding: mirrors the topology-aware adaptive config helpers in crates/node/adaptive.js and adaptive.d.ts.
Go binding: mirrors the adaptive config structs and constructors in both the root package and go/nemo_relay/adaptive package.
Documentation: updates the adaptive plugin docs to describe topology-inspired convergence, drift invalidation, and governor controls without overstating the approximation as exact persistent homology.
Validation support: adds focused unit/integration tests and a Criterion convergence benchmark.
CI hygiene: updates the ty pre-commit hook with --force-exclude so vendored third-party snapshots remain outside type-check scope when the hook is invoked as a full-project command.

The control equations are intentionally small:

signature:      beta_t = f(S_t)
distance:       d_beta(t) = |beta_0(t)-beta_0(t-1)| + |beta_1(t)-beta_1(t-1)|
convergence:    C_t = 1[d_beta(t-W+1..t)=0 and D<=epsilon_d and E<=epsilon_e]
drift:          R_t = 1[D_t >= tau_d]
governor:       epsilon_{t+1} = clamp(epsilon_t + alpha*e_t + beta*(e_t-e_{t-1})/dt, epsilon_min, epsilon_max)

This keeps the runtime behavior inspectable: every decision reduces to bounded state, finite scalar thresholds, and existing NeMo Relay scope/plugin semantics.

Local validation copied from the real CI workflows and recipes:

cargo fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo test -p nemo-relay-adaptive-topology
cargo test -p nemo-relay-adaptive --lib
cargo test -p nemo-relay-adaptive --test topology_convergence
cargo test -p nemo-relay-adaptive --test tool_parallelism_plan
cargo test -p nemo-relay-adaptive --test runtime_integration
cargo test -p nemo-relay-cli --bin nemo-relay -- --test-threads=1
just --set ci true --set output_dir . test-python
just --set ci true --set output_dir . test-node
just --set ci true --set output_dir . test-wasm
cargo nextest run --workspace --profile ci --no-fail-fast
cargo llvm-cov report --ignore-filename-regex '.*/tests/.*\.rs$' --cobertura
WSL/Linux: cd go/nemo_relay && go test -v -coverprofile=coverage.out ./...
WSL/Linux: cd go/nemo_relay && go vet ./...
WSL/Linux: cargo deny check
WSL/Linux: cargo about generate --format json -m Cargo.toml --all-features --workspace --fail
uv run pre-commit run ty --all-files
uv run pre-commit run --files <changed files>

Observed results:

Rust nextest workspace: 2082 passed / 2082 run
Python tests: 389 passed
PyO3 Rust tests: 62 passed
Node tests: 244 passed
WASM JS tests: 80 passed
Go tests: passed across ./...
Go vet: passed across ./...
cargo deny: advisories ok, bans ok, licenses ok, sources ok
cargo about: generated non-empty attribution JSON

Windows-only notes from local reproduction:

GitHub's Go CI job treats Windows as continue-on-error. Local Windows CGo also required extra MSVC/MinGW linker setup, so the authoritative Go validation above was run under WSL/Linux with Go 1.26.1.
The Windows cargo-deny binary panicked while parsing the local advisory DB entry RUSTSEC-2020-0105; the same cargo-deny version passed under WSL/Linux.
The repository shell attribution wrapper is LF-sensitive under WSL when the Windows checkout presents CRLF, so the underlying attribution generator was run directly and updated ATTRIBUTIONS-Rust.md.

Where should the reviewer start?

Start with the runtime contract and validation path:

crates/adaptive/src/config.rs
crates/adaptive/src/plugin_component.rs
crates/adaptive/src/runtime/validation.rs
crates/adaptive/src/adaptive_hints_intercept.rs
crates/adaptive/src/tool_parallelism_learner.rs
crates/adaptive/tests/unit/config_tests.rs
crates/adaptive/tests/unit/plugin_component_tests.rs
crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs
crates/adaptive/tests/unit/tool_parallelism_learner_tests.rs

Then check binding parity:

python/nemo_relay/adaptive.py
python/nemo_relay/adaptive.pyi
crates/node/adaptive.js
crates/node/adaptive.d.ts
go/nemo_relay/adaptive.go
go/nemo_relay/adaptive/adaptive.go

The core safety decision is in the finite-input handling for convergence and governor math:

crates/adaptive-topology/src/convergence.rs
crates/adaptive-topology/src/governor.rs

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Relates to: none

Summary by CodeRabbit

New Features
- Added topology-aware adaptive controls: deterministic convergence detection for early stopping, drift detection to invalidate stale tool plans, and a hint governor for load-shedding.
- Extended adaptive configuration with new governor, drift, and convergence options across adaptive hints, tool parallelism, and ACG.
- Added Python and Node.js bindings/helpers for the new topology primitives.
Bug Fixes
- Stability results now reliably default the new converged flag to false when missing.
Documentation
- Updated adaptive plugin docs and configuration examples for the new topology-aware settings.
Tests
- Added integration/unit tests and a convergence benchmark, including Python primitive coverage.

copy-pr-bot · 2026-06-18T05:55:46Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-06-18T05:55:59Z

Walkthrough

Introduces a new nemo-relay-adaptive-topology Rust crate with allocation-free topology primitives (GeometricGovernor, DriftDetector, ConvergenceDetector, manifold/geometry helpers, topology verification). These primitives are wired into the adaptive plugin: ACG learner gains optional convergence early-stop via Betti-number stability, hint intercept gains governor-based load shedding, and tool-parallelism learner gains drift-based plan invalidation. Config types, validation, and SDK bindings (Python PyO3, Node.js, Go) are extended across all languages. Tests, benchmark, and documentation are added throughout.

Changes

Topology-Aware Adaptive Controls

Layer / File(s)	Summary
adaptive-topology crate: primitives and module structure `Cargo.toml`, `crates/adaptive-topology/Cargo.toml`, `crates/adaptive-topology/README.md`, `crates/adaptive-topology/src/{governor,drift,convergence,topology,manifold,geometry,lib}.rs`	New `nemo-relay-adaptive-topology` workspace crate with six modules. `GeometricGovernor` implements PD-style epsilon adaptation to regulate effective tick rate. `DriftDetector<D>` tracks centroid velocity and detects sudden drift via L2 distance between predicted and observed positions. `ConvergenceDetector` uses Betti-number stability, drift monotonicity, and error threshold to gate early-stop convergence. `TopologicalShape` computes β₀/β₁ signatures from byte sequences and verifies density/loop bounds. `ManifoldPoint`, `TimeDelayEmbedder`, `SparseAttentionGraph`, and `GeometricConcentrator` provide manifold and point-neighborhood primitives. `HierarchicalBlockTree` implements 3-level, fan-in-4 block hierarchies for pruning queries. Includes crate-private serde helpers for const-generic array serialization and serde round-trip test coverage.
Configuration schema: typed GovernorConfig, DriftConfig, ConvergenceConfig `crates/adaptive/src/config.rs`, `crates/adaptive/src/lib.rs`, `crates/node/adaptive.{d.ts,js}`, `go/nemo_relay/adaptive.go`, `go/nemo_relay/adaptive/adaptive.go`, `python/nemo_relay/adaptive.py`, `python/nemo_relay/adaptive.pyi`	Adds `GovernorConfig`, `DriftConfig`, and `ConvergenceConfig` typed structs across Rust, Node.js, Go, and Python. Each struct includes an `enabled` flag and numeric tuning parameters (epsilon, threshold, stability_window). Wires optional fields into `AdaptiveConfig`, `AdaptiveHintsComponentConfig`, `ToolParallelismComponentConfig`, and `AcgComponentConfig`. Includes editor schema registration, serde/JSON field defaults, and config builder factory functions in each language.
Stability field and ACG learner convergence integration `crates/adaptive/src/acg/stability.rs`, `crates/adaptive/src/acg_learner.rs`	Adds `converged: bool` field to `StabilityAnalysisResult` with `#[serde(default)]`. Extends `AcgLearner` with `new_with_convergence` constructor, per-profile `ConvergenceDetector` registry (Arc-RwLock protected), `record_stability_epoch` helper that maps stability metrics to `BettiNumbers` and gates convergence on minimum epoch count. Implements fast-path caching: when backend-loaded stability is already `converged`, reuses cached counts and skips re-analysis.
Governor load shedding and drift-based plan invalidation `crates/adaptive/src/adaptive_hints_intercept.rs`, `crates/adaptive/src/tool_parallelism_learner.rs`	`HintGovernor` wraps `GeometricGovernor` with `last_seen` timestamp tracking. Hint injection becomes conditional via `should_inject_hints`: forced when manual latency-sensitivity override is present, otherwise governor-gated if enabled (defaulting to inject when governor is absent). `ToolParallelismLearner` gains `new_with_drift` constructor and `record_cohort_drift` helper; when drift exceeds configured threshold, stored execution plan is discarded and rebuilt from empty state rather than loaded.
Runtime integration: features, validation, plugin config `crates/adaptive/src/runtime/features.rs`, `crates/adaptive/src/runtime/validation.rs`, `crates/adaptive/src/plugin_component.rs`, `crates/adaptive/src/lib.rs`, `crates/adaptive/Cargo.toml`	Threads topology config from runtime through `TelemetryFeature::new` → `build_learners`, switching learner constructors to `new_with_drift` and `new_with_convergence` variants. Adds `validate_convergence` (enforces epsilon positive-finite and stability_window ≥ 3) and `validate_positive_finite` validation helpers. Expands plugin_component field allowlists with governor/drift/convergence nested validation. Introduces `GLOBAL_RUNTIME_TEST_MUTEX` for coordinated test synchronization.
Python PyO3 bindings and adaptive_topology module `crates/python/Cargo.toml`, `crates/python/src/{lib.rs,py_adaptive_topology.rs}`, `python/nemo_relay/{adaptive_topology.py,adaptive_topology.pyi,__init__.py,__init__.pyi}`, `python/nemo_relay/_native.pyi`	Implements PyO3 wrapper classes `PyGeometricGovernor`, `PyConvergenceDetector`, and `PyDriftDetector` (3D), each exposing constructor, adapt/record/update methods, state query getters, reset, and `__repr__` formatting. Registers wrappers into `_native` module. Exposes new `nemo_relay.adaptive_topology` Python module re-exporting topology classes with matching type stubs.
Tests, benchmarks, and fixture updates `crates/adaptive/benches/convergence_bench.rs`, `crates/adaptive/tests/integration/{topology_convergence,tool_parallelism_plan}_tests.rs`, `crates/adaptive/tests/unit/`, `crates/python/tests/coverage/`, `crates/node/tests/adaptive_tests.mjs`, `go/nemo_relay/adaptive_test.go`, `python/tests/test_adaptive*.py`	Adds Criterion convergence benchmark comparing observation counts with/without convergence detection. Adds topology_convergence integration test validating convergence timing (after stability_window, before observation_window exhausted) and cached reuse. Adds drift invalidation test verifying plan rebuild on cohort topology drift. Adds hints governor unit test (shed vs. manual override). Updates all `StabilityAnalysisResult` fixtures to set `converged: false`. Migrates runtime tests to `GLOBAL_RUNTIME_TEST_MUTEX` for isolation.
User-facing documentation `crates/adaptive/README.md`, `docs/adaptive-plugin/{about,acg,adaptive-hints,configuration}.mdx`	Extends README with topology-aware capability bullet. Adds ACG convergence field tables (epsilon, stability_window with defaults and constraints), hints governor field tables, and drift field tables. Includes per-language configuration examples (plugins.toml, Python, Node.js, Rust) showing topology settings disabled by default. Adds new "Topology-Aware Controls" section with defaults reference table.

Pre-commit Configuration Adjustment

Layer / File(s)	Summary
ty hook exclude pattern update `.pre-commit-config.yaml`	Changes `--exclude` patterns in `ty` hook from `**`-style globs to directory-suffix style (`docs/`, `fern/`, `third_party/`, `examples/`, `.cache/`, `.claude/`). Adds `--force-exclude` flag before exclude patterns.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant AcgLearner
  participant ConvergenceDetector
  participant Backend

  Client->>AcgLearner: process_run(run_record)
  AcgLearner->>Backend: load_stability(profile_key)
  alt stability.converged == true
    Backend-->>AcgLearner: cached converged result
    AcgLearner-->>Client: reuse cached (skip re-analysis)
  else not converged
    AcgLearner->>AcgLearner: analyze_stability(window)
    AcgLearner->>ConvergenceDetector: record_epoch(BettiNumbers, drift, error)
    ConvergenceDetector-->>AcgLearner: is_converged, epoch_count
    alt detector confirms convergence
      AcgLearner->>Backend: store_stability(converged=true)
    else still learning
      AcgLearner->>Backend: store_stability(converged=false)
    end
    Backend-->>AcgLearner: updated result
    AcgLearner-->>Client: new stability result
  end

sequenceDiagram
  participant LLMClient
  participant AdaptiveHintsIntercept
  participant HintGovernor
  participant GeometricGovernor
  participant HotCache

  LLMClient->>AdaptiveHintsIntercept: intercept_request(request)
  AdaptiveHintsIntercept->>HotCache: lookup_hints(agent_id)
  HotCache-->>AdaptiveHintsIntercept: AgentHints | None
  AdaptiveHintsIntercept->>AdaptiveHintsIntercept: should_inject_hints(hints)
  alt manual latency_sensitivity override set
    AdaptiveHintsIntercept-->>LLMClient: inject unconditionally
  else governor enabled
    AdaptiveHintsIntercept->>HintGovernor: allow(latency_sensitivity)
    HintGovernor->>GeometricGovernor: should_trigger(deviation)
    GeometricGovernor-->>HintGovernor: bool (epsilon check)
    alt should_trigger true
      HintGovernor->>GeometricGovernor: adapt(observed_rate, dt)
      HintGovernor-->>AdaptiveHintsIntercept: allowed=true
    else blocked by governor
      HintGovernor-->>AdaptiveHintsIntercept: allowed=false
    end
    AdaptiveHintsIntercept-->>LLMClient: inject if allowed
  else no governor
    AdaptiveHintsIntercept-->>LLMClient: inject hints
  end
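
The three branches of the second diagram reduce to a small decision function. The sketch below is an assumption-laden illustration, not the code in `adaptive_hints_intercept.rs`: the `Gate` trait, `ThresholdGate`, and the parameter names are hypothetical, and the governor's `should_trigger`/`adapt` steps are collapsed into a single `allow` call.

```rust
/// Hypothetical stand-in for the governor; only the allow/deny decision is modelled.
pub trait Gate {
    fn allow(&mut self, latency_sensitivity: f64) -> bool;
}

/// Toy gate: allow when the signal reaches the current epsilon.
pub struct ThresholdGate {
    pub epsilon: f64,
}

impl Gate for ThresholdGate {
    fn allow(&mut self, latency_sensitivity: f64) -> bool {
        latency_sensitivity >= self.epsilon
    }
}

pub fn should_inject_hints(
    manual_override: Option<f64>,
    learned_sensitivity: f64,
    governor: Option<&mut dyn Gate>,
) -> bool {
    // A manual latency-sensitivity override always injects.
    if manual_override.is_some() {
        return true;
    }
    match governor {
        // Governor enabled: inject only if it allows this request.
        Some(g) => g.allow(learned_sensitivity),
        // No governor configured: inject, matching the pre-existing behaviour.
        None => true,
    }
}
```

Checking the override first means an operator-set value can never be shed by the governor, which is the ordering the diagram shows.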

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	Title follows Conventional Commits format with lowercase type, no scope, concise imperative summary under 72 characters, and accurately reflects the main change of adding topology-aware adaptive controls.
Description check	✅ Passed	Description includes overview, detailed change explanation, reviewer guidance, and confirmation checkboxes. All required template sections are present and substantive.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/adaptive/tests/unit/storage_tests.rs (1)
82-94: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Assert converged in stability round-trip tests.

The fixture now includes converged, but the round-trip test only checks stable_prefix_length and total_observations. Add an assertion for loaded_stability.converged to lock this storage contract, since runtime behavior reads this field.
Proposed test assertion
     assert_eq!(loaded_stability.stable_prefix_length, 1);
     assert_eq!(loaded_stability.total_observations, 3);
+    assert!(!loaded_stability.converged);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/adaptive/tests/unit/storage_tests.rs` around lines 82 - 94, In the
round-trip test that uses the sample_stability fixture, add an assertion to
verify that loaded_stability.converged equals the value set in the
sample_stability function (which is false). This assertion should be added
alongside the existing assertions for stable_prefix_length and
total_observations to ensure the converged field is properly persisted and
loaded during the serialization round-trip.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/adaptive-topology/src/convergence.rs`:
- Around line 150-156: The stability_window parameter in the new method is only
being clamped to a minimum value using max(MIN_STABILITY_WINDOW), but is not
clamped to a maximum value. Since the RingBuffer instances (betti_history,
drift_history, error_history) have a fixed capacity of MAX_HISTORY (32), if
stability_window exceeds this capacity, methods like is_betti_stable and
is_drift_decreasing will never satisfy their length checks, making
topology-based convergence unreachable. Apply both minimum and maximum clamping
to stability_window by using a method that clamps it between
MIN_STABILITY_WINDOW and MAX_HISTORY.
- Line 166: The `epoch` field is typed as `u32` and the increment operation on
line 166 (self.epoch += 1) can cause integer overflow and wraparound in
long-running detectors, breaking epoch-based gating and diagnostics. Change the
type of the `epoch` field from `u32` to `u64` throughout the convergence module
to prevent wraparound and ensure reliable epoch tracking for the lifetime of the
detector.

In `@crates/adaptive-topology/src/geometry.rs`:
- Around line 252-257: The pruning_ratio method is counting true values across
the entire active_mask array, but only the first self.counts[0] entries
represent actual populated level-0 blocks. When the mask contains true entries
beyond self.counts[0], this causes the calculation to produce invalid ratios.
Restrict the active count calculation to only the first self.counts[0] elements
of active_mask by using slice notation to iterate over
active_mask[..self.counts[0]] instead of the full active_mask.

In `@crates/adaptive-topology/src/manifold.rs`:
- Around line 91-100: The assertion in the `new` method uses `D * tau` as the
validation threshold, but the actual embedding definition only requires `(D - 1)
* tau + 1` samples. Update the assertion logic to use the correct formula `(D -
1) * tau + 1` instead of `D * tau`, and also apply the same correction to the
embed readiness threshold check around lines 121-123 to ensure consistency and
avoid unnecessarily delaying valid embeddings.
- Around line 239-246: The issue in the compute_betti_0 function is that nodes
are not marked as visited when pushed onto the stack, only when popped, causing
the same neighbor to be pushed multiple times. Under dense connectivity, this
wastes the fixed-size stack space and can cause the stack to overflow, skipping
reachable nodes and producing incorrect Betti number calculations. Fix this by
marking a neighbor as visited immediately when it is pushed onto the stack in
the section where stack[stack_top] = neighbor is executed, rather than deferring
the visited marking until pop-time, to prevent duplicate pushes of the same
neighbor.

In `@crates/adaptive/src/acg_learner.rs`:
- Around line 185-208: The issue is that store_stability is being called before
store_observations, which means if observation storage fails after stability is
marked as converged, the next run will skip observation repair permanently. To
fix this, locate all places where store_stability and store_observations are
called together (including the instances at lines 223-229 and 235-237 mentioned
in the comment), and reorder these calls so that store_observations is always
called before store_stability. This ensures that if observation storage fails,
the profile won't yet be marked as converged and can be retried on the next run.

In `@python/nemo_relay/_native.pyi`:
- Around line 1235-1236: The DriftDetector.update method stub currently accepts
Sequence[float] for the centroid parameter, but the native binding requires a
fixed 3-element array. Change the centroid parameter type from Sequence[float]
to a fixed-size type representation (such as a tuple of exactly three floats) to
match the native binding's expectations. Additionally, update the docstring to
explicitly clarify that the centroid must be a 3-dimensional coordinate with
exactly three float values.

---

Outside diff comments:
In `@crates/adaptive/tests/unit/storage_tests.rs`:
- Around line 82-94: In the round-trip test that uses the sample_stability
fixture, add an assertion to verify that loaded_stability.converged equals the
value set in the sample_stability function (which is false). This assertion
should be added alongside the existing assertions for stable_prefix_length and
total_observations to ensure the converged field is properly persisted and
loaded during the serialization round-trip.
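
The `compute_betti_0` finding in the prompt above (mark a neighbour as visited when it is pushed, not when it is popped) can be illustrated with a fixed-capacity traversal. This is a sketch under assumptions: `MAX_NODES`, the adjacency-matrix representation, and `count_components` are hypothetical and are not taken from `manifold.rs`.

```rust
pub const MAX_NODES: usize = 8; // hypothetical fixed capacity

/// Count connected components with a fixed-size stack.
/// Nodes are marked visited at push time, so each node is pushed at most once
/// and the stack can never need more than MAX_NODES slots.
pub fn count_components(adj: &[[bool; MAX_NODES]; MAX_NODES], n: usize) -> usize {
    let n = n.min(MAX_NODES);
    let mut visited = [false; MAX_NODES];
    let mut stack = [0usize; MAX_NODES];
    let mut components = 0;
    for start in 0..n {
        if visited[start] {
            continue;
        }
        components += 1;
        visited[start] = true;
        stack[0] = start;
        let mut top = 1;
        while top > 0 {
            top -= 1;
            let node = stack[top];
            for next in 0..n {
                if adj[node][next] && !visited[next] {
                    visited[next] = true; // mark on push, not on pop
                    stack[top] = next;
                    top += 1;
                }
            }
        }
    }
    components
}
```

On a fully connected graph, marking at pop time would push the same neighbour once per incident edge; marking at push time keeps the number of pushes bounded by the node count.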

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 8b2fd9ec-2173-410e-a7fa-73fc991c831b

📥 Commits

Reviewing files that changed from the base of the PR and between d5c2407 and da2470a.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (66)

.pre-commit-config.yaml
ATTRIBUTIONS-Rust.md
Cargo.toml
crates/adaptive-topology/Cargo.toml
crates/adaptive-topology/README.md
crates/adaptive-topology/src/convergence.rs
crates/adaptive-topology/src/drift.rs
crates/adaptive-topology/src/geometry.rs
crates/adaptive-topology/src/governor.rs
crates/adaptive-topology/src/lib.rs
crates/adaptive-topology/src/manifold.rs
crates/adaptive-topology/src/topology.rs
crates/adaptive/Cargo.toml
crates/adaptive/README.md
crates/adaptive/benches/convergence_bench.rs
crates/adaptive/src/acg/stability.rs
crates/adaptive/src/acg_learner.rs
crates/adaptive/src/adaptive_hints_intercept.rs
crates/adaptive/src/config.rs
crates/adaptive/src/lib.rs
crates/adaptive/src/plugin_component.rs
crates/adaptive/src/runtime/features.rs
crates/adaptive/src/runtime/validation.rs
crates/adaptive/src/tool_parallelism_learner.rs
crates/adaptive/tests/integration/runtime_integration_tests.rs
crates/adaptive/tests/integration/tool_parallelism_plan_tests.rs
crates/adaptive/tests/integration/topology_convergence_tests.rs
crates/adaptive/tests/unit/acg/economics_internal_tests.rs
crates/adaptive/tests/unit/acg/economics_policy_tests.rs
crates/adaptive/tests/unit/acg/multi_breakpoint_tests.rs
crates/adaptive/tests/unit/acg_component_tests.rs
crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs
crates/adaptive/tests/unit/cache_diagnostics_tests.rs
crates/adaptive/tests/unit/config_tests.rs
crates/adaptive/tests/unit/intercepts_tests.rs
crates/adaptive/tests/unit/plugin_component_tests.rs
crates/adaptive/tests/unit/runtime_features_tests.rs
crates/adaptive/tests/unit/runtime_tests.rs
crates/adaptive/tests/unit/storage_memory_internal_tests.rs
crates/adaptive/tests/unit/storage_tests.rs
crates/adaptive/tests/unit/tool_parallelism_learner_tests.rs
crates/adaptive/tests/unit/types_tests.rs
crates/node/adaptive.d.ts
crates/node/adaptive.js
crates/node/tests/adaptive_tests.mjs
crates/python/Cargo.toml
crates/python/src/lib.rs
crates/python/src/py_adaptive_topology.rs
crates/python/tests/coverage/py_storage_coverage_tests.rs
docs/adaptive-plugin/about.mdx
docs/adaptive-plugin/acg.mdx
docs/adaptive-plugin/adaptive-hints.mdx
docs/adaptive-plugin/configuration.mdx
go/nemo_relay/adaptive.go
go/nemo_relay/adaptive/adaptive.go
go/nemo_relay/adaptive_test.go
python/nemo_relay/__init__.py
python/nemo_relay/__init__.pyi
python/nemo_relay/_native.pyi
python/nemo_relay/adaptive.py
python/nemo_relay/adaptive.pyi
python/nemo_relay/adaptive_topology.py
python/nemo_relay/adaptive_topology.pyi
python/tests/test_adaptive.py
python/tests/test_adaptive_config.py
python/tests/test_adaptive_topology.py

coderabbitai · 2026-06-18T06:11:24Z

+    pub fn new(epsilon: f64, stability_window: usize) -> Self {
+        Self {
+            betti_history: RingBuffer::new(),
+            drift_history: RingBuffer::new(),
+            error_history: RingBuffer::new(),
+            stability_window: stability_window.max(MIN_STABILITY_WINDOW),
+            epsilon: sanitize_positive(epsilon, DEFAULT_EPSILON),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clamp stability_window to ring-buffer capacity.

At Line 155, only a minimum clamp is applied. If callers pass a value larger than MAX_HISTORY (32), is_betti_stable and is_drift_decreasing can never satisfy their length checks, so topology-based convergence becomes unreachable.

Proposed fix

-            stability_window: stability_window.max(MIN_STABILITY_WINDOW),
+            stability_window: stability_window.clamp(MIN_STABILITY_WINDOW, MAX_HISTORY),
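
A small sketch of why the upper clamp matters. `MAX_HISTORY = 32` comes from the comment above; the value of `MIN_STABILITY_WINDOW` is an assumption (3, borrowed from the `stability_window ≥ 3` validation rule), and `effective_window`, `buffer_len_after`, and `can_ever_fill` are hypothetical helpers that model a ring buffer whose length saturates at its capacity.

```rust
pub const MIN_STABILITY_WINDOW: usize = 3; // assumed value
pub const MAX_HISTORY: usize = 32; // ring-buffer capacity cited in the review

/// Clamp a requested window into the range the ring buffers can satisfy.
pub fn effective_window(requested: usize) -> usize {
    requested.clamp(MIN_STABILITY_WINDOW, MAX_HISTORY)
}

/// Length of a capacity-bounded ring buffer after `pushes` insertions.
pub fn buffer_len_after(pushes: usize) -> usize {
    pushes.min(MAX_HISTORY)
}

/// Whether a `len >= window` check can ever pass, however many epochs arrive.
pub fn can_ever_fill(window: usize) -> bool {
    buffer_len_after(usize::MAX) >= window
}
```

A window of 1000 with only the minimum clamp can never be filled, while the same request passed through `effective_window` becomes 32 and is reachable.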

coderabbitai · 2026-06-18T06:11:25Z

+        self.betti_history.push(betti);
+        self.drift_history.push(sanitize_non_negative(drift));
+        self.error_history.push(sanitize_non_negative(error));
+        self.epoch += 1;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent epoch wraparound in long-running detectors.

Line 166 uses u32 += 1. This can overflow and wrap, which can break downstream epoch-based gating and diagnostics.

Proposed fix

-        self.epoch += 1;
+        self.epoch = self.epoch.saturating_add(1);
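
For reference, a plain `+= 1` on a `u32` at `u32::MAX` panics in debug builds and wraps to 0 in release builds by default, while `saturating_add` pins the counter at the maximum. The helper names below are hypothetical and only illustrate the effect on an epoch-count gate.

```rust
/// Models the release-build behaviour of `epoch += 1`.
pub fn next_epoch_wrapping(epoch: u32) -> u32 {
    epoch.wrapping_add(1)
}

/// The proposed fix: stay at u32::MAX instead of wrapping.
pub fn next_epoch_saturating(epoch: u32) -> u32 {
    epoch.saturating_add(1)
}

/// Hypothetical gate of the form "at least `min` epochs have been recorded".
pub fn min_epochs_reached(epoch: u32, min: u32) -> bool {
    epoch >= min
}
```

After a wrap the gate would report that the minimum epoch count has not been reached even though the detector has run for billions of epochs. Widening the field to `u64`, which the agent prompt for this finding requests, is a different remedy from the saturating fix shown in the diff; either avoids the wrap.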

coderabbitai · 2026-06-18T06:11:25Z

+    pub fn pruning_ratio(&self, active_mask: &[bool; MAX_BLOCKS]) -> f64 {
+        if self.counts[0] == 0 {
+            return 0.0;
+        }
+        let active = active_mask.iter().filter(|&&x| x).count();
+        1.0 - (active as f64 / self.counts[0] as f64)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Restrict pruning math to populated level-0 blocks.

At Line 256, active is counted across all MAX_BLOCKS. For caller-provided masks, true entries beyond self.counts[0] can produce invalid ratios (including negatives). Count only the tracked prefix.

Proposed fix

-        let active = active_mask.iter().filter(|&&x| x).count();
+        let active = active_mask[..self.counts[0]]
+            .iter()
+            .filter(|&&x| x)
+            .count();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    pub fn pruning_ratio(&self, active_mask: &[bool; MAX_BLOCKS]) -> f64 {
-        if self.counts[0] == 0 {
-            return 0.0;
-        }
-        let active = active_mask.iter().filter(|&&x| x).count();
-        1.0 - (active as f64 / self.counts[0] as f64)
+    pub fn pruning_ratio(&self, active_mask: &[bool; MAX_BLOCKS]) -> f64 {
+        if self.counts[0] == 0 {
+            return 0.0;
+        }
+        let active = active_mask[..self.counts[0]]
+            .iter()
+            .filter(|&&x| x)
+            .count();
+        1.0 - (active as f64 / self.counts[0] as f64)
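
A standalone sketch of the difference, using free functions in place of the method. `MAX_BLOCKS = 8` and the `populated` parameter (standing in for `self.counts[0]`) are assumptions for illustration. The sketch also bounds the prefix with `min(MAX_BLOCKS)`, an extra guard that is not part of the proposed fix, because slicing past the array length would panic.

```rust
pub const MAX_BLOCKS: usize = 8; // hypothetical capacity

/// Original behaviour: counts active flags across the whole mask.
pub fn pruning_ratio_full(mask: &[bool; MAX_BLOCKS], populated: usize) -> f64 {
    if populated == 0 {
        return 0.0;
    }
    let active = mask.iter().filter(|&&x| x).count();
    1.0 - (active as f64 / populated as f64)
}

/// Suggested behaviour: counts only the populated prefix of the mask.
pub fn pruning_ratio_prefix(mask: &[bool; MAX_BLOCKS], populated: usize) -> f64 {
    if populated == 0 {
        return 0.0;
    }
    let tracked = populated.min(MAX_BLOCKS); // extra guard, see lead-in
    let active = mask[..tracked].iter().filter(|&&x| x).count();
    1.0 - (active as f64 / populated as f64)
}
```

With two populated blocks and a mask of `[true, false, true, true, true, true, true, true]`, the full count gives 1 - 7/2 = -2.5, while the prefix count gives 1 - 1/2 = 0.5.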

coderabbitai · 2026-06-18T06:11:25Z

+    /// Panics if `D * tau` exceeds `EMBED_BUFFER_CAPACITY`. This is a
+    /// programming error, not a runtime failure.
+    pub fn new(tau: usize) -> Self {
+        let tau = if tau == 0 { DEFAULT_TAU } else { tau };
+        assert!(
+            D * tau <= EMBED_BUFFER_CAPACITY,
+            "TimeDelayEmbedder D * tau ({}) exceeds EMBED_BUFFER_CAPACITY ({})",
+            D * tau,
+            EMBED_BUFFER_CAPACITY
+        );


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Embed readiness threshold is too strict for tau > 1.

Line 121 uses D * tau, but the embedding definition in Line 74 only needs samples through x(t-(D-1)tau), i.e. (D - 1) * tau + 1 samples. The stricter check delays valid embeddings whenever tau > 1 and over-restricts constructor validation.

Proposed fix

-        assert!(
-            D * tau <= EMBED_BUFFER_CAPACITY,
-            "TimeDelayEmbedder D * tau ({}) exceeds EMBED_BUFFER_CAPACITY ({})",
-            D * tau,
-            EMBED_BUFFER_CAPACITY
-        );
+        let required = if D == 0 {
+            0
+        } else {
+            (D - 1).saturating_mul(tau).saturating_add(1)
+        };
+        assert!(
+            required <= EMBED_BUFFER_CAPACITY,
+            "TimeDelayEmbedder required samples ({}) exceed EMBED_BUFFER_CAPACITY ({})",
+            required,
+            EMBED_BUFFER_CAPACITY
+        );
@@
-        let required = D * self.tau;
+        let required = if D == 0 {
+            0
+        } else {
+            (D - 1).saturating_mul(self.tau).saturating_add(1)
+        };
         if self.buffer_len < required {
             return None;
         }

Also applies to: 121-123

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive-topology/src/manifold.rs` around lines 91 - 100, The assertion in the `new` method uses `D * tau` as the validation threshold, but the actual embedding definition only requires `(D - 1) * tau + 1` samples. Update the assertion logic to use the correct formula `(D - 1) * tau + 1` instead of `D * tau`, and also apply the same correction to the embed readiness threshold check around lines 121-123 to ensure consistency and avoid unnecessarily delaying valid embeddings.
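
The sample count can be checked with a small worked example. The sketch below is an illustration only: `required_samples` and the slice-based `embed` are hypothetical helpers, not the `TimeDelayEmbedder` API, and they assume the history slice ends with the newest sample x(t).

```rust
/// Samples needed to form [x(t), x(t - tau), ..., x(t - (D - 1) * tau)].
pub fn required_samples(d: usize, tau: usize) -> usize {
    if d == 0 {
        0
    } else {
        (d - 1).saturating_mul(tau).saturating_add(1)
    }
}

/// Builds the delay vector from a history slice whose last element is x(t).
/// Returns `None` until enough samples have been observed.
pub fn embed<const D: usize>(history: &[f64], tau: usize) -> Option<[f64; D]> {
    if D == 0 || tau == 0 || history.len() < required_samples(D, tau) {
        return None;
    }
    let newest = history.len() - 1;
    let mut point = [0.0; D];
    for (k, slot) in point.iter_mut().enumerate() {
        // The oldest index touched is newest - (D - 1) * tau, which is >= 0
        // because of the length check above.
        *slot = history[newest - k * tau];
    }
    Some(point)
}
```

For D = 3 and tau = 2 the vector reads x(t), x(t-2), and x(t-4), so five samples suffice, whereas a `D * tau` threshold would wait for six.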

coderabbitai · 2026-06-18T06:11:25Z

+                for (neighbor, is_visited) in visited.iter().enumerate().take(self.point_count) {
+                    if !is_visited
+                        && self.are_neighbors(current, neighbor)
+                        && stack_top < MAX_GRAPH_POINTS
+                    {
+                        stack[stack_top] = neighbor;
+                        stack_top += 1;
+                    }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

compute_betti_0 can overcount components due to duplicate stack pushes.

At Line 239, nodes are treated as unvisited until pop-time, so the same neighbor can be pushed repeatedly. Under dense connectivity this can fill the fixed stack and skip first-time reachable nodes, producing incorrect β₀.

Proposed fix

-            let mut stack = [0usize; MAX_GRAPH_POINTS];
-            let mut stack_top = 1;
-            stack[0] = start;
+            let mut stack = [0usize; MAX_GRAPH_POINTS];
+            let mut stack_top = 1;
+            stack[0] = start;
+            visited[start] = true;
             while stack_top > 0 {
                 stack_top -= 1;
                 let current = stack[stack_top];
-
-                if visited[current] {
-                    continue;
-                }
-                visited[current] = true;
-
-                for (neighbor, is_visited) in visited.iter().enumerate().take(self.point_count) {
-                    if !is_visited
-                        && self.are_neighbors(current, neighbor)
-                        && stack_top < MAX_GRAPH_POINTS
-                    {
-                        stack[stack_top] = neighbor;
-                        stack_top += 1;
-                    }
+                for neighbor in 0..self.point_count {
+                    if !visited[neighbor] && self.are_neighbors(current, neighbor) {
+                        visited[neighbor] = true;
+                        stack[stack_top] = neighbor;
+                        stack_top += 1;
+                    }
                 }
             }
         }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive-topology/src/manifold.rs` around lines 239 - 246, The issue in the compute_betti_0 function is that nodes are not marked as visited when pushed onto the stack, only when popped, causing the same neighbor to be pushed multiple times. Under dense connectivity, this wastes the fixed-size stack space and can cause the stack to overflow, skipping reachable nodes and producing incorrect Betti number calculations. Fix this by marking a neighbor as visited immediately when it is pushed onto the stack in the section where stack[stack_top] = neighbor is executed, rather than deferring the visited marking until pop-time, to prevent duplicate pushes of the same neighbor.
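
The push-time marking described above can be sketched as a standalone component counter. This is a hedged illustration rather than the crate's `compute_betti_0`: the free function `count_components`, the closure-based adjacency test, and the `MAX_GRAPH_POINTS` value of 16 are all assumptions.

```rust
/// Hypothetical bound; the real `MAX_GRAPH_POINTS` is defined in manifold.rs.
pub const MAX_GRAPH_POINTS: usize = 16;

/// Counts connected components of an undirected graph described by an
/// adjacency predicate, using a fixed-size explicit stack.
pub fn count_components(
    point_count: usize,
    are_neighbors: impl Fn(usize, usize) -> bool,
) -> usize {
    let n = point_count.min(MAX_GRAPH_POINTS);
    let mut visited = [false; MAX_GRAPH_POINTS];
    let mut components = 0;

    for start in 0..n {
        if visited[start] {
            continue;
        }
        components += 1;

        let mut stack = [0usize; MAX_GRAPH_POINTS];
        let mut stack_top = 1;
        stack[0] = start;
        visited[start] = true;

        while stack_top > 0 {
            stack_top -= 1;
            let current = stack[stack_top];
            for neighbor in 0..n {
                if !visited[neighbor] && are_neighbors(current, neighbor) {
                    // Marking at push time means each vertex is pushed at
                    // most once, so the stack never holds more than n entries.
                    visited[neighbor] = true;
                    stack[stack_top] = neighbor;
                    stack_top += 1;
                }
            }
        }
    }
    components
}
```

Because every vertex enters the stack at most once, the capacity check on `stack_top` becomes unnecessary, and a fully connected graph of `MAX_GRAPH_POINTS` vertices still fits.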

Signed-off-by: teerth sharma <teerths57@gmail.com>

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/adaptive/tests/unit/storage_tests.rs (1)

243-253: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Assert the new converged field in the stability round-trip test.

Line 244 loads the full stability record, but the test never checks converged. Add an explicit assertion so storage regressions on this field are caught.

Suggested diff

     assert_eq!(loaded_stability.stable_prefix_length, 1);
     assert_eq!(loaded_stability.total_observations, 3);
+    assert!(!loaded_stability.converged);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/adaptive/tests/unit/storage_tests.rs` around lines 243 - 253, The
stability round-trip test loads the full stability record into the
loaded_stability variable but does not assert the converged field, which could
allow storage regressions on this field to go undetected. Add an explicit
assertion immediately after the existing stability assertions (after the
assertion for loaded_stability.total_observations) to verify the converged field
has the expected value, ensuring the field is properly persisted and retrieved
during the round-trip.

♻️ Duplicate comments (2)

crates/adaptive-topology/src/manifold.rs (2)

91-100: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the mathematically correct embedding readiness threshold.

Line 91 and Line 121 use D * tau, but the embedding in Line 74 ([x(t), x(t-tau), ..., x(t-(D-1)tau)]) only requires (D - 1) * tau + 1 samples. Current logic rejects valid embeddings for tau > 1.

Proposed fix

     pub fn new(tau: usize) -> Self {
         let tau = if tau == 0 { DEFAULT_TAU } else { tau };
+        let required = if D == 0 {
+            0
+        } else {
+            (D - 1).saturating_mul(tau).saturating_add(1)
+        };
         assert!(
-            D * tau <= EMBED_BUFFER_CAPACITY,
-            "TimeDelayEmbedder D * tau ({}) exceeds EMBED_BUFFER_CAPACITY ({})",
-            D * tau,
+            required <= EMBED_BUFFER_CAPACITY,
+            "TimeDelayEmbedder required samples ({}) exceed EMBED_BUFFER_CAPACITY ({})",
+            required,
             EMBED_BUFFER_CAPACITY
         );
@@
     pub fn embed(&self) -> Option<ManifoldPoint<D>> {
-        let required = D * self.tau;
+        let required = if D == 0 {
+            0
+        } else {
+            (D - 1).saturating_mul(self.tau).saturating_add(1)
+        };
         if self.buffer_len < required {
             return None;
         }

Also applies to: 121-123

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/adaptive-topology/src/manifold.rs` around lines 91 - 100, The
embedding readiness threshold is mathematically incorrect. The embedding pattern
requires (D - 1) * tau + 1 samples, not D * tau. Update the assertion in the new
method (where tau is validated) to use (D - 1) * tau + 1 instead of D * tau.
Also apply the same fix to lines 121-123, which have the same incorrect threshold
check. This correction will allow valid embeddings to proceed instead of
rejecting them unnecessarily when tau is greater than 1.

225-246: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mark vertices visited at push-time in DFS to avoid duplicate stack entries.

In Lines 239-246, neighbors are marked visited only after pop. That allows duplicate pushes for the same vertex, which can consume the fixed stack and skip reachable vertices, producing incorrect β₀.

Proposed fix

             components += 1;
             let mut stack = [0usize; MAX_GRAPH_POINTS];
             let mut stack_top = 1;
             stack[0] = start;
+            visited[start] = true;

             while stack_top > 0 {
                 stack_top -= 1;
                 let current = stack[stack_top];

-                if visited[current] {
-                    continue;
-                }
-                visited[current] = true;
-
-                for (neighbor, is_visited) in visited.iter().enumerate().take(self.point_count) {
-                    if !is_visited
-                        && self.are_neighbors(current, neighbor)
-                        && stack_top < MAX_GRAPH_POINTS
-                    {
+                for neighbor in 0..self.point_count {
+                    if !visited[neighbor] && self.are_neighbors(current, neighbor) {
+                        visited[neighbor] = true;
                         stack[stack_top] = neighbor;
                         stack_top += 1;
                     }
                 }
             }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/adaptive-topology/src/manifold.rs` around lines 225 - 246, The DFS
traversal marks vertices as visited only after they are popped from the stack
(when visited[current] is set to true), which allows the same vertex to be
pushed onto the stack multiple times before it gets processed. This causes
duplicate stack entries that can overflow the fixed-size stack and skip
reachable vertices, resulting in incorrect component counting. Move the visited
marking to happen when pushing neighbors onto the stack rather than when popping
them. In the neighbor iteration loop where neighbors are added to the stack
(around lines 239-246), set visited[neighbor] = true immediately before pushing
the neighbor onto the stack to prevent duplicate pushes of the same vertex.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/adaptive-topology/README.md`:
- Around line 33-35: The code example block in the README.md file starts
immediately after the "## Example" heading without a complete-sentence
introduction. Add a descriptive sentence between the heading and the opening
triple backticks that explains what the code example demonstrates, following the
documentation style guideline that requires every code block to be introduced
with a complete sentence.

In `@crates/adaptive-topology/src/convergence.rs`:
- Around line 237-239: In the betti_score calculation within the
convergence_score method, the variations count is being normalized by
self.stability_window when it should be normalized by the actual number of
transitions. Since variations are counted from window[..count].windows(2), which
produces count - 1 adjacent pairs, change the denominator in the betti_score
assignment from self.stability_window to (count - 1) to correctly normalize the
variation ratio.

In `@crates/adaptive-topology/src/drift.rs`:
- Around line 40-60: Add a validation check at the beginning of the update
method to ensure all values in the centroid array are finite (not NaN or Inf).
If any non-finite value is detected, return early with a drift value of 0.0 to
prevent non-finite values from being written into the internal state fields
(previous, expected, velocity). This guards against NaN values persisting in the
drift and velocity calculations, which would disable meaningful drift checks.

In `@crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs`:
- Around line 361-362: The current assertion only verifies that the agent hints
header is absent from the request, but it does not verify that the agent hints
data is also absent from the request body. To harden the test, add a negative
assertion that checks the request body does not contain the nvext.agent_hints
field after the existing header check on line 361. This ensures that if a
regression accidentally injects nvext.agent_hints in the body while skipping the
header, the test will catch it and fail as expected.

In `@crates/adaptive/tests/unit/runtime_tests.rs`:
- Around line 322-375: The test
validate_config_reports_invalid_topology_numeric_fields should verify
field-level diagnostics for all invalid fields within components, not just
component-level diagnostics. Currently it checks for component paths like
"adaptive_hints.governor" and "acg.convergence", but it should also assert that
diagnostics exist for specific invalid fields within those components. For the
acg.convergence and convergence components which each have multiple invalid
fields (epsilon and stability_window), add additional assertions to the
report.diagnostics iteration to verify that field-level diagnostics are reported
for paths like "acg.convergence.epsilon", "acg.convergence.stability_window",
"convergence.epsilon", and "convergence.stability_window" to ensure all invalid
numeric fields are being validated and prevent silent validator regressions.

In `@docs/adaptive-plugin/about.mdx`:
- Around line 39-40: The documentation uses inconsistent terminology for the
same concept: "topology-aware" appears in one location while "topology-inspired"
appears in another location (in the line mentioning "topology-inspired
signals"). Standardize on a single term throughout the document by identifying
all instances of both "topology-aware" and "topology-inspired" and replacing
them with one consistent term. Ensure the chosen term is applied uniformly
across the entire file to maintain clarity and avoid ambiguity for readers.

In `@docs/adaptive-plugin/configuration.mdx`:
- Around line 221-233: The Rust example for tool_parallelism configuration uses
ToolParallelismComponentConfig::default() implicitly, while the TOML, Python,
and Node examples all explicitly demonstrate the drift configuration. Update the
line setting adaptive.tool_parallelism to explicitly define the drift field
configuration instead of relying on the default() method, ensuring the Rust
example maintains parity with the other language examples in the documentation.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ccb1a467-92d2-498a-ae76-c078c7204100

📥 Commits

Reviewing files that changed from the base of the PR and between da2470a and 24ed6f2.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (66)

.pre-commit-config.yaml
ATTRIBUTIONS-Rust.md
Cargo.toml
crates/adaptive-topology/Cargo.toml
crates/adaptive-topology/README.md
crates/adaptive-topology/src/convergence.rs
crates/adaptive-topology/src/drift.rs
crates/adaptive-topology/src/geometry.rs
crates/adaptive-topology/src/governor.rs
crates/adaptive-topology/src/lib.rs
crates/adaptive-topology/src/manifold.rs
crates/adaptive-topology/src/topology.rs
crates/adaptive/Cargo.toml
crates/adaptive/README.md
crates/adaptive/benches/convergence_bench.rs
crates/adaptive/src/acg/stability.rs
crates/adaptive/src/acg_learner.rs
crates/adaptive/src/adaptive_hints_intercept.rs
crates/adaptive/src/config.rs
crates/adaptive/src/lib.rs
crates/adaptive/src/plugin_component.rs
crates/adaptive/src/runtime/features.rs
crates/adaptive/src/runtime/validation.rs
crates/adaptive/src/tool_parallelism_learner.rs
crates/adaptive/tests/integration/runtime_integration_tests.rs
crates/adaptive/tests/integration/tool_parallelism_plan_tests.rs
crates/adaptive/tests/integration/topology_convergence_tests.rs
crates/adaptive/tests/unit/acg/economics_internal_tests.rs
crates/adaptive/tests/unit/acg/economics_policy_tests.rs
crates/adaptive/tests/unit/acg/multi_breakpoint_tests.rs
crates/adaptive/tests/unit/acg_component_tests.rs
crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs
crates/adaptive/tests/unit/cache_diagnostics_tests.rs
crates/adaptive/tests/unit/config_tests.rs
crates/adaptive/tests/unit/intercepts_tests.rs
crates/adaptive/tests/unit/plugin_component_tests.rs
crates/adaptive/tests/unit/runtime_features_tests.rs
crates/adaptive/tests/unit/runtime_tests.rs
crates/adaptive/tests/unit/storage_memory_internal_tests.rs
crates/adaptive/tests/unit/storage_tests.rs
crates/adaptive/tests/unit/tool_parallelism_learner_tests.rs
crates/adaptive/tests/unit/types_tests.rs
crates/node/adaptive.d.ts
crates/node/adaptive.js
crates/node/tests/adaptive_tests.mjs
crates/python/Cargo.toml
crates/python/src/lib.rs
crates/python/src/py_adaptive_topology.rs
crates/python/tests/coverage/py_storage_coverage_tests.rs
docs/adaptive-plugin/about.mdx
docs/adaptive-plugin/acg.mdx
docs/adaptive-plugin/adaptive-hints.mdx
docs/adaptive-plugin/configuration.mdx
go/nemo_relay/adaptive.go
go/nemo_relay/adaptive/adaptive.go
go/nemo_relay/adaptive_test.go
python/nemo_relay/__init__.py
python/nemo_relay/__init__.pyi
python/nemo_relay/_native.pyi
python/nemo_relay/adaptive.py
python/nemo_relay/adaptive.pyi
python/nemo_relay/adaptive_topology.py
python/nemo_relay/adaptive_topology.pyi
python/tests/test_adaptive.py
python/tests/test_adaptive_config.py
python/tests/test_adaptive_topology.py

coderabbitai · 2026-06-18T06:53:10Z

+## Example
+
+```rust


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a complete-sentence lead-in before the example code block.

Line [35] starts a fenced block immediately after the heading, which violates the doc style requirement for code-block introductions.

Suggested fix

## Example +The following example shows basic usage of the governor, drift detector, and convergence detector. + ```rust use nemo_relay_adaptive_topology::{GeometricGovernor, DriftDetector, ConvergenceDetector, BettiNumbers};

As per coding guidelines, "Introduce every code block with a complete sentence."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive-topology/README.md` around lines 33 - 35, The code example block in the README.md file starts immediately after the "## Example" heading without a complete-sentence introduction. Add a descriptive sentence between the heading and the opening triple backticks that explains what the code example demonstrates, following the documentation style guideline that requires every code block to be introduced with a complete sentence.

Source: Coding guidelines

coderabbitai · 2026-06-18T06:53:10Z

+            let variations = window[..count].windows(2).filter(|w| w[0] != w[1]).count();
+            let betti_score = 1.0 - (variations as f64 / self.stability_window as f64);
+            score += BETTI_STABILITY_WEIGHT * betti_score;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize Betti variation by transition count, not window length.

Line [238] divides by self.stability_window, but variations are counted across adjacent pairs (count - 1 transitions). This systematically overstates Betti stability in convergence_score.

Suggested fix

 let variations = window[..count].windows(2).filter(|w| w[0] != w[1]).count();
-let betti_score = 1.0 - (variations as f64 / self.stability_window as f64);
+let transitions = count.saturating_sub(1).max(1);
+let betti_score = 1.0 - (variations as f64 / transitions as f64);
 score += BETTI_STABILITY_WEIGHT * betti_score;

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive-topology/src/convergence.rs` around lines 237 - 239, In the betti_score calculation within the convergence_score method, the variations count is being normalized by self.stability_window when it should be normalized by the actual number of transitions. Since variations are counted from window[..count].windows(2), which produces count - 1 adjacent pairs, change the denominator in the betti_score assignment from self.stability_window to (count - 1) to correctly normalize the variation ratio.
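
A short sketch shows why the denominator matters. The function name `betti_stability_score` and the `(u32, u32)` tuple standing in for a Betti signature are assumptions for illustration; only the normalization by adjacent pairs mirrors the suggestion above.

```rust
/// Score in [0, 1]: 1.0 when no adjacent pair of signatures differs,
/// 0.0 when every adjacent pair differs.
pub fn betti_stability_score(window: &[(u32, u32)]) -> f64 {
    let variations = window.windows(2).filter(|w| w[0] != w[1]).count();
    // A window of `count` samples has `count - 1` adjacent pairs; `.max(1)`
    // avoids dividing by zero when fewer than two samples are present.
    let transitions = window.len().saturating_sub(1).max(1);
    1.0 - (variations as f64 / transitions as f64)
}
```

For four samples that change at every step there are three variations over three transitions, so the score is 0. Dividing by the window length of four would instead report 0.25, crediting stability that was never observed.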

coderabbitai · 2026-06-18T06:53:10Z

+    pub fn update(&mut self, centroid: &[f64; D]) -> f64 {
+        let drift = if self.has_previous {
+            l2_distance(&self.expected, centroid)
+        } else {
+            0.0
+        };
+
+        if self.has_previous {
+            for (d, vel) in self.velocity.iter_mut().enumerate().take(D) {
+                *vel = centroid[d] - self.previous[d];
+            }
+        }
+
+        for (d, exp) in self.expected.iter_mut().enumerate().take(D) {
+            *exp = centroid[d] + self.velocity[d];
+        }
+
+        self.previous = *centroid;
+        self.has_previous = true;
+
+        drift


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard update against non-finite centroids to prevent persistent NaN state.

At Line [40], non-finite centroid values are accepted and then written into previous/expected/velocity. After that, drift and velocity calculations can stay NaN and effectively disable meaningful drift checks.

Suggested fix

     pub fn update(&mut self, centroid: &[f64; D]) -> f64 {
+        if centroid.iter().any(|coord| !coord.is_finite()) {
+            self.reset();
+            return f64::INFINITY;
+        }
+
         let drift = if self.has_previous {
             l2_distance(&self.expected, centroid)
         } else {
             0.0
         };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive-topology/src/drift.rs` around lines 40 - 60, Add a validation check at the beginning of the update method to ensure all values in the centroid array are finite (not NaN or Inf). If any non-finite value is detected, return early with a drift value of 0.0 to prevent non-finite values from being written into the internal state fields (previous, expected, velocity). This guards against NaN values persisting in the drift and velocity calculations, which would disable meaningful drift checks.
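
The guard can be exercised in isolation with the sketch below. It follows the reset-and-return-`f64::INFINITY` variant from the suggested fix; the struct layout, the `new` and `reset` methods, and the `l2` helper are assumptions reconstructed from the quoted hunk, not the actual contents of `drift.rs`.

```rust
pub struct DriftDetector<const D: usize> {
    previous: [f64; D],
    expected: [f64; D],
    velocity: [f64; D],
    has_previous: bool,
}

impl<const D: usize> DriftDetector<D> {
    pub fn new() -> Self {
        Self {
            previous: [0.0; D],
            expected: [0.0; D],
            velocity: [0.0; D],
            has_previous: false,
        }
    }

    /// Hypothetical reset: forget all history.
    pub fn reset(&mut self) {
        *self = Self::new();
    }

    pub fn update(&mut self, centroid: &[f64; D]) -> f64 {
        // Reject NaN or infinite input before it reaches any stored state.
        if centroid.iter().any(|coord| !coord.is_finite()) {
            self.reset();
            return f64::INFINITY;
        }
        let drift = if self.has_previous {
            l2(&self.expected, centroid)
        } else {
            0.0
        };
        if self.has_previous {
            for d in 0..D {
                self.velocity[d] = centroid[d] - self.previous[d];
            }
        }
        // Linear prediction of the next centroid from the latest velocity.
        for d in 0..D {
            self.expected[d] = centroid[d] + self.velocity[d];
        }
        self.previous = *centroid;
        self.has_previous = true;
        drift
    }
}

/// Euclidean distance between two fixed-size points.
fn l2<const D: usize>(a: &[f64; D], b: &[f64; D]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}
```

Returning infinity makes the bad sample trip any `drift >= threshold` check, while the reset keeps later finite samples from inheriting NaN state. Returning 0.0 instead would hide the bad sample from the caller, so the choice depends on how the drift value is consumed.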

coderabbitai · 2026-06-18T06:53:10Z

+    assert!(request.headers.get(AGENT_HINTS_HEADER_KEY).is_none());
+


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Assert body omission when the governor sheds hints.

Line 361 only checks header absence. If a regression injects nvext.agent_hints in the body but skips headers, this still passes. Add a negative assertion on the body path in the first request branch.

Proposed test hardening

         .unwrap();
     assert!(request.headers.get(AGENT_HINTS_HEADER_KEY).is_none());
+    assert!(
+        request
+            .content
+            .get("nvext")
+            .and_then(|nvext| nvext.get("agent_hints"))
+            .is_none()
+    );

As per coding guidelines, tests should cover behavior promised by the changed API surface.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive/tests/unit/adaptive_hints_intercept_tests.rs` around lines 361 - 362, The current assertion only verifies that the agent hints header is absent from the request, but it does not verify that the agent hints data is also absent from the request body. To harden the test, add a negative assertion that checks the request body does not contain the nvext.agent_hints field after the existing header check on line 361. This ensures that if a regression accidentally injects nvext.agent_hints in the body while skipping the header, the test will catch it and fail as expected.

Source: Coding guidelines

coderabbitai · 2026-06-18T06:53:10Z

+#[test]
+fn validate_config_reports_invalid_topology_numeric_fields() {
+    let report = validate_config(&AdaptiveConfig {
+        adaptive_hints: Some(AdaptiveHintsComponentConfig {
+            governor: Some(GovernorConfig {
+                enabled: true,
+                epsilon: f64::NAN,
+            }),
+            ..AdaptiveHintsComponentConfig::default()
+        }),
+        tool_parallelism: Some(ToolParallelismComponentConfig {
+            drift: Some(DriftConfig {
+                enabled: true,
+                threshold: 0.0,
+            }),
+            ..ToolParallelismComponentConfig::default()
+        }),
+        acg: Some(AcgComponentConfig {
+            convergence: Some(ConvergenceConfig {
+                enabled: true,
+                epsilon: -1.0,
+                stability_window: 2,
+            }),
+            ..AcgComponentConfig::default()
+        }),
+        convergence: Some(ConvergenceConfig {
+            enabled: true,
+            epsilon: f64::INFINITY,
+            stability_window: 0,
+        }),
+        policy: ConfigPolicy {
+            unsupported_value: UnsupportedBehavior::Error,
+            ..ConfigPolicy::default()
+        },
+        ..AdaptiveConfig::default()
+    });
+
+    assert!(report.has_errors());
+    for component in [
+        "adaptive_hints.governor",
+        "tool_parallelism.drift",
+        "acg.convergence",
+        "convergence",
+    ] {
+        assert!(
+            report
+                .diagnostics
+                .iter()
+                .any(|diag| diag.code == "adaptive.unsupported_value"
+                    && diag.component.as_deref() == Some(component)),
+            "expected unsupported value diagnostic for {component}"
+        );
+    }
+}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert field-level diagnostics for invalid topology numeric inputs.

This test can still pass if only one invalid field per component is validated. For acg.convergence and top-level convergence, assert both epsilon and stability_window diagnostics to prevent silent validator regressions.

Suggested test tightening

-    for component in [
-        "adaptive_hints.governor",
-        "tool_parallelism.drift",
-        "acg.convergence",
-        "convergence",
-    ] {
+    for (component, field) in [
+        ("adaptive_hints.governor", "epsilon"),
+        ("tool_parallelism.drift", "threshold"),
+        ("acg.convergence", "epsilon"),
+        ("acg.convergence", "stability_window"),
+        ("convergence", "epsilon"),
+        ("convergence", "stability_window"),
+    ] {
         assert!(
             report
                 .diagnostics
                 .iter()
                 .any(|diag| diag.code == "adaptive.unsupported_value"
-                    && diag.component.as_deref() == Some(component)),
-            "expected unsupported value diagnostic for {component}"
+                    && diag.component.as_deref() == Some(component)
+                    && diag.field.as_deref() == Some(field)),
+            "expected unsupported value diagnostic for {component}.{field}"
         );
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/adaptive/tests/unit/runtime_tests.rs` around lines 322 - 375, The test validate_config_reports_invalid_topology_numeric_fields should verify field-level diagnostics for all invalid fields within components, not just component-level diagnostics. Currently it checks for component paths like "adaptive_hints.governor" and "acg.convergence", but it should also assert that diagnostics exist for specific invalid fields within those components. For the acg.convergence and convergence components which each have multiple invalid fields (epsilon and stability_window), add additional assertions to the report.diagnostics iteration to verify that field-level diagnostics are reported for paths like "acg.convergence.epsilon", "acg.convergence.stability_window", "convergence.epsilon", and "convergence.stability_window" to ensure all invalid numeric fields are being validated and prevent silent validator regressions.
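
The gap between component-level and field-level checks can be illustrated with a small helper. The `Diagnostic` struct below is a hypothetical stand-in (in particular, the `field` member is assumed from the suggested diff, not confirmed against the validator), and `missing_field_diagnostics` is an illustrative name.

```rust
/// Hypothetical stand-in for a validation diagnostic.
pub struct Diagnostic {
    pub code: String,
    pub component: Option<String>,
    pub field: Option<String>,
}

/// Returns every expected (component, field) pair that has no matching
/// diagnostic with the given code.
pub fn missing_field_diagnostics<'a>(
    diagnostics: &[Diagnostic],
    code: &str,
    expected: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    expected
        .iter()
        .copied()
        .filter(|(component, field)| {
            !diagnostics.iter().any(|diag| {
                diag.code == code
                    && diag.component.as_deref() == Some(*component)
                    && diag.field.as_deref() == Some(*field)
            })
        })
        .collect()
}
```

A report holding only an `epsilon` diagnostic for `convergence` satisfies a component-level assertion, yet this helper still reports `("convergence", "stability_window")` as missing, which is the regression the tightened test is meant to catch.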

coderabbitai · 2026-06-18T06:53:10Z

 adaptive.tool_parallelism = Some(ToolParallelismComponentConfig::default());
 adaptive.adaptive_hints = Some(AdaptiveHintsComponentConfig {
    inject_body_path: "nvext.agent_hints".into(),
+    governor: Some(GovernorConfig::default()),
    ..AdaptiveHintsComponentConfig::default()
 });
 adaptive.acg = Some(AcgComponentConfig {
    provider: "passthrough".into(),
+    convergence: Some(ConvergenceConfig::default()),
    ..AcgComponentConfig::default()
 });
+adaptive.convergence = Some(ConvergenceConfig::default());



🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add tool_parallelism.drift to the Rust example for cross-language parity.

The TOML, Python, and Node examples all demonstrate drift config, but the Rust example leaves it implicit via ToolParallelismComponentConfig::default(). Showing the explicit field keeps usage guidance aligned across bindings.

Suggested docs update

 use nemo_relay_adaptive::{
     AcgComponentConfig,
     AdaptiveConfig,
     AdaptiveHintsComponentConfig,
     BackendSpec,
     ConvergenceConfig,
+    DriftConfig,
     GovernorConfig,
     StateConfig,
     TelemetryComponentConfig,
     ToolParallelismComponentConfig,
 };
@@
-adaptive.tool_parallelism = Some(ToolParallelismComponentConfig::default());
+adaptive.tool_parallelism = Some(ToolParallelismComponentConfig {
+    mode: "observe_only".into(),
+    drift: Some(DriftConfig::default()),
+    ..ToolParallelismComponentConfig::default()
+});

As per coding guidelines: docs/**/*.{md,mdx} should update embedded documentation snippets when examples or supported behavior change.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/adaptive-plugin/configuration.mdx` around lines 221 - 233, The Rust example for tool_parallelism configuration uses ToolParallelismComponentConfig::default() implicitly, while the TOML, Python, and Node examples all explicitly demonstrate the drift configuration. Update the line setting adaptive.tool_parallelism to explicitly define the drift field configuration instead of relying on the default() method, ensuring the Rust example maintains parity with the other language examples in the documentation.

Source: Coding guidelines

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Teerth Sharma <teerths57@gmail.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/adaptive-plugin/about.mdx`:
- Around line 39-40: The file docs/adaptive-plugin/about.mdx is missing the
required SPDX license header at the top. Add the SPDX header using JSX comment
delimiters {/* ... */} at the very beginning of the file before the YAML
frontmatter. The header should include the SPDX-FileCopyrightText line with the
copyright notice and the SPDX-License-Identifier line set to Apache-2.0, wrapped
in the JSX comment syntax as specified in the review comment.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 289cbccd-452e-4a63-9da2-9d593edfa1a4

📥 Commits

Reviewing files that changed from the base of the PR and between 24ed6f2 and 6fc3d5e.

📒 Files selected for processing (1)

docs/adaptive-plugin/about.mdx

📜 Review details

🧰 Additional context used

📓 Path-based instructions (12)

{docs/**,README.md,CONTRIBUTING.md}