fix(agent-harness): dedup visible tool specs in all provider-call paths#2446
Conversation
…e level and re-export for sibling modules - Updated the visibility of from to to allow access within the crate. - Re-exported in for use in sibling harness modules, ensuring a shared implementation across provider call sites.
…ider - Introduced a filtering step to deduplicate tool specifications by name, addressing potential collisions between registry tools and per-turn synthesized extra tools. - Updated the tool specification collection process to ensure only visible tools are included, improving compatibility with providers that enforce uniqueness on tool names.
…cation - Added a logging mechanism to warn when duplicate tool specifications are dropped before making a provider call. - Enhanced the deduplication process to ensure that only unique tool specs are sent, addressing potential issues with providers that enforce name uniqueness.
…ool names - Implemented a new test to ensure that duplicate tool names are correctly deduplicated before being sent to the provider, addressing the issue where providers reject requests with non-unique tool names. - Introduced a to record tool specifications and validate that only unique names are passed during the call.
|
Caution Review failedFailed to post review comments 📝 WalkthroughWalkthroughWiden and re-export dedup_visible_tool_specs, then apply it where harness code assembles visible ToolSpecs so duplicate tool names are removed (first occurrence kept) before provider calls; add a regression test capturing provider-observed tool names. ChangesTool spec deduplication across harness provider calls
German i18n additions
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/agent/harness/subagent_runner/ops.rs`:
- Around line 841-855: The dedup currently keeps the first occurrence in
filtered_specs which preserves parent tool specs over extra_tools, but execution
prefers extra_tools; update the code so dedup_visible_tool_specs is given the
tool list in execution-precedence order (extra_tools before parent_tools) or
change dedup_visible_tool_specs to prefer later entries (i.e., keep the last
occurrence) so that extra_tools win on duplicate names; locate the use of
filtered_specs and the call to
crate::openhuman::agent::harness::session::dedup_visible_tool_specs and ensure
the order or dedup logic matches the runtime execution order referenced by
extra_tools and parent_tools.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9358788c-1649-46da-9927-c628b74fe055
📒 Files selected for processing (5)
src/openhuman/agent/harness/session/builder.rssrc/openhuman/agent/harness/session/mod.rssrc/openhuman/agent/harness/subagent_runner/ops.rssrc/openhuman/agent/harness/tool_loop.rssrc/openhuman/agent/harness/tool_loop_tests.rs
…execution precedence - Adjusted the order of tool specifications to prioritize dynamic tools before parent specs, ensuring consistency between the schema seen by the model and the tools that execute. - Enhanced the deduplication process to maintain the execution order, addressing potential issues with duplicate tool names during provider calls.
graycyrus
left a comment
There was a problem hiding this comment.
Clean fix for TAURI-RUST-4. Spec-build order matches dispatch order in both paths (tool_loop chains registry→extra, subagent_runner chains dynamic→parent), so dedup's first-occurrence-wins is correct. Shared helper avoids code duplication, regression test covers the collision scenario well. One minor nit inline.
| File | Change |
|---|---|
session/builder.rs |
pub(super) → pub(crate) visibility bump |
session/mod.rs |
Re-export dedup_visible_tool_specs for sibling modules |
tool_loop.rs |
Dedup call before provider chat() |
subagent_runner/ops.rs |
Reorder specs to match execution precedence + dedup + warn |
tool_loop_tests.rs |
CapturingProvider regression test |
…before-provider-call
Summary
ChatRequest.toolsby name in the two remaining provider-call paths (run_tool_call_loopandsubagent_runner::run_inner_loop); theAgent::turn()path was already deduped viarebuild_tool_policy_session()in Prioritize fully local speech and Composer operation #1710.dedup_visible_tool_specsfrompub(super)topub(crate)and re-export it fromagent::harness::sessionso all three call sites share one implementation and one set of tests.run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call) using aCapturingProviderthat recordsChatRequest.toolsnames and asserts duplicates are dropped before the provider sees them.tracing::warn!) insubagent_runnerwhen duplicates are dropped, so future collisions are observable in production logs without re-triggering the Sentry 400.Problem
Several upstream chat providers (notably OpenHuman's native
chat-v1route) reject requests whosetoolsarray contains two entries with the samename, responding with HTTP 400"Tool names must be unique". Sentry issue TAURI-RUST-4 captured 2,011 occurrences of this 400 from production. PR #1710 fixed theAgent::turn()path, but two adjacent paths were left exposed:agent::harness::tool_loop::run_tool_call_loop— used byagent/bus.rs, channels, and CLI flows. Collisions occur when the same tool name appears in bothtools_registryand the per-turnextra_toolslist (e.g. dynamically synthesized variants).agent::harness::subagent_runner::ops::run_inner_loop— used by typed sub-agents. Collisions occur when a parent-registry tool name shadows a dynamically-attached Composio action.Either path could still emit
ChatRequest.toolswith duplicate names, surface the 400, and abort the turn.Solution
dedup_visible_tool_specshelper (already covered by 4 unit tests insession::builder::dedup_tests) topub(crate)and re-export it viasession::modso sibling modules can reuse it without duplication.filter+collectinrun_tool_call_loopwith adedup_visible_tool_specs(...)call. First-occurrence-wins semantics match the runtime tool-dispatch order (tools_registry.iter().chain(extra_tools.iter())), so the spec the model sees and the tool that actually executes stay aligned.run_inner_loop, rundedup_visible_tool_specsoverfiltered_specsand emit atracing::warn!when the length shrinks, includingagent_idfor correlation.allowed_namesis built before dedup and is unaffected — every dropped entry's name is still present from its kept sibling.Vec<ToolSpec>allocation per provider call (rather thanretain-in-place), in exchange for a single shared helper with one set of tests. The allocation is negligible compared to the network call it precedes.Submission Checklist
run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call(failure path: pre-fix would have sent duplicates and 400'd); existingdedup_tests(4 cases) now cover all three call sites via the shared helper.diff-cover) meet the gate enforced by.github/workflows/coverage.yml. Runpnpm test:coverageandpnpm test:rustlocally; PRs below 80% on changed lines will not merge.N/A: behaviour-preserving fix; no new feature rows in [docs/TEST-COVERAGE-MATRIX.md](../docs/TEST-COVERAGE-MATRIX.md).## Related—N/A: no matrix feature IDs touched.CapturingProvider.docs/RELEASE-MANUAL-SMOKE.md) —N/A: no user-visible surface change; behaviour is the absence of an HTTP 400.Impact
openhuman-coreCLI binary, since both share the harness. No frontend/Tauri-shell changes.Vec<ToolSpec>allocation +HashSet<String>walk perchat()call. O(n) over the per-turn tool count (typically < 50). Dwarfed by the provider round-trip it precedes.dedup_visible_tool_specsvisibility widens (pub(super)→pub(crate)) — strictly internal, does not affect the crate's public API or downstream consumers (Tauri shell verified viacargo check --manifest-path app/src-tauri/Cargo.toml).Related
Agent::turn()viarebuild_tool_policy_session)Summary by CodeRabbit