test(agent-harness): restore TAURI-RUST-4 dedup seam test by sanil-23 · Pull Request #2665 · tinyhumansai/openhuman

sanil-23 · 2026-05-26T06:44:13Z

Summary

Restore the TAURI-RUST-4 regression test run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call and its CapturingProvider test helper in src/openhuman/agent/harness/tool_loop_tests.rs.
Both were inadvertently dropped when the tool-loop test module was rewritten in feat(agent): agentic coding runtime — gated OS capabilities (filesystem, shell, install) via deterministic permission tiers + chat approvals #2631; the dedup logic in tool_loop.rs is currently untested.

Problem

#2631 rewrote tool_loop_tests.rs and, in the process, removed the seam test that guards TAURI-RUST-4 along with the CapturingProvider helper it depends on. CapturingProvider records the tool-spec names that reach ChatRequest.tools; the surviving ScriptedProvider discards the request, so it cannot observe deduplication. The dedup logic itself still lives in tool_loop.rs, but with the test gone a future change could silently let duplicate tool names reach the provider — which some providers (Anthropic, OpenHuman cloud after the uniqueness rollout) reject with a 400.

Solution

Restore CapturingProvider (struct + Provider impl) and the #[tokio::test] verbatim from the pre-feat(agent): agentic coding runtime — gated OS capabilities (filesystem, shell, install) via deterministic permission tiers + chat approvals #2631 tree (0a9f7a0e~1).
The test drives run_tool_call_loop with a registry tool named echo plus an extra_tools collision also named echo (the exact shadowing pattern from the bug report) and asserts exactly one echo reaches ChatRequest.tools.
Confirmed compatible with current main: the 17-arg run_tool_call_loop signature is unchanged, so the restored test exercises live behavior. cargo test --lib …dedups_duplicate_tool_names… → 1 passed; 0 failed.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

Tests added or updated (happy path + at least one failure / edge case) — this PR is the restored regression test; it asserts the dedup happy path (collision collapses to one echo) against the live loop.
Diff coverage ≥ 80% — N/A: the only changed lines are test code, which executes itself when run (cargo test green); no production lines added.
Coverage matrix updated — N/A: test-only restore, no feature added/removed/renamed.
All affected feature IDs from the matrix are listed under ## Related — N/A: no feature change.
No new external network dependencies introduced — none; the test uses an in-memory fake provider.
Manual smoke checklist updated — N/A: test-only change, no release-cut surface touched.
Linked issue closed via Closes #NNN — N/A: tracked privately on my fork; intentionally not linked.

Impact

Runtime/platform impact: none — test-only; no production code changes.
Restores regression protection for the duplicate-tool-name dedup path (TAURI-RUST-4).

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/restore-tool-dedup-seam-test
Commit SHA: c5282d03

Validation Run

pnpm --filter openhuman-app format:check — N/A: no app/ TS changes. (cargo fmt --check on the changed file is clean.)
pnpm typecheck — N/A: no TypeScript changes.
Focused tests: cargo test --lib run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call → 1 passed; 0 failed.
Rust fmt/check (if changed): cargo fmt --check clean; lib crate compiles (test built + ran green).
Tauri fmt/check (if changed): N/A — no Tauri shell changes.

Validation Blocked

command: pnpm rust:check (pre-push hook)
error: Node/pnpm + vendored tauri-cef toolchain unavailable in this environment.
impact: Pushed with --no-verify; the Rust core change was verified directly via cargo test. CI re-runs the full gate.

Behavior Changes

Intended behavior change: none.
User-visible effect: none.

Parity Contract

Legacy behavior preserved: yes — restores previously-existing test coverage only.
Guard/fallback/dispatch parity checks: dedup guard in tool_loop.rs is unchanged; the test re-asserts it.

Duplicate / Superseded PR Handling

Duplicate PR(s): none
Canonical PR: this
Resolution: N/A

Summary by CodeRabbit

Release Notes

Tests
- Added integration test verifying that duplicate tool names are properly deduplicated during tool call processing, ensuring correct handling of overlapping tool definitions.

`run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call` and its `CapturingProvider` helper (which records the tool-spec names reaching the provider) were dropped when tool_loop_tests.rs was rewritten in tinyhumansai#2631. They guard TAURI-RUST-4: duplicate tool names (e.g. a synthesised delegation tool whose delegate_name shadows a same-named skill tool) must be deduplicated before the provider call, since some providers 400 on duplicate tool names. Restore both verbatim. The current 17-arg `run_tool_call_loop` signature and the dedup logic in tool_loop.rs are unchanged, so the test compiles and passes as a genuine regression guard. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-26T06:44:28Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3ee9d348-ddab-411a-acc7-21a52057e8c2

📥 Commits

Reviewing files that changed from the base of the PR and between e7e7e8a and c5282d0.

📒 Files selected for processing (1)

src/openhuman/agent/harness/tool_loop_tests.rs

📝 Walkthrough

Walkthrough

This PR adds test coverage for tool name deduplication in the tool call loop. A CapturingProvider test double records tool names from provider calls, and a new test verifies that when multiple registries contain tools with identical names, only one deduplicated name is sent to the provider.

Changes

Tool Deduplication Regression Test

Layer / File(s)	Summary
Tool deduplication test infrastructure and verification `src/openhuman/agent/harness/tool_loop_tests.rs`	`CapturingProvider` test double captures tool-spec names from each `ChatRequest.tools` invocation and returns scripted responses. New test `run_tool_call_loop_dedups_duplicate_tool_names_before_provider_call` runs the tool loop with `native_tools` enabled using a registry and extra tools both containing `EchoTool`, asserts exactly one provider `chat()` call occurs, and verifies only a single `"echo"` entry is sent (duplicates removed before the provider call).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#2446: The main PR's new regression test directly targets the same run_tool_call_loop/provider-call deduplication behavior introduced in #2446 by ensuring only one "echo" tool spec is sent in ChatRequest.tools.
tinyhumansai/openhuman#2485: Both PRs implement/verify deduplication of duplicate tool names/specs before sending them in Provider::chat() requests (main PR via run_tool_call_loop test, retrieved PR in subagent runner via filtered_specs dedup).

Suggested labels

agent, working

Poem

A test hops in, so keen and bright, 🐰
To catch the dupes before they flight,
One echo tool where two once hid,
The loop dedupes—hooray, it did!
Clean tools sent forth, the test confirmed aright!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: restoring a regression test for tool name deduplication in the agent harness, with specific reference to the TAURI-RUST-4 issue.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

M3gA-Mind

Straightforward test restoration. Everything looks correct.

CapturingProvider — records ChatRequest.tools names per chat() call. Uses parking_lot::Mutex consistently with the rest of the file (line 9). .lock() without unwrap() is correct for parking_lot. The guard.remove(0) call would panic if called more times than scripted responses exist, but the test is designed for exactly one chat() invocation (final-text response, no tool calls), so this is fine.

Test logic — sets up a registry + extra_tools collision with two EchoTool instances both named "echo", asserts exactly one "echo" in ChatRequest.tools. Directly exercises the dedup path and the right failure mode (duplicate names reaching the provider → 400 from Anthropic/cloud).

Signature check — the 17-arg run_tool_call_loop call matches what the other tests in this file use (e.g. lines 212, 264, 303), no drift introduced.

CI green, no conflicts. LGTM.

sanil-23 requested a review from a team May 26, 2026 06:44

coderabbitai Bot added agent Built-in agents, prompts, orchestration, and agent runtime in src/openhuman/agent/. working A PR that is being worked on by the team. labels May 26, 2026

coderabbitai Bot approved these changes May 26, 2026

View reviewed changes

M3gA-Mind approved these changes May 26, 2026

View reviewed changes

M3gA-Mind merged commit 1aee7de into tinyhumansai:main May 26, 2026
42 of 48 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(agent-harness): restore TAURI-RUST-4 dedup seam test#2665

test(agent-harness): restore TAURI-RUST-4 dedup seam test#2665
M3gA-Mind merged 1 commit into
tinyhumansai:mainfrom
sanil-23:fix/restore-tool-dedup-seam-test

sanil-23 commented May 26, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

M3gA-Mind left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sanil-23 commented May 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Submission Checklist

Impact

Related

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

Validation Run

Validation Blocked

Behavior Changes

Parity Contract

Duplicate / Superseded PR Handling

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

M3gA-Mind left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sanil-23 commented May 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading