Implement dual retrieval strategies for WHO handler by rvguha · Pull Request #1 · nlweb-ai/AgentFinder

rvguha · 2026-03-13T02:13:33Z

Summary

This PR implements two retrieval strategies (agent-level and query-level) for the WHO handler, with comprehensive testing infrastructure and documentation.

Changes

🚀 New Features

Retrieval Strategy 1: Agent-Level (Default)

One document per agent
Direct retrieval and ranking
Fast (~500ms latency)
Backward compatible

Retrieval Strategy 2: Query-Level (Recommended)

Multiple sample query documents per agent
Semantic query-to-query matching
Better precision and explainability
Shows matched queries in results
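
The query-level flow can be sketched as follows. This is a minimal illustration, not the PR's code: the `QueryHit` and `AgentResult` types and the `aggregate_by_agent` helper are hypothetical names, and scoring each agent by its best-matching sample query is an assumption, since the PR does not say how per-query scores are combined.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class QueryHit:
    agent_id: str       # agent that owns the matched sample query
    sample_query: str   # indexed sample query that was retrieved
    score: float        # retrieval similarity (higher is better)


@dataclass
class AgentResult:
    agent_id: str
    score: float
    matched_queries: list = field(default_factory=list)


def aggregate_by_agent(hits):
    """Group query-level hits per agent and rank agents by their best hit.

    Assumption: max-score aggregation. Other choices (mean, sum of top-k)
    would fit the same shape.
    """
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit.agent_id].append(hit)

    results = []
    for agent_id, agent_hits in grouped.items():
        # Best-matching sample query first, so it leads the explanation.
        agent_hits.sort(key=lambda h: h.score, reverse=True)
        results.append(AgentResult(
            agent_id=agent_id,
            score=agent_hits[0].score,
            matched_queries=[h.sample_query for h in agent_hits],
        ))

    results.sort(key=lambda r: r.score, reverse=True)
    return results
```

Keeping the matched sample queries on each result is what allows the response to show why an agent was selected.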

📝 API Updates

REST API:

# Agent strategy (default)
GET /who?query=help+me+write&strategy=agent

# Query strategy
GET /who?query=help+me+write&strategy=query

MCP Tool:

{
  "query": {"text": "help me write"},
  "meta": {"strategy": "query"}
}
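
For client code, the two request shapes above could be built like this. It is a sketch under assumptions: the helper names `build_who_url` and `build_mcp_arguments` are hypothetical, the base URL is the `localhost:8080` address used in the commit message's curl examples, and rejecting unknown strategy values on the client side is an illustrative choice rather than documented server behavior.

```python
from urllib.parse import urlencode

# The two strategy values named in this PR.
VALID_STRATEGIES = ("agent", "query")


def _check_strategy(strategy):
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")


def build_who_url(query, strategy="agent", base="http://localhost:8080/who"):
    """Return the GET URL for a WHO request (base URL is an assumption)."""
    _check_strategy(strategy)
    # urlencode uses quote_plus, so spaces become '+', as in the examples.
    return f"{base}?{urlencode({'query': query, 'strategy': strategy})}"


def build_mcp_arguments(query, strategy="agent"):
    """Return the MCP tool arguments in the shape shown in this PR."""
    _check_strategy(strategy)
    return {"query": {"text": query}, "meta": {"strategy": strategy}}
```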

🧪 Test Results

Unit Tests: All 20 tests pass ✅

4 protocol compliance tests
6 protocol type tests
2 filtering tests
2 error handling tests
2 ranking tests
1 caching test
4 NEW retrieval strategy tests

Test Rig Results: Query strategy wins 10/13 queries

Query Strategy: 10 wins (higher scores)
Agent Strategy: 0 wins
Ties: 3

Average scores when both find correct agent:

Query Strategy: 93.5
Agent Strategy: 84.5
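
A tally of this kind can be computed from per-query scores roughly as below. The sketch uses only the five example rows this PR lists under Example Results, so it covers 5 of the 13 queries and its averages are not expected to match the 93.5 and 84.5 figures above. The `winner` and `summarize` helpers are hypothetical and are not taken from `test_rig/test_retrieval_strategies.py`.

```python
from statistics import mean

# (query, agent-strategy score, query-strategy score) from Example Results.
EXAMPLE_ROWS = [
    ("help me improve my writing", 85, 100),
    ("translate text to Spanish", 80, 95),
    ("plan a trip to Japan", 85, 95),
    ("review my Python code", 90, 90),
    ("meal planning for the week", 80, 100),
]


def winner(agent_score, query_score):
    """Name the strategy with the higher score, or 'Tie' when equal."""
    if query_score > agent_score:
        return "Query"
    if agent_score > query_score:
        return "Agent"
    return "Tie"


def summarize(rows):
    """Return (win tally, mean agent score, mean query score)."""
    tally = {"Query": 0, "Agent": 0, "Tie": 0}
    for _, agent_score, query_score in rows:
        tally[winner(agent_score, query_score)] += 1
    agent_avg = mean(a for _, a, _ in rows)
    query_avg = mean(q for _, _, q in rows)
    return tally, agent_avg, query_avg
```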

📊 Example Results

| Query | Agent Strategy | Query Strategy | Winner |
|---|---|---|---|
| "help me improve my writing" | ✓ #1 (85) | ✓ #1 (100) | Query |
| "translate text to Spanish" | ✓ #1 (80) | ✓ #1 (95) | Query |
| "plan a trip to Japan" | ✓ #1 (85) | ✓ #1 (95) | Query |
| "review my Python code" | ✓ #1 (90) | ✓ #1 (90) | Tie |
| "meal planning for the week" | ✓ #1 (80) | ✓ #1 (100) | Query |

Files Changed

Core Implementation

code/who_handler.py - Both retrieval strategies (+200 lines)
code/agent_finder.py - API updates for strategy parameter
code/test_who_handler.py - Comprehensive test suite (NEW)

Test Rig

test_rig/test_retrieval_strategies.py - Comparison script (NEW)
test_rig/mock_backends.py - Mock backends (NEW)
test_data/sample_agents.json - Sample data (NEW)

Documentation

DESIGN_DOC_RETRIEVAL_STRATEGIES.md - Design comparison (NEW)
RETRIEVAL_STRATEGIES.md - Usage guide (NEW)
TEST_RIG_SUMMARY.md - Test results (NEW)
M365_INTEGRATION_PLAN.md - Integration plan (NEW)

Breaking Changes

❌ None - Agent strategy is the default, maintaining full backward compatibility.
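
The backward-compatibility argument rests on the strategy parameter defaulting to `agent`. A minimal sketch of that dispatch is shown below. `WhoHandlerSketch` is a hypothetical stand-in, not the class in `code/who_handler.py`; the method names mirror those listed in the commit message, but their bodies here are placeholders.

```python
class WhoHandlerSketch:
    """Hypothetical stand-in showing default-preserving strategy dispatch."""

    def __init__(self, retrieval_strategy="agent"):
        # Callers that never pass a strategy keep the agent-level behavior.
        if retrieval_strategy not in ("agent", "query"):
            raise ValueError(f"unknown strategy: {retrieval_strategy!r}")
        self.retrieval_strategy = retrieval_strategy

    def process(self, query):
        if self.retrieval_strategy == "query":
            return self._process_query_strategy(query)
        return self._process_agent_strategy(query)

    def _process_agent_strategy(self, query):
        # Placeholder: the real handler retrieves and ranks agent documents.
        return {"strategy": "agent", "query": query}

    def _process_query_strategy(self, query):
        # Placeholder: the real handler retrieves sample-query documents
        # and aggregates them per agent.
        return {"strategy": "query", "query": query, "matched_queries": []}
```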

Testing

Run the test suite:

python -m pytest code/test_who_handler.py -v

Run the test rig:

python test_rig/test_retrieval_strategies.py

Next Steps

Create corpus: Index M365 Apps data with sample queries
Real data testing: Test with M365 query sets
Optimization: Tune thresholds and aggregation parameters

Related Documentation

WHO Protocol Specification: who_protocol.txt
M365 Schema Analysis: data/m365/SCHEMA_ANALYSIS.md
M365 Embedding Analysis: data/m365/EMBEDDING_ANALYSIS.md

🤖 Generated with Claude Code

This commit implements two retrieval strategies (agent-level and query-level) with comprehensive testing infrastructure and documentation.

## New Features

### Retrieval Strategies

- **Strategy 1 (Agent-Level)**: Direct retrieval of agent documents
  - One document per agent
  - Fast and simple (~500ms latency)
  - Default strategy for backward compatibility
- **Strategy 2 (Query-Level)**: Sample query retrieval with aggregation
  - Multiple query documents per agent
  - Better semantic matching (query → sample queries)
  - Higher precision and explainability
  - Includes matched queries in results

### Implementation

- Added `retrieval_strategy` parameter to WHO handler
- Implemented `_process_agent_strategy()` for Strategy 1
- Implemented `_process_query_strategy()` for Strategy 2
- Added `_aggregate_by_agent()` for query document aggregation
- Added `_rank_aggregated_agent()` for aggregated ranking

### API Updates

- REST endpoint supports `?strategy=agent|query` parameter
- MCP tool definition includes strategy parameter
- Both GET and POST requests supported

### Test Infrastructure

- Created comprehensive test suite (20 tests, all passing)
- Built test rig with mock backends
- Sample agent data (5 agents, 24 queries)
- Comparison script showing query strategy wins 10/13 queries

## Files Added

### Core Implementation

- `code/who_handler.py`: Both retrieval strategies
- `code/agent_finder.py`: Updated endpoints with strategy parameter
- `code/test_who_handler.py`: Comprehensive test suite (20 tests)

### Test Rig

- `test_rig/test_retrieval_strategies.py`: Comparison script
- `test_rig/mock_backends.py`: Mock search and LLM backends
- `test_rig/README.md`: Test rig documentation
- `test_data/sample_agents.json`: Sample agent data

### Documentation

- `DESIGN_DOC_RETRIEVAL_STRATEGIES.md`: Detailed design comparison
- `RETRIEVAL_STRATEGIES.md`: Usage guide
- `TEST_RIG_SUMMARY.md`: Test results summary
- `VERSION_CHECK.md`: Protocol version verification

### M365 Integration

- `M365_INTEGRATION_PLAN.md`: Integration plan
- `data/m365/SCHEMA_ANALYSIS.md`: TSV schema analysis
- `data/m365/EMBEDDING_ANALYSIS.md`: Embedding format details
- `data/m365/DOWNLOAD_CHECKLIST.md`: Data download guide
- `scripts/download_m365_data.py`: Data download helper

### Other

- `who_protocol.txt`: WHO protocol specification v0.1
- `.gitignore`: Ignore pycache, IDE files, and M365 data

## Test Results

All 20 unit tests pass:

- 4 protocol compliance tests
- 6 protocol type tests
- 2 filtering tests
- 2 error handling tests
- 2 ranking tests
- 1 caching test
- 4 retrieval strategy tests (NEW)

Test rig comparison (13 queries):

- Query Strategy: 10 wins (higher scores)
- Agent Strategy: 0 wins
- Ties: 3

## Breaking Changes

None - agent strategy is the default, maintaining backward compatibility.

## Usage

```bash
# Agent strategy (default)
curl "http://localhost:8080/who?query=help+me+write"

# Query strategy
curl "http://localhost:8080/who?query=help+me+write&strategy=query"
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

rvguha wants to merge 1 commit into main from feature/dual-retrieval-strategies
