Implement dual retrieval strategies for WHO handler #1
Open
This commit implements two retrieval strategies (agent-level and query-level) with comprehensive testing infrastructure and documentation.

## New Features

### Retrieval Strategies

- **Strategy 1 (Agent-Level)**: Direct retrieval of agent documents
  - One document per agent
  - Fast and simple (~500ms latency)
  - Default strategy for backward compatibility
- **Strategy 2 (Query-Level)**: Sample query retrieval with aggregation
  - Multiple query documents per agent
  - Better semantic matching (query → sample queries)
  - Higher precision and explainability
  - Includes matched queries in results

### Implementation

- Added `retrieval_strategy` parameter to WHO handler
- Implemented `_process_agent_strategy()` for Strategy 1
- Implemented `_process_query_strategy()` for Strategy 2
- Added `_aggregate_by_agent()` for query document aggregation
- Added `_rank_aggregated_agent()` for aggregated ranking

### API Updates

- REST endpoint supports `?strategy=agent|query` parameter
- MCP tool definition includes strategy parameter
- Both GET and POST requests supported

### Test Infrastructure

- Created comprehensive test suite (20 tests, all passing)
- Built test rig with mock backends
- Sample agent data (5 agents, 24 queries)
- Comparison script showing query strategy wins 10/13 queries

## Files Added

### Core Implementation

- `code/who_handler.py`: Both retrieval strategies
- `code/agent_finder.py`: Updated endpoints with strategy parameter
- `code/test_who_handler.py`: Comprehensive test suite (20 tests)

### Test Rig

- `test_rig/test_retrieval_strategies.py`: Comparison script
- `test_rig/mock_backends.py`: Mock search and LLM backends
- `test_rig/README.md`: Test rig documentation
- `test_data/sample_agents.json`: Sample agent data

### Documentation

- `DESIGN_DOC_RETRIEVAL_STRATEGIES.md`: Detailed design comparison
- `RETRIEVAL_STRATEGIES.md`: Usage guide
- `TEST_RIG_SUMMARY.md`: Test results summary
- `VERSION_CHECK.md`: Protocol version verification

### M365 Integration

- `M365_INTEGRATION_PLAN.md`: Integration plan
- `data/m365/SCHEMA_ANALYSIS.md`: TSV schema analysis
- `data/m365/EMBEDDING_ANALYSIS.md`: Embedding format details
- `data/m365/DOWNLOAD_CHECKLIST.md`: Data download guide
- `scripts/download_m365_data.py`: Data download helper

### Other

- `who_protocol.txt`: WHO protocol specification v0.1
- `.gitignore`: Ignore pycache, IDE files, and M365 data

## Test Results

All 20 unit tests pass:

- 4 protocol compliance tests
- 6 protocol type tests
- 2 filtering tests
- 2 error handling tests
- 2 ranking tests
- 1 caching test
- 4 retrieval strategy tests (NEW)

Test rig comparison (13 queries):

- Query Strategy: 10 wins (higher scores)
- Agent Strategy: 0 wins
- Ties: 3

## Breaking Changes

None - agent strategy is the default, maintaining backward compatibility.

## Usage

```bash
# Agent strategy (default)
curl "http://localhost:8080/who?query=help+me+write"

# Query strategy
curl "http://localhost:8080/who?query=help+me+write&strategy=query"
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Summary
This PR implements two retrieval strategies (agent-level and query-level) for the WHO handler, with comprehensive testing infrastructure and documentation.
Changes
🚀 New Features
Retrieval Strategy 1: Agent-Level (Default)
Retrieval Strategy 2: Query-Level (Recommended)
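The commit message describes the query-level strategy as retrieving several sample-query documents per agent and aggregating them, with the matched queries included in the results. The sketch below illustrates one way such an aggregation step could look. The names `QueryHit` and `aggregate_by_agent`, the `agent_id`/`sample_query`/`score` fields, and the choice of the best hit's score as the agent score are all assumptions made for illustration; this is not the actual `_aggregate_by_agent()` from `code/who_handler.py`.

```python
from collections import defaultdict
from typing import Dict, List, TypedDict


class QueryHit(TypedDict):
    """One retrieved sample-query document (hypothetical shape)."""
    agent_id: str       # assumed field: the agent that owns the sample query
    sample_query: str   # assumed field: the sample query text that matched
    score: float        # assumed field: retrieval similarity score


def aggregate_by_agent(hits: List[QueryHit]) -> List[dict]:
    """Group query-level hits by agent and return one result per agent."""
    grouped: Dict[str, List[QueryHit]] = defaultdict(list)
    for hit in hits:
        grouped[hit["agent_id"]].append(hit)

    results = []
    for agent_id, agent_hits in grouped.items():
        # Best-matching sample queries first.
        ordered = sorted(agent_hits, key=lambda h: h["score"], reverse=True)
        results.append({
            "agent_id": agent_id,
            # Assumption: the agent's score is its best hit's score.
            "score": ordered[0]["score"],
            "matched_queries": [h["sample_query"] for h in ordered],
        })

    # Highest-scoring agents first.
    results.sort(key=lambda r: r["score"], reverse=True)
    return results
```

Taking the maximum is only one possible aggregation rule; a mean or a top-k sum would reward agents with several matching sample queries rather than a single strong one.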
📝 API Updates
REST API:
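The commit message states that the REST endpoint accepts a `?strategy=agent|query` parameter and shows `curl` calls against `http://localhost:8080/who`. A small Python sketch of building the same URLs follows; the helper name `build_who_url` is hypothetical, and omitting the parameter for the default strategy is an assumption based on the first `curl` example.

```python
from urllib.parse import urlencode


def build_who_url(
    query: str,
    strategy: str = "agent",
    base_url: str = "http://localhost:8080/who",
) -> str:
    """Build a WHO request URL (hypothetical helper, not part of the PR)."""
    params = {"query": query}
    # Assumption: the default agent strategy needs no explicit parameter,
    # matching the first curl example in the commit message.
    if strategy != "agent":
        params["strategy"] = strategy
    # urlencode encodes spaces as '+', as in the curl examples.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_who_url("help me write"))
    print(build_who_url("help me write", strategy="query"))
```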
MCP Tool:
{ "query": {"text": "help me write"}, "meta": {"strategy": "query"} }
🧪 Test Results
Unit Tests: All 20 tests pass ✅
Test Rig Results: Query strategy wins 10/13 queries
Average scores when both find correct agent:
📊 Example Results
Files Changed
Core Implementation
- `code/who_handler.py` - Both retrieval strategies (+200 lines)
- `code/agent_finder.py` - API updates for strategy parameter
- `code/test_who_handler.py` - Comprehensive test suite (NEW)
Test Rig
- `test_rig/test_retrieval_strategies.py` - Comparison script (NEW)
- `test_rig/mock_backends.py` - Mock backends (NEW)
- `test_data/sample_agents.json` - Sample data (NEW)
Documentation
- `DESIGN_DOC_RETRIEVAL_STRATEGIES.md` - Design comparison (NEW)
- `RETRIEVAL_STRATEGIES.md` - Usage guide (NEW)
- `TEST_RIG_SUMMARY.md` - Test results (NEW)
- `M365_INTEGRATION_PLAN.md` - Integration plan (NEW)
Breaking Changes
❌ None - Agent strategy is the default, maintaining full backward compatibility.
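Backward compatibility here rests on the new `retrieval_strategy` parameter defaulting to the agent strategy, so existing callers that never pass it see no change. A minimal sketch of that dispatch pattern is shown below. The function names, the registry dict, and the stub return values are illustrative assumptions, not the handler code from this PR.

```python
from typing import Callable, Dict, List


def process_agent_strategy(query: str) -> List[dict]:
    """Stub standing in for agent-level retrieval (illustrative only)."""
    return [{"strategy": "agent", "query": query}]


def process_query_strategy(query: str) -> List[dict]:
    """Stub standing in for query-level retrieval (illustrative only)."""
    return [{"strategy": "query", "query": query}]


# Hypothetical registry mapping strategy names to their handlers.
STRATEGIES: Dict[str, Callable[[str], List[dict]]] = {
    "agent": process_agent_strategy,
    "query": process_query_strategy,
}


def handle_who(query: str, retrieval_strategy: str = "agent") -> List[dict]:
    """Dispatch to a strategy; callers that omit it get the agent strategy."""
    try:
        handler = STRATEGIES[retrieval_strategy]
    except KeyError:
        valid = ", ".join(sorted(STRATEGIES))
        raise ValueError(
            f"unknown strategy {retrieval_strategy!r}; expected one of: {valid}"
        ) from None
    return handler(query)
```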
Testing
Run the test suite:
Run the test rig:
Next Steps
Related Documentation
- `who_protocol.txt`
- `data/m365/SCHEMA_ANALYSIS.md`
- `data/m365/EMBEDDING_ANALYSIS.md`

🤖 Generated with Claude Code