Skip to content

feat(mcp): Add MCP Gateway integration for tool invocation and enrichment#671

Closed
manavgup wants to merge 1 commit intomainfrom
claude/implement-issue-653-01EuqiJs9sZZeUzRV85p8Nwj
Closed

feat(mcp): Add MCP Gateway integration for tool invocation and enrichment#671
manavgup wants to merge 1 commit intomainfrom
claude/implement-issue-653-01EuqiJs9sZZeUzRV85p8Nwj

Conversation

@manavgup
Copy link
Owner

Implements issue #653 - MCP Gateway Integration with simplified architecture following expert panel recommendations (Fowler, Newman, Nygard, Hohpe).

Implementation (~200 lines of core code)

Core Components

  • MCPGatewayClient: Resilient HTTP client with circuit breaker pattern

    • 5 failure threshold, 60s recovery timeout
    • Health checks with 5-second timeout
    • 30-second default timeout on all calls
    • Graceful degradation on failures
  • SearchResultEnricher: Content Enricher pattern implementation

    • Parallel tool invocation with concurrency limits
    • Error isolation (failures don't cascade)
    • Timeout handling with graceful degradation
  • mcp_router.py: Two API endpoints

    • POST /api/mcp/tools/invoke - Tool invocation
    • POST /api/mcp/enrich - Search result enrichment
    • GET /api/mcp/tools - List available tools
    • GET /api/mcp/health - Gateway health status

Configuration

  • Added MCP settings to Settings class (config.py)
  • Updated .env.example with MCP configuration
  • Added optional mcp-gateway service to docker-compose-infra.yml

Testing

  • Unit tests for circuit breaker pattern
  • Unit tests for enricher service
  • Unit tests for Pydantic schemas

Closes #653

…ment

Implements issue #653 - MCP Gateway Integration with simplified architecture
following expert panel recommendations (Fowler, Newman, Nygard, Hohpe).

## Implementation (~200 lines of core code)

### Core Components
- `MCPGatewayClient`: Resilient HTTP client with circuit breaker pattern
  - 5 failure threshold, 60s recovery timeout
  - Health checks with 5-second timeout
  - 30-second default timeout on all calls
  - Graceful degradation on failures

- `SearchResultEnricher`: Content Enricher pattern implementation
  - Parallel tool invocation with concurrency limits
  - Error isolation (failures don't cascade)
  - Timeout handling with graceful degradation

- `mcp_router.py`: Two API endpoints
  - POST /api/mcp/tools/invoke - Tool invocation
  - POST /api/mcp/enrich - Search result enrichment
  - GET /api/mcp/tools - List available tools
  - GET /api/mcp/health - Gateway health status

### Configuration
- Added MCP settings to Settings class (config.py)
- Updated .env.example with MCP configuration
- Added optional mcp-gateway service to docker-compose-infra.yml

### Testing
- Unit tests for circuit breaker pattern
- Unit tests for enricher service
- Unit tests for Pydantic schemas

Closes #653
@github-actions
Copy link
Contributor

🚀 Development Environment Options

This repository supports Dev Containers for a consistent development environment.

Option 1: GitHub Codespaces (Recommended)

Create a cloud-based development environment:

  1. Click the green Code button above
  2. Select the Codespaces tab
  3. Click Create codespace on claude/implement-issue-653-01EuqiJs9sZZeUzRV85p8Nwj
  4. Wait 2-3 minutes for environment setup
  5. Start coding with all tools pre-configured!

Option 2: VS Code Dev Containers (Local)

Use Dev Containers on your local machine:

  1. Install Docker Desktop
  2. Install VS Code
  3. Install the Dev Containers extension
  4. Clone this PR branch locally
  5. Open in VS Code and click "Reopen in Container" when prompted

Option 3: Traditional Local Setup

Set up the development environment manually:

# Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout claude/implement-issue-653-01EuqiJs9sZZeUzRV85p8Nwj

# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validate

Available Commands

Once in your development environment:

make help           # Show all available commands
make dev-validate   # Validate environment setup
make test-atomic    # Run atomic tests
make test-unit      # Run unit tests
make lint          # Run linting

Services Available

When running make dev-up:


This automated message helps reviewers quickly set up the development environment.

@github-actions
Copy link
Contributor

Code Review - PR #671: MCP Gateway Integration

Summary

This PR implements MCP Gateway integration following a simplified architecture pattern. The implementation is production-grade with excellent resilience patterns, comprehensive testing, and clean separation of concerns. Overall, this is very high quality work that demonstrates strong software engineering practices.

🎯 Strengths

1. Excellent Architecture Decisions

  • ✅ Follows Content Enricher pattern (Gregor Hohpe) correctly
  • ✅ Clean separation: client → enricher → router layers
  • ✅ Graceful degradation design ensures MCP failures never block core RAG flow
  • ✅ Proper dependency injection with FastAPI Depends
  • ✅ Singleton pattern for MCP client (lines mcp_router.py:38-62)

2. Production-Grade Resilience Patterns

  • Circuit Breaker: Implements 3-state pattern (CLOSED/OPEN/HALF_OPEN) correctly
    • 5 failure threshold, 60s recovery timeout (configurable)
    • Prevents cascading failures to downstream services
  • Timeouts: Multiple levels (health check: 5s, tool invocation: 30s, enrichment: 60s)
  • Parallel Execution: Uses asyncio.Semaphore for controlled concurrency (enricher.py:219)
  • Error Isolation: Tool failures don't cascade (enricher.py:229-236)

3. Comprehensive Testing

  • 395 test lines across 3 test files
  • ✅ Tests cover: circuit breaker, enricher, schemas, health checks
  • ✅ Uses pytest-asyncio correctly for async tests
  • ✅ Proper mocking with AsyncMock for async operations
  • ✅ Tests edge cases: timeouts, partial failures, circuit breaker states

4. Code Quality

  • ✅ Type hints throughout (Python 3.12+ union syntax)
  • ✅ Comprehensive docstrings following Google style
  • ✅ Structured logging with contextual data
  • ✅ Pydantic schemas for validation (extra="forbid" prevents accidents)
  • ✅ Clean dataclasses for internal models

5. Security & Configuration

  • ✅ All endpoints require authentication (get_current_user dependency)
  • ✅ All MCP settings configurable via environment variables
  • ✅ Sensible defaults with override capability
  • ✅ API key support for gateway authentication
  • ✅ Input validation (e.g., query max 2000 chars, timeout 1-300s)

🔍 Issues & Recommendations

CRITICAL: Resource Leak 🔴

Location: gateway_client.py:184-196, mcp_router.py:42-62

Issue: The HTTP client is never closed, creating a resource leak. The singleton pattern creates one client but never calls close().

Impact:

  • Memory leaks in long-running processes
  • Socket exhaustion under high load
  • Connection pool not properly cleaned up

Fix Required:

# In main.py lifespan context manager (around line 222)
from rag_solution.router.mcp_router import _mcp_client

async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
    # Existing startup code...
    yield
    # CLEANUP: Close MCP client on shutdown
    global _mcp_client
    if _mcp_client:
        await _mcp_client.close()
        _mcp_client = None

OR use FastAPI startup/shutdown events in mcp_router.py.


HIGH: Circuit Breaker Race Condition 🟠

Location: gateway_client.py:94-120

Issue: can_execute() has race condition in HALF_OPEN state. Multiple concurrent requests could all pass the HALF_OPEN check before one completes.

Current Code:

if self.state == CircuitState.OPEN:
    if elapsed >= self.recovery_timeout:
        self.state = CircuitState.HALF_OPEN  # ← Race here
        return True
# HALF_OPEN: Allow one test request
return True  # ← All requests pass!

Fix:

from threading import Lock

class CircuitBreaker:
    def __init__(self, ...):
        # ...
        self._lock = Lock()
        self._half_open_test_in_progress = False
    
    def can_execute(self) -> bool:
        with self._lock:
            if self.state == CircuitState.HALF_OPEN:
                if self._half_open_test_in_progress:
                    raise CircuitBreakerOpenError(0)
                self._half_open_test_in_progress = True
                return True
            # ... rest of logic
    
    def record_success(self) -> None:
        with self._lock:
            # ... existing logic
            self._half_open_test_in_progress = False

MEDIUM: Missing Configuration Validation 🟡

Location: config.py:107-125

Issue: No validation that MCP settings are mutually consistent when mcp_enabled=True.

Example Problem: User sets MCP_ENABLED=true but MCP_GATEWAY_URL="" → runtime errors.

Fix:

from pydantic import field_validator, model_validator

class Settings(BaseSettings):
    # ... existing fields ...
    
    @model_validator(mode='after')
    def validate_mcp_config(self) -> 'Settings':
        if self.mcp_enabled:
            if not self.mcp_gateway_url or self.mcp_gateway_url == "http://localhost:8080":
                raise ValueError(
                    "MCP_GATEWAY_URL must be configured when MCP_ENABLED=true"
                )
        return self

MEDIUM: Inconsistent Error Handling 🟡

Location: enricher.py:187-197

Issue: Generic Exception catch-all can mask programming errors (e.g., AttributeError, TypeError).

Current Code:

except Exception as e:
    # Catch-all to ensure enrichment never crashes the main flow
    error_msg = f"Enrichment failed: {e!s}"
    logger.error(error_msg, exc_info=True)
    result.errors.append(error_msg)

Recommendation: Be more specific or at least separate expected vs unexpected errors:

except TimeoutError:
    # ... existing timeout handling
except (httpx.RequestError, httpx.HTTPError) as e:
    # Expected network errors
    error_msg = f"Enrichment network error: {e!s}"
    logger.warning(error_msg)
    result.errors.append(error_msg)
except Exception as e:
    # Unexpected errors (programming bugs)
    error_msg = f"Enrichment unexpected error: {e!s}"
    logger.exception(error_msg)  # Full stack trace
    result.errors.append(error_msg)

LOW: Missing Integration Test 🟡

Test Coverage: Unit tests are excellent, but missing end-to-end integration test.

Recommendation: Add integration test that verifies:

# tests/integration/test_mcp_integration.py
@pytest.mark.integration
async def test_mcp_enrichment_flow(test_client, auth_headers):
    """Test full MCP enrichment flow from API to gateway."""
    # 1. Check health endpoint
    response = await test_client.get("/api/mcp/health")
    # 2. List tools
    response = await test_client.get("/api/mcp/tools", headers=auth_headers)
    # 3. Enrich results (requires mock MCP gateway)
    # ...

LOW: Documentation Gaps 📝

Missing:

  1. How to set up MCP Context Forge Gateway (README section needed)
  2. Example of calling enrichment from search flow
  3. Performance impact documentation (added latency, resource usage)
  4. MCP gateway API contract/schema expectations

📊 Performance Considerations

Current Design Impact:

  • Latency: +30-60s per enrichment request (configurable timeout)
  • Concurrency: Limited to 3 tools in parallel (configurable)
  • Memory: HTTP client keeps connection pool (~1-5 MB)

Recommendations:

  1. Make enrichment truly async: Don't block search response

    # In search flow, after returning answer:
    asyncio.create_task(enrich_and_store_artifacts(result))
    # Return answer immediately to user
  2. Add caching: Cache tool results for duplicate queries

    from functools import lru_cache
    from hashlib import sha256
    
    def cache_key(answer, documents, tool):
        return sha256(f"{answer}{tool}".encode()).hexdigest()
  3. Add metrics: Track tool invocation times, success rates

    from prometheus_client import Histogram, Counter
    
    mcp_duration = Histogram('mcp_tool_duration_seconds', ...)
    mcp_failures = Counter('mcp_tool_failures_total', ...)

🧪 Testing Checklist

Unit Tests: ✅ Excellent (395 lines, 40+ test cases)
Integration Tests: ⚠️ Missing
E2E Tests: ⚠️ Missing
Performance Tests: ⚠️ Missing
Security Tests: ⚠️ Missing (auth bypass attempts, injection tests)


🔐 Security Review

Strengths:

  • ✅ All endpoints authenticated
  • ✅ Input validation on all schemas
  • extra="forbid" prevents parameter pollution
  • ✅ Timeout limits prevent resource exhaustion

Concerns:

  1. SSRF Risk: Gateway URL from config, but not validated

    • User could set MCP_GATEWAY_URL=http://internal-service
    • Fix: Validate gateway URL against allowlist or require HTTPS
  2. Injection in tool arguments: No sanitization of arguments dict

    • If gateway doesn't validate, malicious JSON could be injected
    • Fix: Add JSON schema validation per tool
  3. No rate limiting: User could spam enrichment requests

    • Fix: Add rate limiting decorator to enrichment endpoint

📋 Action Items

Must Fix Before Merge 🔴:

  • Fix HTTP client resource leak (add cleanup in lifespan)
  • Fix circuit breaker race condition in HALF_OPEN state
  • Add MCP configuration validation (mcp_enabled=True checks)

Should Fix 🟡:

  • Improve exception handling specificity in enricher
  • Add integration test for MCP flow
  • Document MCP setup in README
  • Add SSRF protection for gateway URL

Nice to Have 🔵:

  • Add async background enrichment (don't block search)
  • Add caching for duplicate enrichment requests
  • Add Prometheus metrics for observability
  • Add rate limiting to enrichment endpoint

✅ Compliance with Repository Standards

CLAUDE.md Compliance:

  • ✅ Line length: 120 chars (correct)
  • ✅ Type hints: Present throughout
  • ✅ Docstrings: Google style, comprehensive
  • ✅ Logging: Uses core.logging_utils.get_logger
  • ✅ Async/await: Proper usage
  • ✅ Service architecture: Follows repository pattern
  • ✅ Testing: pytest with async support
  • ✅ Import order: Correct (first-party, third-party, stdlib)

Missing (from CLAUDE.md):


🎓 Learning Value

This PR demonstrates excellent software engineering practices:

  • ✅ Resilience patterns (circuit breaker, timeouts, retries)
  • ✅ Clean architecture (separation of concerns)
  • ✅ Comprehensive testing strategy
  • ✅ Production-ready error handling
  • ✅ Graceful degradation design

Estimated Review Time: ~90 minutes
Code Quality: 8.5/10
Test Coverage: 9/10
Documentation: 7/10
Production Readiness: 7.5/10 (after fixing resource leak)


🚀 Recommendation

Conditional Approval - Fix the 3 MUST FIX items (resource leak, race condition, config validation), then this is ready to merge. The architecture and implementation quality are excellent.

Great work on this integration! The simplified approach (200 lines vs 2000+) was the right choice.


Generated with Claude Code - Automated PR Review

manavgup added a commit that referenced this pull request Nov 27, 2025
Add comprehensive architecture documentation for the Agentic RAG Platform:

- agentic-ui-architecture.md: React component hierarchy, state management,
  and API integration for agent features
- backend-architecture-diagram.md: Overall backend architecture with
  Mermaid diagrams showing service layers and data flow
- mcp-integration-architecture.md: MCP client/server integration strategy,
  PR comparison (#671 vs #684), and Context Forge integration
- rag-modulo-mcp-server-architecture.md: Exposing RAG capabilities as MCP
  server with tools (rag_search, rag_ingest, etc.) and resources
- search-agent-hooks-architecture.md: 3-stage agent pipeline (pre-search,
  post-search, response) with database schema and execution flow
- system-architecture.md: Complete system architecture overview with
  technology stack and data flows

These documents guide implementation of:
- PR #695 (SPIFFE/SPIRE agent identity)
- PR #671 (MCP Gateway client)
- Issue #697 (Agent execution hooks)
- Issue #698 (MCP Server)
- Issue #699 (Agentic UI)

Closes #696

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
manavgup added a commit that referenced this pull request Nov 27, 2025
Add comprehensive architecture documentation for the Agentic RAG Platform:

- agentic-ui-architecture.md: React component hierarchy, state management,
  and API integration for agent features
- backend-architecture-diagram.md: Overall backend architecture with
  Mermaid diagrams showing service layers and data flow
- mcp-integration-architecture.md: MCP client/server integration strategy,
  PR comparison (#671 vs #684), and Context Forge integration
- rag-modulo-mcp-server-architecture.md: Exposing RAG capabilities as MCP
  server with tools (rag_search, rag_ingest, etc.) and resources
- search-agent-hooks-architecture.md: 3-stage agent pipeline (pre-search,
  post-search, response) with database schema and execution flow
- system-architecture.md: Complete system architecture overview with
  technology stack and data flows

These documents guide implementation of:
- PR #695 (SPIFFE/SPIRE agent identity)
- PR #671 (MCP Gateway client)
- Issue #697 (Agent execution hooks)
- Issue #698 (MCP Server)
- Issue #699 (Agentic UI)

Closes #696

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
manavgup added a commit that referenced this pull request Nov 27, 2025
Add comprehensive architecture documentation for the Agentic RAG Platform:

- agentic-ui-architecture.md: React component hierarchy, state management,
  and API integration for agent features
- backend-architecture-diagram.md: Overall backend architecture with
  Mermaid diagrams showing service layers and data flow
- mcp-integration-architecture.md: MCP client/server integration strategy,
  PR comparison (#671 vs #684), and Context Forge integration
- rag-modulo-mcp-server-architecture.md: Exposing RAG capabilities as MCP
  server with tools (rag_search, rag_ingest, etc.) and resources
- search-agent-hooks-architecture.md: 3-stage agent pipeline (pre-search,
  post-search, response) with database schema and execution flow
- system-architecture.md: Complete system architecture overview with
  technology stack and data flows

These documents guide implementation of:
- PR #695 (SPIFFE/SPIRE agent identity)
- PR #671 (MCP Gateway client)
- Issue #697 (Agent execution hooks)
- Issue #698 (MCP Server)
- Issue #699 (Agentic UI)

Closes #696

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
@manavgup
Copy link
Owner Author

Closing in favor of PR #684 which uses preferred file naming convention (mcp_gateway_client.py, search_result_enricher.py) and places files in services/ directory following existing architecture patterns. See architecture decision in docs/architecture/mcp-integration-architecture.md.

@manavgup manavgup closed this Nov 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MCP Gateway Integration: Expert Architecture Review & Implementation Plan

2 participants

Comments