feat: True Agentic RAG with ReAct Pattern Implementation

# True Agentic RAG with ReAct Pattern Implementation

## Executive Summary

**Epic**: #691 (Agentic AI Platform v2)
**Architecture Decision**: #245 (IBM MCP Context Forge + Custom Agent Layer)
**Related Issues**: #714, #697, #699, #698

### Current State
RAG Modulo has excellent foundations:
- ✅ **MCP Server (PR #715)** - RAG Modulo exposes itself as an MCP server
- ✅ **IBM Context Forge integration** - Can consume external MCP tools
- ✅ **Tool Executor Service (#714)** - Post-search tool invocation (470 lines)
- ✅ **Chain of Thought reasoning** - Structured reasoning capabilities

### Gap
Issue #714 implements **temporal tool selection** (user picks tools → tools run after search).

**True Agentic RAG** requires **LLM-driven tool orchestration** where the LLM decides what tools to call in a reasoning loop (ReAct pattern).

### Recommended Approach
Hybrid implementation using **LangGraph** for agent orchestration while leveraging existing RAG Modulo services, built on IBM MCP Context Forge foundation per #245.

---

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────┐
│                        Agentic RAG System                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │                    Agent Orchestrator                         │  │
│  │  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐   │  │
│  │  │ OBSERVE │───▶│  THINK  │───▶│   ACT   │───▶│ OBSERVE │   │  │
│  │  │  (Input) │    │ (Reason)│    │ (Execute)│    │ (Result)│   │  │
│  │  └─────────┘    └─────────┘    └─────────┘    └─────────┘   │  │
│  │       │              │              │              │         │  │
│  │       ▼              ▼              ▼              ▼         │  │
│  │  ┌─────────────────────────────────────────────────────┐    │  │
│  │  │              Agent State Manager                     │    │  │
│  │  │  • Conversation history  • Tool results              │    │  │
│  │  │  • Reasoning traces      • Task progress             │    │  │
│  │  └─────────────────────────────────────────────────────┘    │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                                │                                    │
│                                ▼                                    │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │                     Action Router                             │  │
│  │                                                               │  │
│  │   ┌────────────┐   ┌────────────┐   ┌────────────────────┐   │  │
│  │   │ RAG Search │   │  MCP Tools │   │  Internal Services  │   │  │
│  │   │  (existing)│   │(via Context│   │ (CoT, Synthesis)   │   │  │
│  │   │            │   │   Forge)   │   │                    │   │  │
│  │   └────────────┘   └────────────┘   └────────────────────┘   │  │
│  └──────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
```

---

## ReAct Pattern Loop

```
User Query
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│  OBSERVE: Parse query, gather context, check available tools │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│  THINK: Analyze what's needed, plan approach, select action  │
│         "I need to search the knowledge base first, then     │
│          use the chart-generator tool to visualize results"  │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│  ACT: Execute the selected action (search, tool call, etc.)  │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────────┐
│  OBSERVE: Examine results, update state                      │
└─────────────────────────────────────────────────────────────┘
    │
    ▼
┌───────────────────────────────────────────────┐
│  Is task complete?                            │
│    YES → Return final answer + artifacts      │
│    NO  → Loop back to THINK                   │
└───────────────────────────────────────────────┘
```

---

## Implementation Phases

### Phase 1: Agent Foundation
**Files to create:**
- [ ] `backend/rag_solution/agents/__init__.py`
- [ ] `backend/rag_solution/agents/state.py` - AgentState, AgentStep
- [ ] `backend/rag_solution/agents/tool_registry.py` - Dynamic tool discovery
- [ ] `backend/rag_solution/agents/react_agent.py` - Core ReAct implementation
- [ ] `backend/rag_solution/agents/prompts.py` - ReAct prompt templates

**Key Components:**

#### 1.1 Agent State Management
\`\`\`python
# backend/rag_solution/agents/state.py
from dataclasses import dataclass, field
from enum import Enum
from typing import Any
from uuid import UUID
import time

class AgentStepType(str, Enum):
    OBSERVE = "observe"
    THINK = "think"
    ACT = "act"
    FINAL = "final"

@dataclass
class AgentStep:
    step_type: AgentStepType
    content: str
    tool_calls: list[dict] | None = None
    tool_results: list[dict] | None = None
    reasoning: str | None = None
    timestamp: float = field(default_factory=time.time)

@dataclass
class AgentState:
    """Maintains state across ReAct iterations."""
    task_id: UUID
    user_id: UUID
    collection_id: UUID
    original_query: str
    steps: list[AgentStep] = field(default_factory=list)
    available_tools: list[dict] = field(default_factory=list)
    search_results: list[dict] = field(default_factory=list)
    artifacts: list[dict] = field(default_factory=list)
    iteration_count: int = 0
    max_iterations: int = 10
    is_complete: bool = False
    final_answer: str | None = None
\`\`\`

#### 1.2 Tool Registry with Dynamic Discovery
\`\`\`python
# backend/rag_solution/agents/tool_registry.py
class ToolRegistry:
    """Discovers and manages available tools from MCP Context Forge."""
    
    def __init__(self, mcp_client: MCPGatewayClient):
        self.mcp_client = mcp_client
        self._tools_cache: dict[str, ToolDefinition] = {}
    
    async def discover_tools(self, user_id: UUID) -> list[ToolDefinition]:
        """Fetch available tools from Context Forge."""
        tools = await self.mcp_client.list_available_tools()
        return [self._format_for_llm(t) for t in tools]
    
    def get_tool_schema_for_prompt(self) -> str:
        """Generate tool descriptions for ReAct prompts."""
        # Returns formatted string of available tools for LLM
\`\`\`

#### 1.3 ReAct Agent Core
\`\`\`python
# backend/rag_solution/agents/react_agent.py
class ReActAgent:
    """Implements ReAct pattern for agentic RAG."""
    
    async def run(self, query: str, state: AgentState) -> AgentState:
        """Execute ReAct loop until completion or max iterations."""
        while not state.is_complete and state.iteration_count < state.max_iterations:
            # OBSERVE: Gather current context
            observation = await self._observe(state)
            state.steps.append(AgentStep(AgentStepType.OBSERVE, observation))
            
            # THINK: Reason about next action
            thought = await self._think(state, observation)
            state.steps.append(AgentStep(AgentStepType.THINK, thought.reasoning))
            
            # Check if we should conclude
            if thought.action == "finish":
                state.is_complete = True
                state.final_answer = thought.final_answer
                break
            
            # ACT: Execute the decided action
            result = await self._act(state, thought)
            state.steps.append(AgentStep(
                AgentStepType.ACT, 
                f"Executed: {thought.action}",
                tool_results=[result]
            ))
            
            state.iteration_count += 1
        
        return state
\`\`\`

---

### Phase 2: Integration with Existing Services
**Files to create/modify:**
- [ ] `backend/rag_solution/agents/tools/rag_search_tool.py`
- [ ] `backend/rag_solution/agents/tools/mcp_tool_wrapper.py`
- [ ] `backend/rag_solution/services/agent_service.py`
- [ ] Modify `backend/rag_solution/services/tool_executor_service.py`
- [ ] Modify `backend/rag_solution/services/mcp_gateway_client.py`

#### 2.1 Search as a Tool
\`\`\`python
class RAGSearchTool:
    """Wraps existing SearchService as an agent tool."""
    name = "rag_search"
    description = "Search the knowledge base for relevant documents"
    
    async def execute(self, query: str, collection_id: UUID, user_id: UUID):
        search_input = SearchInput(
            question=query,
            collection_id=collection_id,
            user_id=user_id,
        )
        return await self.search_service.search(search_input)
\`\`\`

#### 2.2 Agent Service Layer
\`\`\`python
# backend/rag_solution/services/agent_service.py
class AgentService:
    """Service layer for agentic RAG operations."""
    
    async def execute_agentic_search(
        self,
        query: str,
        collection_id: UUID,
        user_id: UUID,
        mode: AgentMode = AgentMode.REACT,
    ) -> AgentSearchResponse:
        """Execute search with agent orchestration."""
        state = AgentState(
            task_id=uuid4(),
            user_id=user_id,
            collection_id=collection_id,
            original_query=query,
        )
        state.available_tools = await self.tool_registry.discover_tools(user_id)
        agent = ReActAgent(...)
        final_state = await agent.run(query, state)
        
        return AgentSearchResponse(
            answer=final_state.final_answer,
            reasoning_steps=[s.content for s in final_state.steps],
            artifacts=final_state.artifacts,
            tool_calls=[...],
        )
\`\`\`

---

### Phase 3: Enhanced Capabilities
**Files to create:**
- [ ] `backend/rag_solution/agents/planning_agent.py` - Multi-step planning
- [ ] `backend/rag_solution/agents/memory.py` - Agent memory layer
- [ ] `backend/rag_solution/agents/reflection.py` - Self-reflection

#### 3.1 Multi-Step Planning
\`\`\`python
class PlanningAgent(ReActAgent):
    """Agent that creates and follows a plan."""
    
    async def _create_plan(self, query: str, tools: list) -> Plan:
        """Create execution plan before acting."""
\`\`\`

#### 3.2 Memory Layer
\`\`\`python
class AgentMemory:
    """Persistent memory across agent sessions."""
    
    async def store_interaction(self, state: AgentState):
        """Store completed agent interaction in vector DB."""
        
    async def recall_similar(self, query: str, limit: int = 5):
        """Recall similar past interactions."""
\`\`\`

#### 3.3 Self-Reflection
\`\`\`python
async def _reflect(self, state: AgentState) -> str:
    """Agent reflects on progress and adjusts strategy."""
\`\`\`

---

### Phase 4: API and Frontend
**Files to create/modify:**
- [ ] `backend/rag_solution/router/agent_router.py`
- [ ] `backend/rag_solution/schemas/agent_schema.py`
- [ ] `frontend/src/components/search/AgentReasoningPanel.tsx`
- [ ] Modify `frontend/src/components/search/LightweightSearchInterface.tsx`

#### 4.1 New API Endpoints
\`\`\`python
@router.post("/agent/search", response_model=AgentSearchResponse)
async def agentic_search(request: AgentSearchRequest):
    """Execute agentic search with ReAct reasoning."""

@router.get("/agent/search/{task_id}/stream")
async def stream_agent_progress(task_id: UUID):
    """Stream agent reasoning steps in real-time (SSE/WebSocket)."""
\`\`\`

#### 4.2 Frontend Integration
\`\`\`typescript
interface AgentReasoningPanelProps {
  steps: AgentStep[];
  isRunning: boolean;
}

export const AgentReasoningPanel: React.FC<AgentReasoningPanelProps> = ({
  steps, isRunning,
}) => {
  return (
    <div className="agent-reasoning-panel">
      {steps.map((step, i) => <AgentStepCard key={i} step={step} />)}
      {isRunning && <LoadingIndicator />}
    </div>
  );
};
\`\`\`

---

## Relationship to Existing Issues

| Issue | Role in True Agentic RAG |
|-------|--------------------------|
| **#714** | Provides `ToolExecutorService` - reuse for ACT phase |
| **#697** | 3-stage hooks become specialized agents (pre-search, post-search) |
| **#698** | MCP Server foundation (DONE) - RAG tools exposed to external agents |
| **#699** | Frontend components for displaying agent artifacts |
| **#691** | Epic roadmap - this plan implements Phase 4 vision |
| **#245** | Architecture decision - use IBM MCP Context Forge |

---

## Technical Decisions

### 1. Framework Choice: LangGraph
- Best for state machine-based agent loops
- Native support for ReAct pattern
- Integrates well with existing LangChain tools
- Production-ready with streaming support

### 2. Tool Execution: Hybrid Approach
- Synchronous tools: Direct execution
- Async tools: Background with polling/callbacks
- Artifacts: Stored in MinIO, references in response

### 3. LLM for Reasoning
- Use existing provider system (WatsonX/OpenAI/Anthropic)
- Leverage CoT service for structured reasoning
- Function calling for tool selection

---

## Success Criteria (from #691)

- [ ] Agent can execute multi-step tasks (ReAct loop)
- [ ] MCP tools integrated as agent actions
- [ ] Reasoning traces visible to users
- [ ] Memory persists across sessions
- [ ] Self-reflection enables error recovery
- [ ] Streaming support for real-time progress
- [ ] All existing search functionality preserved

---

## Dependencies

### Python Packages (to add)
\`\`\`toml
langgraph = "^0.2"  # Agent state machine
langchain-core = "^0.3"  # Tool abstractions
\`\`\`

### Infrastructure
- Existing MCP Context Forge (docker-compose-infra.yml)
- Existing Milvus for memory storage
- Existing Redis for caching (optional)

Issue	Role in True Agentic RAG
#714	Provides `ToolExecutorService` - reuse for ACT phase
#697	3-stage hooks become specialized agents (pre-search, post-search)
#698	MCP Server foundation (DONE) - RAG tools exposed to external agents
#699	Frontend components for displaying agent artifacts
#691	Epic roadmap - this plan implements Phase 4 vision
#245	Architecture decision - use IBM MCP Context Forge

feat: True Agentic RAG with ReAct Pattern Implementation #722

Description

True Agentic RAG with ReAct Pattern Implementation

Executive Summary

Current State

Gap

Recommended Approach

Architecture Overview

ReAct Pattern Loop

Implementation Phases

Phase 1: Agent Foundation

1.1 Agent State Management

backend/rag_solution/agents/state.py

1.2 Tool Registry with Dynamic Discovery

backend/rag_solution/agents/tool_registry.py

1.3 ReAct Agent Core

backend/rag_solution/agents/react_agent.py

Phase 2: Integration with Existing Services

2.1 Search as a Tool

2.2 Agent Service Layer

backend/rag_solution/services/agent_service.py

Phase 3: Enhanced Capabilities

3.1 Multi-Step Planning

3.2 Memory Layer

3.3 Self-Reflection

Phase 4: API and Frontend

4.1 New API Endpoints

4.2 Frontend Integration

Relationship to Existing Issues

Technical Decisions

1. Framework Choice: LangGraph

2. Tool Execution: Hybrid Approach

3. LLM for Reasoning

Success Criteria (from #691)

Dependencies

Python Packages (to add)

Infrastructure

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions