An intelligent multi-agent AI system powered by LangChain + LangGraph + MCP (Model Context Protocol) with ReAct-style reasoning. It integrates five specialized tool servers (Math, Weather, Translation, Web Search, Gmail) with persistent FAISS memory and a FastAPI web interface.
- ReAct Loop (Reasoning + Acting): The LLM autonomously decides when to use tools through Think→Act→Observe cycles
- Native Tool Calling: Zero hardcoded routing - agent interprets tool schemas and chooses appropriate tools
- Multi-Step Task Handling: Automatically chains tool calls for complex queries (e.g., "search AI news and email results")
- Duplicate Call Prevention: Built-in loop detection prevents stuck tool calls
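The duplicate-call guard can be sketched in a few lines: before executing a tool call, derive a canonical key from the tool name and arguments, and skip the call if that exact key was already seen this turn. The function names below are illustrative, not the actual identifiers in `client.py`:

```python
import json

def make_call_key(tool_name: str, args: dict) -> str:
    # Canonical key: tool name plus JSON-serialized args with sorted keys,
    # so {"a": 1, "b": 2} and {"b": 2, "a": 1} map to the same key.
    return f"{tool_name}:{json.dumps(args, sort_keys=True)}"

def should_execute(tool_name: str, args: dict, seen: set) -> bool:
    # Returns False if this exact call was already made this turn,
    # which is how a stuck ReAct loop is detected and broken.
    key = make_call_key(tool_name, args)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Resetting the `seen` set at the start of each user turn keeps the guard from blocking legitimate repeat calls across turns.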
| Tool | Capabilities | Transport |
|---|---|---|
| Math Server | Symbolic math (derivatives, integrals, equation solving), calculator | stdio |
| Weather | Current weather and air quality via the Open-Meteo API | stdio |
| Translate | Multi-language translation via the Google Translate API | stdio |
| Web Search | Real-time web search powered by the Tavily API | stdio |
| Gmail | Send emails and read the inbox via the Gmail API | stdio |
- FAISS Vector Database: Stores conversation history as semantic embeddings
- Context Retrieval: Automatically fetches top 3 relevant past interactions
- Conversation Summarization: Summarizes long conversations every 6+ exchanges
- Cross-Session Memory: Maintains context between chat sessions
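Under the hood, the top-3 context lookup is a nearest-neighbor search over embedding vectors. A toy in-memory version of the idea, with plain Python lists standing in for the FAISS index and HuggingFace embeddings:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_context(query_vec, store, k=3):
    # store: list of (embedding, text) pairs for past interactions.
    # Returns the k most similar past texts, best match first.
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]
```

FAISS replaces the linear scan with an optimized index, but the retrieval contract is the same: embed the query, return the k closest stored interactions.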
- FastAPI Backend: RESTful API with CORS support
- Chat Endpoint: `/chat` processes user messages with full agent capabilities
- Clear History: `/clear` resets conversation memory
- Health Check: `/health` reports system status
- API Key Rotation: Supports 3 Groq API keys for rate limit handling
- Retry Logic: Automatic retry with exponential backoff for rate limits
- Error Handling: Graceful degradation with timeout protection (60s per tool)
- Truncation: Smart result truncation to prevent token overflow
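The retry behavior can be sketched as a small wrapper: retry on a rate-limit error with exponentially increasing delays, and re-raise once the attempts are exhausted. `RateLimitError` here is a stand-in for the actual exception raised by the Groq client, and the function names are illustrative:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the rate-limit error raised by the LLM client."""

def call_with_retry(fn, max_retries=3, base_delay=0.5):
    # Retries fn() on RateLimitError with exponential backoff:
    # base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))
```

Combined with key rotation, a rate-limited call can also switch to the next API key before each retry.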
```mermaid
graph TB
    User[User Input] --> Client[Client Interface<br/>Terminal or Web API]
    Client --> AgentLLM{ReAct Agent<br/>LLaMA-3.1-8B}
    AgentLLM -->|"Think: Need tool?"| Decision{Decision}
    Decision -->|No tool needed| DirectReply[Direct Response]
    Decision -->|Tool needed| ToolCall[Tool Execution]
    ToolCall --> MCPServers[MCP Tool Servers]
    MCPServers --> MathTool[Math Server<br/>sympy + scipy]
    MCPServers --> WeatherTool[Weather<br/>Open-Meteo API]
    MCPServers --> TranslateTool[Translate<br/>Google Translate]
    MCPServers --> SearchTool[Web Search<br/>Tavily API]
    MCPServers --> GmailTool[Gmail<br/>Gmail API]
    ToolCall -->|Result| Observe[Observe Result]
    Observe -->|Add to context| AgentLLM
    AgentLLM -->|Final answer| Response[Final Response]
    Response --> Memory[(FAISS Memory<br/>HuggingFace Embeddings)]
    DirectReply --> Memory
    Memory -->|Retrieve context| AgentLLM
    Response --> User
    DirectReply --> User

    subgraph "ReAct Loop (Max 6 iterations)"
        AgentLLM
        Decision
        ToolCall
        Observe
    end

    subgraph "FastAPI Server (Optional)"
        API[Web Interface<br/>Port 8080]
    end

    User -.->|HTTP POST| API
    API -.-> Client

    style AgentLLM fill:#ff9999
    style Memory fill:#ffcc99
    style MCPServers fill:#e6b3ff
    style Decision fill:#99ff99
```
- Input Processing: User sends query via CLI or API
- ReAct Loop: Agent enters a Think→Act→Observe cycle:
- Think: Decides if a tool is needed based on query
- Act: Calls appropriate tool(s) with generated arguments
- Observe: Processes tool output and decides next action
- Memory Integration: FAISS retrieves relevant past context automatically
- Response Generation: Agent synthesizes the final answer after at most 6 tool calls
- Memory Update: Conversation stored as embeddings for future retrieval
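The lifecycle above can be condensed into a minimal loop. Here `agent_step` and `run_tool` are stand-ins for the real LLM call and MCP tool execution, so this is a sketch of the control flow, not the actual `client.py` code:

```python
MAX_REACT_STEPS = 6

def react_turn(query, agent_step, run_tool):
    # agent_step(messages) -> ("final", text) or ("tool", name, args).
    # run_tool(name, args) -> tool output string.
    messages = [("user", query)]
    for _ in range(MAX_REACT_STEPS):
        decision = agent_step(messages)           # Think: tool needed?
        if decision[0] == "final":
            return decision[1]                    # no (more) tools needed
        _, name, args = decision
        observation = run_tool(name, args)        # Act: execute the tool
        messages.append(("tool", observation))    # Observe: feed result back
    return "Stopped after reaching the step limit."
```

The step cap bounds each turn: after six tool calls without a final answer, the loop exits rather than spinning indefinitely.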
```
agentic-ai-mcp/
├── client.py                # Main CLI chat interface (ReAct agent)
├── server.py                # FastAPI web server
├── intent_router.py         # LLM-based query understanding
├── executioner.py           # Tool execution pipeline
├── rule_based_verifier.py   # Response validation logic
│
├── Tool Servers (MCP)
│   ├── mathserver.py        # Math calculations (sympy, scipy)
│   ├── weather.py           # Weather & air quality
│   ├── translate.py         # Language translation
│   ├── websearch.py         # Tavily web search
│   └── gmail.py             # Gmail send/read
│
├── Utilities
│   ├── debug_script.py      # Test MCP server connectivity
│   ├── test_servers.py      # Validate individual servers
│   └── personalized_task.py # Custom workflow placeholder
│
├── Configuration
│   ├── .env                 # API keys (not in repo)
│   ├── requirment.txt       # Python dependencies
│   ├── pyproject.toml       # Project metadata
│   └── .gitignore
│
├── Memory Storage
│   └── faiss_index/
│       └── index.faiss      # Vector embeddings
│
├── Auth (not in repo)
│   ├── credentials.json     # Google OAuth credentials
│   └── token.json           # Gmail access token
│
└── README.md
```
- Python 3.13+
- Git
- API Keys:
- Groq API (for LLaMA models)
- Tavily API (for web search)
- Google Cloud (for Gmail - optional)
```bash
git clone https://github.com/sobhan2204/agentic-ai-mcp.git
cd agentic-ai-mcp
pip install -r requirment.txt
```

Create a `.env` file:
```bash
# Groq API Keys (get from https://console.groq.com)
GROQ_API_KEY_1=gsk_xxxxxxxxxxxxxxxxxxxxx
GROQ_API_KEY_2=gsk_xxxxxxxxxxxxxxxxxxxxx  # Optional, for rotation
GROQ_API_KEY_3=gsk_xxxxxxxxxxxxxxxxxxxxx  # Optional

# Tavily API (get from https://tavily.com)
TAVILY_API_KEY=tvly-xxxxxxxxxxxxxxxx
```

If using Gmail features:
- Create OAuth credentials in the Google Cloud Console
- Download them as `credentials.json`
- The first run opens a browser for authentication
- The token is saved as `token.json` for future use
```bash
python client.py
```

Example interaction:

```
You: what is 5 + 5
[Tool: solve_math({"query": "5 + 5"})]
Assistant: The result is 10.

You: translate "good morning" to french
[Tool: translate({"sentence": "good morning", "target_language": "french"})]
Assistant: "Good morning" in French is "bonjour".

You: search latest AI news and email to bob@test.com
[Tool: search_web({"query": "latest AI news"})]
[Tool: send_email({"recipient": "bob@test.com", "subject": "AI Update", "body": "..."})]
Assistant: I found recent AI news and sent it to bob@test.com.

You: clear    # Reset conversation memory
Conversation cleared.

You: exit
Bye bye!
```
Start the server:
```bash
python server.py
```

API Endpoints:
```bash
# Send a chat message
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "what is the weather in London?"}'

# Clear the conversation
curl -X POST http://localhost:8080/clear

# Health check
curl http://localhost:8080/health
```

Before running the main client, verify that all tools work:

```bash
python test_servers.py
```

Expected output:
```
Testing mathserver.py... ✅ WORKING
Testing weather.py... ✅ WORKING
Testing translate.py... ✅ WORKING
Testing websearch.py... ✅ WORKING
Testing gmail.py... ✅ WORKING
```
| Query | Tools Used | Result |
|---|---|---|
| "solve x^2 - 4 = 0" | `solve_math` | Solutions: x = -2, 2 |
| "weather in Tokyo" | `get_weather` | Current weather with temperature |
| "translate 'hello' to spanish" | `translate` | "hola" |
| "latest news on AI" | `search_web` | Top 3 news articles with sources |
| "send email to alice@test.com" | `send_email` | Email sent confirmation |
| "search AI news and email it to bob@test.com" | `search_web` → `send_email` | Multi-step: search, then email |
Edit in `client.py`:

```python
SUMMARIZE_AFTER = 6   # Summarize every N user-assistant exchanges
MAX_REACT_STEPS = 6   # Max tool calls per turn
TOOL_TIMEOUT = 60     # Seconds before a tool call times out
```

Edit in `client.py`:
```python
agent_llm = ChatGroq(
    model="llama-3.1-8b-instant",  # Fast for tool calling
    max_tokens=1024,
    temperature=0.0,               # Deterministic for reliability
)
```

The system automatically cycles through 3 Groq API keys to handle rate limits; configure them in `.env`.
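The rotation itself can be as simple as a round-robin iterator over whichever keys are configured. A minimal sketch, assuming the `GROQ_API_KEY_1..3` variable names from `.env`:

```python
import itertools
import os

def load_groq_keys():
    # Collect GROQ_API_KEY_1..3 from the environment, skipping unset ones,
    # so running with a single key still works.
    keys = [os.getenv(f"GROQ_API_KEY_{i}") for i in (1, 2, 3)]
    return [k for k in keys if k]

def key_rotator(keys):
    # Endless round-robin over the configured keys; call next() on it
    # whenever a rate limit is hit to switch to the next key.
    return itertools.cycle(keys)
```

Pairing `next(rotator)` with the retry-and-backoff logic spreads bursts of requests across all available keys.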
| Command | Action |
|---|---|
| `clear` | Reset FAISS memory and conversation history |
| `exit` / `quit` / `q` | Exit the chat |
| Any other text | Processed as a user query |
Contributions welcome! To add a new tool:

- Create `newtool.py` following the MCP FastMCP format
- Register it in the `MultiServerMCPClient` config in `client.py`
- Test with `test_servers.py`
- Update this README
| Component | Technology |
|---|---|
| Agent Framework | LangChain + LangGraph |
| LLM | Groq (LLaMA-3.1-8B-Instant) |
| Tool Protocol | MCP (FastMCP) |
| Vector DB | FAISS |
| Embeddings | HuggingFace (all-MiniLM-L6-v2) |
| Web Framework | FastAPI |
| Math Engine | SymPy + SciPy |
| Translation | deep-translator (Google Translate) |
| Search | Tavily API |
| Email | Gmail API (OAuth2) |
- LangChain for agent framework
- Groq for fast LLM inference
- Tavily for web search API
- FastMCP for tool server protocol
Made with ❤️ by Sobhan