A sophisticated Multi-Agent AI System for automated Know Your Customer (KYC) verification, built with LangGraph state machines, Google Gemini LLM, and a tiered memory architecture. Designed for insurance and financial services applications.
- Overview
- Features
- Architecture
- Tech Stack
- Project Structure
- Prerequisites
- Installation
- Configuration
- Running the Application
- API Reference
- Usage Examples
- Memory System
- Troubleshooting
- Contributing
- License
The Multi-Agent AI KYC System automates the customer identity verification process for TATA AIA Life Insurance. It uses a hierarchical multi-agent architecture where:
- A Main Orchestrator routes user intents to appropriate workflows
- Specialist Agents handle specific document verifications (Aadhaar, PAN, Passport, DL, Form 60)
- LangGraph State Machines manage complex, interruptible verification workflows
- A Tiered Memory System maintains conversation context across sessions
The system supports both Web Interface (Flask + WebSocket) and CLI modes.
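The routing step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: a keyword matcher stands in for the Gemini call, a dataclass stands in for the Pydantic structured output, and the names `Intent`, `IntentResult`, `classify_intent`, and `ROUTES` are assumptions introduced for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Intent(str, Enum):
    KYC = "kyc"
    GENERAL_QUERY = "general_query"
    PAN_CHECK = "pan_check"


@dataclass
class IntentResult:
    """Stand-in for the structured output the LLM would return."""
    intent: Intent
    confidence: float


def classify_intent(message: str) -> IntentResult:
    """Hypothetical keyword classifier used here in place of an LLM call."""
    text = message.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    if "form 60" in text:
        return IntentResult(Intent.PAN_CHECK, 0.9)
    if words & {"kyc", "aadhaar", "pan", "passport", "verify"}:
        return IntentResult(Intent.KYC, 0.9)
    return IntentResult(Intent.GENERAL_QUERY, 0.5)


# Routing table: intent -> name of the agent that handles it (names assumed)
ROUTES = {
    Intent.KYC: "kyc_manager_agent",
    Intent.GENERAL_QUERY: "general_query_agent",
    Intent.PAN_CHECK: "pan_check_agent",
}


def route(message: str) -> str:
    """Return the name of the agent that should handle the message."""
    return ROUTES[classify_intent(message).intent]
```

Keeping the routing table separate from the classifier means a new workflow only needs a new enum member and one table entry.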
- Main Orchestrator: Intent recognition and routing using structured LLM outputs
- KYC Manager Agent: Delegates tasks to specialist agents
- Specialist Agents: Aadhaar, PAN, Passport, DL, and Form 60 verification
- General Query Agent: Handles insurance-related questions
- PAN Check Agent: Determines if user needs PAN or Form 60
- Aadhaar Verification: eKYC via OTP or document upload with OCR
- PAN Card Verification: NSDL database validation + OCR support
- Passport Verification: Document extraction and validation
- Driving License: OCR-based verification
- Form 60: For users without PAN card
- L1 Working Memory: Redis-based short-term conversation buffer
- L2 Episodic Memory: LLM-generated summaries stored in Redis
- L3 Semantic Memory: Mem0-powered long-term user preferences
- Real-time WebSocket communication
- Interactive FAQ sidebar
- Live process status tracking
- Responsive design for mobile/desktop
- Typing indicators and smooth animations
- LangSmith integration for tracing and debugging
- Structured outputs with Pydantic models
- Comprehensive error handling with retry mechanisms
- Modular, extensible architecture
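Before a document number is sent to a database lookup, a cheap format check can reject obvious typos. The sketch below assumes the standard PAN layout (five letters, four digits, one letter) and the 12-digit Aadhaar length mentioned in this README; the function names are hypothetical, and the check covers shape only, not checksums or database existence.

```python
import re

# PAN: five letters, four digits, one letter (for example ABCDE1234F)
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
# Aadhaar: exactly 12 digits
AADHAAR_PATTERN = re.compile(r"[0-9]{12}")


def normalize(value: str) -> str:
    """Drop spaces and hyphens and upper-case the input."""
    return re.sub(r"[\s-]", "", value).upper()


def is_valid_pan_format(pan: str) -> bool:
    return PAN_PATTERN.fullmatch(normalize(pan)) is not None


def is_valid_aadhaar_format(aadhaar: str) -> bool:
    return AADHAAR_PATTERN.fullmatch(normalize(aadhaar)) is not None
```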
┌─────────────────────────────────────────────────────────────────┐
│                         User Interface                          │
│                  (Web App / CLI / API Client)                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Main Orchestrator                        │
│  • Intent Recognition (LLM + Pydantic Structured Output)        │
│  • Workflow Routing                                             │
│  • Memory Context Integration                                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
              ┌───────────────┼───────────────┐
              ▼               ▼               ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│   KYC Manager   │ │  General Query  │ │    PAN Check    │
│      Agent      │ │      Agent      │ │      Agent      │
└────────┬────────┘ └─────────────────┘ └─────────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Specialist Agents                        │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐ │
│ │ Aadhaar  │ │   PAN    │ │ Passport │ │    DL    │ │ Form 60 │ │
│ │  Agent   │ │  Agent   │ │  Agent   │ │  Agent   │ │  Agent  │ │
│ │(LangGraph│ │(LangGraph│ │(LangGraph│ │(LangGraph│ │         │ │
│ │   FSM)   │ │   FSM)   │ │   FSM)   │ │   FSM)   │ │         │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                        External Services                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐           │
│  │   UIDAI DB   │  │   NSDL DB    │  │   Document   │           │
│  │  (Aadhaar)   │  │    (PAN)     │  │ Intelligence │           │
│  └──────────────┘  └──────────────┘  └──────────────┘           │
└─────────────────────────────────────────────────────────────────┘
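The specialist agents in the diagram run as state machines that can pause between user messages. The sketch below shows that idea in plain Python rather than LangGraph, so it is only an analogy: the state keys mirror the fields returned by the session status endpoint, the step names `awaiting_verification_method_selection` and `awaiting_otp` come from the API examples, and the remaining step names and both functions are assumptions.

```python
from typing import List, Optional, TypedDict


class KycState(TypedDict):
    active_workflow: Optional[str]
    kyc_step: Optional[str]
    completed_workflows: List[str]


# Ordered steps of a hypothetical Aadhaar eKYC flow
AADHAAR_STEPS = [
    "awaiting_verification_method_selection",
    "awaiting_aadhaar_number",   # assumed step name
    "awaiting_otp",
    "awaiting_confirmation",     # assumed step name
]


def start_aadhaar(state: KycState) -> KycState:
    """Enter the Aadhaar workflow at its first step."""
    return {**state, "active_workflow": "aadhaar", "kyc_step": AADHAAR_STEPS[0]}


def advance(state: KycState) -> KycState:
    """Move to the next step; finishing the last step completes the workflow."""
    if state["kyc_step"] not in AADHAAR_STEPS:
        raise ValueError("no Aadhaar step is active")
    index = AADHAAR_STEPS.index(state["kyc_step"])
    if index + 1 < len(AADHAAR_STEPS):
        return {**state, "kyc_step": AADHAAR_STEPS[index + 1]}
    return {
        "active_workflow": None,
        "kyc_step": None,
        "completed_workflows": state["completed_workflows"] + ["aadhaar"],
    }
```

Because the state is plain data, it can be stored between messages and the workflow resumed from the recorded step, which is what makes the flow interruptible.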
| Component | Technology |
|---|---|
| LLM | Google Gemini 2.5 Flash/Pro |
| Agent Framework | LangGraph (State Machines) |
| Observability | LangSmith |
| Memory - L1 | Redis |
| Memory - L2 | Redis (LLM-generated summaries) |
| Memory - L3 | Mem0 |
| Web Framework | Flask + Flask-SocketIO |
| API Framework | FastAPI |
| OCR | Azure Document Intelligence |
| Data Validation | Pydantic |
| Language | Python 3.10+ |
Multi-Agent-AI-KYC-System/
│
├── agent/                        # Agent implementations
│   ├── __init__.py
│   ├── base_agent.py             # Base class for all agents
│   ├── aadhar_agent.py           # Aadhaar verification (LangGraph FSM)
│   ├── pan_agent.py              # PAN verification (LangGraph FSM)
│   ├── passport_agent.py         # Passport verification
│   ├── dl_agent.py               # Driving License verification
│   ├── form60_agent.py           # Form 60 processing
│   ├── kyc_agent.py              # KYC Manager Agent
│   ├── genral_query_agent.py     # General insurance queries
│   └── pan_check_agent.py        # PAN existence checker
│
├── api/                          # External API integrations
│   ├── __init__.py
│   └── ocr_api.py                # Azure Document Intelligence
│
├── app/                          # FastAPI application
│   ├── __init__.py
│   ├── main.py                   # FastAPI app definition
│   ├── models.py                 # API request/response models
│   ├── dependencies.py           # Dependency injection
│   └── routers/
│       └── chat.py               # Chat API endpoints
│
├── config/                       # Configuration management
│   ├── __init__.py
│   ├── config.py                 # Pydantic settings
│   └── settings.py               # Environment settings
│
├── data/                         # Mock databases
│   ├── database_nsdl.csv         # PAN verification data
│   └── database_uidai.csv        # Aadhaar verification data
│
├── memory/                       # Memory management
│   ├── __init__.py
│   └── memory.py                 # Tiered memory system
│
├── models/                       # Pydantic models
│   ├── __init__.py
│   └── intent.py                 # Intent classification models
│
├── orchestrator/                 # Main orchestration logic
│   ├── __init__.py
│   └── router.py                 # Main Orchestrator
│
├── prompts/                      # LLM prompt templates
│   ├── __init__.py
│   ├── aadhar_prompts.py
│   ├── pan_prompts.py
│   ├── form60_prompts.py
│   ├── greeting_prompts.py
│   ├── orchestrate.py
│   ├── prompts.py
│   └── prompts_short.py
│
├── static/                       # Web static files
│   ├── css/
│   │   └── style.css
│   └── js/
│       ├── script.js
│       └── sw.js
│
├── templates/                    # Jinja2 templates
│   └── index.html                # Main web interface
│
├── tools/                        # Verification tools
│   ├── __init__.py
│   ├── aadhar_tools.py           # Aadhaar validation tools
│   ├── pan_tools.py              # PAN validation tools
│   ├── ocr_tool.py               # OCR utilities
│   ├── ocr_pan_tool.py           # PAN OCR processor
│   └── tools_definition.py       # Tool definitions
│
├── .env                          # Environment variables (create this)
├── app.py                        # Flask web application
├── llm.py                        # LLM client factory
├── main_cli.py                   # CLI interface
├── requirements.txt              # Python dependencies
├── run_api.py                    # FastAPI startup script
├── run_web.py                    # Flask startup script
├── state.py                      # State definitions (TypedDict)
├── client_example.py             # API client example
├── webhook_example.py            # Webhook integration example
└── README.md                     # This file
Before running the application, ensure you have:
- Python 3.10 or higher
- pip (Python package manager)
- Git (for cloning the repository)
- Google Gemini API Key (for LLM)
- Mem0 API Key (for semantic memory)
- Redis Cloud instance (for working memory)
- Azure Document Intelligence (optional, for OCR)
- LangSmith API Key (optional, for tracing)
git clone https://github.com/ishaanparikh14/Multi-Agent-AI-KYC-System.git
cd Multi-Agent-AI-KYC-System
Windows:
python -m venv .venv
.venv\Scripts\activate
macOS/Linux:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Or install packages individually:
pip install openai langgraph langsmith mem0ai fastapi uvicorn flask flask-socketio python-multipart pydantic pydantic-settings httpx aiohttp requests aiofiles pandas Pillow redis python-dotenv typing-extensions Jinja2
Create a .env file in the project root:
# LLM Configuration (Google Gemini)
GEMINI_API_KEY=your_gemini_api_key_here
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/
# Memory - Mem0 (Long-term semantic memory)
MEM0_API_KEY=your_mem0_api_key_here
# Memory - Redis (Short-term working memory)
REDIS_HOST=your_redis_host
REDIS_DB_NAME=your_redis_db_name
REDIS_PASSWORD=your_redis_password
# OCR - Azure Document Intelligence (Optional)
DOCUMENT_INTELLIGENCE_API_KEY=your_azure_di_key
DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-resource.cognitiveservices.azure.com
# LangSmith - Observability (Optional)
LANGSMITH_TRACING=false
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_PROJECT=your_project_name
# ChromaDB (Optional - for vector storage)
CHROMA_TOKEN=your_chroma_token
CHROMA_TENANT=your_tenant
CHROMA_DATABASE=your_database
# Cohere (Optional - for embeddings)
COHERE_API_KEY=your_cohere_api_key

| Service | Sign Up URL | Purpose |
|---|---|---|
| Google Gemini | Google AI Studio | LLM for conversations |
| Mem0 | Mem0.ai | Long-term memory |
| Redis Cloud | Redis Cloud | Short-term memory |
| Azure DI | Azure Portal | Document OCR |
| LangSmith | LangSmith | Tracing & debugging |
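The variables in the .env file above could be read at startup along the lines below. The project lists Pydantic settings in `config/config.py`; this sketch uses only the standard library, and the `Settings` class, the `load_settings` function, and the choice of which variables are treated as required (taken from the Prerequisites list) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    gemini_base_url: str
    mem0_api_key: str
    redis_host: str
    redis_password: str
    langsmith_tracing: bool
    document_intelligence_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing ones."""
    required = ("GEMINI_API_KEY", "MEM0_API_KEY", "REDIS_HOST", "REDIS_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        gemini_base_url=env.get("GEMINI_BASE_URL", DEFAULT_GEMINI_BASE_URL),
        mem0_api_key=env["MEM0_API_KEY"],
        redis_host=env["REDIS_HOST"],
        redis_password=env["REDIS_PASSWORD"],
        # Optional services stay disabled unless explicitly configured
        langsmith_tracing=env.get("LANGSMITH_TRACING", "false").lower() == "true",
        document_intelligence_api_key=env.get("DOCUMENT_INTELLIGENCE_API_KEY"),
    )
```

Failing at startup with the full list of missing names tends to be easier to diagnose than a connection error raised later inside an agent.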
Web interface:
python run_web.py
Then open your browser to: http://localhost:5000
CLI:
python main_cli.py
API server:
python run_api.py
Or with uvicorn directly:
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
API documentation available at: http://localhost:8000/docs
POST /api/chat
Send a message to the KYC assistant.
Request Body:
{
  "message": "I want to start KYC verification",
  "session_id": "optional-session-id"
}
Response:
{
  "response": "Welcome! I'm Siddhi from TATA AIA...",
  "session_id": "abc123",
  "active_workflow": "aadhaar",
  "kyc_step": "awaiting_verification_method_selection",
  "completed_workflows": []
}
GET /api/session/{session_id}/status
Get current session status and progress.
Response:
{
  "active_workflow": "aadhaar",
  "kyc_step": "awaiting_otp",
  "completed_workflows": ["pan"],
  "aadhar_details": {},
  "pan_details": {}
}
User: Hi, I want to complete my KYC
Bot: Welcome! I'm Siddhi from TATA AIA Life Insurance.
Let's start with Aadhaar verification. Would you prefer:
1. eKYC (OTP-based verification)
2. DigiLocker
User: I'll go with eKYC
Bot: Great! Please enter your 12-digit Aadhaar number.
User: 123456789012
Bot: I've sent an OTP to your registered mobile. Please enter the 6-digit OTP.
User: 123456
Bot: β
Aadhaar verified successfully!
Name: John Doe
DOB: 01/01/1990
Is this information correct?
User: Now I want to verify my PAN
Bot: I see your name from Aadhaar is "John Doe" and DOB is "01/01/1990".
Please enter your PAN card number.
User: ABCDE1234F
Bot: β
PAN verified successfully against NSDL database!
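A conversation like the ones above could also be driven through the chat endpoint. The sketch below builds the POST /api/chat request with the standard library only; the helper names are hypothetical, and the base URL assumes the API server is running locally on port 8000 as described in this README.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumed local API server


def build_chat_request(
    message: str,
    session_id: Optional[str] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build the POST /api/chat request without sending it."""
    payload = {"message": message}
    if session_id is not None:
        payload["session_id"] = session_id
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str, session_id: Optional[str] = None) -> dict:
    """Send one message and return the decoded JSON response."""
    request = build_chat_request(message, session_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Passing the `session_id` from each response into the next call keeps the turns in one session, so a workflow that is waiting for an OTP can pick up where it stopped.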
The system implements a 3-tier memory architecture:
L1 Working Memory (Redis):
- Stores the last 6 conversation turns
- Fast access for immediate context
- Automatic sliding window
L2 Episodic Memory (Redis):
- LLM-generated summaries of conversations
- Triggered once a threshold number of turns is reached
- Preserves important facts and decisions
L3 Semantic Memory (Mem0):
- Long-term user preferences
- Cross-session memory
- Agent-specific memories (Aadhaar agent, PAN agent, etc.)
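The first two tiers can be pictured with an in-memory sketch. It is not the project's Redis-backed code: a `deque` plays the role of the sliding window of 6 turns, a caller-supplied `summarize` function stands in for the LLM summary, and the class name and the threshold of 6 turns are assumptions.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Turn = Tuple[str, str]  # (user message, bot reply)


class TieredMemory:
    def __init__(
        self,
        summarize: Callable[[List[Turn]], str],
        window: int = 6,
        summary_threshold: int = 6,  # assumed value
    ) -> None:
        # L1: the deque drops the oldest turn once the window is full
        self.working: Deque[Turn] = deque(maxlen=window)
        # L2: one summary per batch of turns
        self.episodes: List[str] = []
        self._pending: List[Turn] = []
        self._summarize = summarize
        self._threshold = summary_threshold

    def add_turn(self, user: str, bot: str) -> None:
        turn = (user, bot)
        self.working.append(turn)
        self._pending.append(turn)
        # Summarize and reset once enough unsummarized turns have accumulated
        if len(self._pending) >= self._threshold:
            self.episodes.append(self._summarize(self._pending))
            self._pending = []
```

Older turns leave the working window on their own, while their content survives in the summaries.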
1. Port Already in Use
# Windows
netstat -ano | findstr :5000
taskkill /PID <PID> /F
# Linux/macOS
lsof -i :5000
kill -9 <PID>
2. Redis Connection Failed
- Verify the Redis host, port (default: 10908), and password in .env
- Check whether the Redis Cloud firewall allows your IP
3. LLM API Rate Limits
- The system includes automatic retry with exponential backoff
- Consider upgrading your Gemini API quota
4. Import Errors
- Ensure you're running from the project root directory
- Activate the virtual environment before running
5. OCR Not Working
- Verify Azure Document Intelligence credentials
- Check endpoint URL format
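The automatic retry mentioned under LLM API Rate Limits follows the usual exponential backoff pattern. The sketch below shows that pattern in general terms; the function name, the attempt count, and the delays are assumptions, not the values used by the system.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # A little random jitter keeps many clients from retrying in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("unreachable")
```

In real use the `except` clause would usually be narrowed to the rate-limit error raised by the LLM client, so that programming errors are not retried.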
Enable debug logging:
# In config/config.py
debug: bool = True
Enable LangSmith tracing:
LANGSMITH_TRACING=true
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow PEP 8 guidelines
- Use type hints
- Add docstrings to functions and classes
This project is licensed under the MIT License - see the LICENSE file for details.
Ishaan Parikh
- GitHub: @ishaanparikh14
- LangChain for the LangGraph framework
- Google for Gemini API
- Mem0 for the memory platform
- TATA AIA for the use case inspiration
Made with ❤️ for the future of AI-powered KYC