An end-to-end, multi-tier AI system for real-time financial signal ingestion, behavioural modelling, predictive risk simulation, and autonomous credit decisioning — powered by a living Digital Twin of every borrower.
Figure 1: Full 10-tier architecture showing the data flow from raw financial signals through event processing, behavioural modelling, LLM reasoning, anomaly detection, credit decisioning, and the audit dashboard. External dependencies (Database, LLM API) connect at the reasoning layer.
- Overview
- Architecture
- System Tiers
- Key Components
- Data Flow
- External Services
- Getting Started
- Configuration
- Tech Stack
- Contributing
- License
FinSight AI is a production-grade, multi-tier cognitive credit intelligence platform that ingests raw financial signals from multiple sources, transforms them into a rich behavioural profile (the Digital Twin), and uses AI-driven reasoning to make real-time lending decisions and proactive interventions.
The system is designed around the principle of continuous financial awareness — every UPI transaction, SMS alert, bank statement, and EMI record feeds a living model of the user's financial health, enabling decisions that go far beyond a static credit score.
Core capabilities:
- Real-time multi-source financial signal ingestion and normalisation
- Behavioural feature engineering (income stability, EMI burden, spending volatility, discretionary ratio)
- Living Digital Twin per user with versioned financial DNA embeddings
- LLM-powered risk narrative generation and contradiction detection
- Monte Carlo simulation and stress testing for predictive risk
- Fraud detection, anomaly detection, and synthetic identity detection
- Autonomous credit decisioning with cognitive override logic
- Proactive intervention via EMI risk alerts, financial advice, and micro-loan offers
- Full audit trail with simulation replay and what-if analysis
Responsible for collecting, deduplicating, and normalising raw financial data from all input sources into a unified canonical format.
Input sources:
- UPI Logs
- SMS Transaction Alerts
- Bank Transactions API
- Voice Transcript Input
- Open Banking Feed
- EMI Schedule Records
Components:
- Data Normaliser — Standardises heterogeneous formats into a common schema
- Canonical Schema Generator — Produces typed, validated financial event objects
- Deduplication Module — Removes duplicate signals across overlapping sources
- Event Queue (Kafka/Redis) — Durable, ordered queue for downstream processing
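The normalise and deduplicate steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `FinancialEvent` fields, the raw record keys (`user_id`, `amount`, `timestamp`), and the dedup key (user, amount, direction, timestamp truncated to the minute) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class FinancialEvent:
    """Hypothetical canonical event; the real schema is not shown in this README."""
    user_id: str
    source: str          # e.g. "upi", "sms", "bank_api"
    amount: Decimal      # always stored as a positive value
    direction: str       # "debit" or "credit"
    occurred_at: datetime


def normalise(raw: dict, source: str) -> FinancialEvent:
    """Map one raw record (assumed keys) onto the canonical event."""
    amount = Decimal(str(raw["amount"]))
    return FinancialEvent(
        user_id=str(raw["user_id"]),
        source=source,
        amount=abs(amount),
        direction="debit" if amount < 0 else "credit",
        occurred_at=datetime.fromisoformat(raw["timestamp"]),
    )


def deduplicate(events: list[FinancialEvent]) -> list[FinancialEvent]:
    """Keep the first event seen per (user, amount, direction, minute) key."""
    seen = set()
    unique = []
    for event in events:
        key = (
            event.user_id,
            event.amount,
            event.direction,
            event.occurred_at.replace(second=0, microsecond=0),
        )
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique
```

A minute-level key is a deliberately crude choice: it merges a UPI log entry and its SMS alert when both land in the same minute, but it would also merge two genuine identical payments made within that minute.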
Consumes the unified event queue and enriches, classifies, and aggregates events into time-windowed summaries.
Components:
- Financial Event Classifier — Tags events by category (income, expense, EMI, transfer, etc.)
- Merchant NLP Model — Extracts merchant intent and category from raw descriptions
- Event Enrichment Module — Attaches metadata, geolocation signals, and category tags
- Sliding Window Aggregator — Produces rolling summaries at 7-day, 30-day, and 90-day windows for downstream feature engines
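As a rough sketch of the 7/30/90-day aggregation, the function below recomputes window totals from a list of dated amounts. The function name, the `(date, amount)` input shape, and the convention that an N-day window covers the reference day plus the N-1 days before it are assumptions made for this example.

```python
from datetime import date, timedelta


def window_totals(events, as_of, window_days=(7, 30, 90)):
    """Sum amounts per window; `events` is an iterable of (date, amount) pairs.

    A window of N days covers `as_of` and the N-1 days before it.
    """
    totals = {days: 0.0 for days in window_days}
    for day, amount in events:
        age = (as_of - day).days
        if age < 0:
            continue  # ignore events dated after the reference day
        for days in window_days:
            if age < days:
                totals[days] += amount
    return totals


if __name__ == "__main__":
    today = date(2024, 6, 30)
    sample = [(today, 100.0), (today - timedelta(days=10), 40.0)]
    print(window_totals(sample, today))
```

A streaming processor would more likely maintain these totals incrementally as events arrive and expire; the batch form is only meant to make the window semantics concrete.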
The core of the platform. Transforms enriched events into a rich behavioural feature set and maintains a persistent, versioned Digital Twin per user.
Behavioural Feature Engine:
- Income Stability Score
- EMI Burden Ratio
- Spending Volatility Calculator
- Savings Rate
- Discretionary Ratio
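The README does not define these features formally, so the formulas below are one plausible reading rather than the engine's actual definitions. The sketch assumes monthly totals as inputs, that monthly spend already includes EMI payments, and that stability and volatility are both based on the coefficient of variation.

```python
from statistics import mean, pstdev


def behavioural_features(monthly_income, monthly_spend, monthly_emi, monthly_discretionary):
    """Compute illustrative ratios from equal-length lists of monthly totals."""
    total_income = sum(monthly_income)
    total_spend = sum(monthly_spend)
    if total_income <= 0 or total_spend <= 0:
        raise ValueError("income and spend totals must be positive")
    return {
        # 1.0 means perfectly steady income; lower means more variable (assumed formula).
        "income_stability": max(0.0, 1.0 - pstdev(monthly_income) / mean(monthly_income)),
        # Share of income committed to EMI payments.
        "emi_burden_ratio": sum(monthly_emi) / total_income,
        # Coefficient of variation of monthly spend.
        "spending_volatility": pstdev(monthly_spend) / mean(monthly_spend),
        # Share of income left after all spending.
        "savings_rate": (total_income - total_spend) / total_income,
        # Share of spending that is discretionary.
        "discretionary_ratio": sum(monthly_discretionary) / total_spend,
    }
```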
Peer Cohort Benchmark Engine:
- Positions each user relative to behavioural peer cohorts for contextualised risk scoring
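One simple way to position a user against a cohort is a z-score plus a percentile rank on a single feature. The sketch below assumes that approach; the function name and the choice of population standard deviation are illustrative, and how cohorts are formed is outside its scope.

```python
from statistics import mean, pstdev


def peer_deviation(user_value, cohort_values):
    """Return (z_score, percentile) of a user's feature value against a cohort."""
    if not cohort_values:
        raise ValueError("cohort must not be empty")
    spread = pstdev(cohort_values)
    # A cohort with no spread gives no meaningful z-score; report 0.0.
    z_score = 0.0 if spread == 0 else (user_value - mean(cohort_values)) / spread
    # Fraction of cohort members strictly below the user.
    percentile = sum(1 for value in cohort_values if value < user_value) / len(cohort_values)
    return z_score, percentile
```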
Digital Twin State Store:
- Risk Trend Time Series
- Liquidity Health Score
- Financial Persona classification
- Peer Deviation Score
- Credit Dependency Score
- DNA Embedding (32-dimensional financial fingerprint)
- Twin Version History — full audit trail of state changes over time
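A versioned state store can be approximated in memory with immutable snapshots appended to a history list. The class and field names below are hypothetical and cover only a subset of the scores listed above; the 32-dimension check mirrors the stated embedding size. A real deployment would persist each version to the database rather than hold it in a Python list.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone

EMBEDDING_DIM = 32  # matches the 32-dimensional DNA embedding described above


@dataclass(frozen=True)
class TwinState:
    """Hypothetical snapshot of one twin version."""
    version: int
    liquidity_health: float
    persona: str
    dna_embedding: tuple[float, ...]
    updated_at: datetime

    def __post_init__(self):
        if len(self.dna_embedding) != EMBEDDING_DIM:
            raise ValueError(f"embedding must have {EMBEDDING_DIM} dimensions")


@dataclass
class DigitalTwin:
    user_id: str
    history: list[TwinState] = field(default_factory=list)

    @property
    def current(self) -> TwinState:
        return self.history[-1]

    def update(self, **changes) -> TwinState:
        """Append a new immutable version instead of mutating the old one."""
        new_state = replace(
            self.current,
            version=self.current.version + 1,
            updated_at=datetime.now(timezone.utc),
            **changes,
        )
        self.history.append(new_state)
        return new_state
```

Because earlier snapshots are never modified, stepping through `history` gives the kind of replay the timeline viewer needs.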
Three parallel engines that operate on the Digital Twin data to produce intelligence signals for the Decision Engine.
LLM Reasoning Agent:
- Risk Narrative — Human-readable explanation of the user's current risk posture
- Behaviour Summary — Condensed digest of recent behavioural shifts
- Intent Signals — Inferred financial intent (e.g., planning a large purchase, financial stress)
- Contradiction Detection Module — Flags inconsistencies between stated income and observed cashflow
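The contradiction check is described as part of the LLM agent, but its core comparison (stated income versus observed cashflow) can be illustrated with a plain rule. The sketch below is a deterministic stand-in, not the LLM-based module; the function name and the 25% tolerance are assumptions.

```python
def income_contradiction(stated_monthly_income, observed_monthly_credits, tolerance=0.25):
    """Flag a mismatch between stated income and average observed monthly inflow."""
    if stated_monthly_income <= 0:
        raise ValueError("stated income must be positive")
    if not observed_monthly_credits:
        raise ValueError("at least one month of observed credits is required")
    observed_average = sum(observed_monthly_credits) / len(observed_monthly_credits)
    # Positive gap: the user states more income than the cashflow shows.
    relative_gap = (stated_monthly_income - observed_average) / stated_monthly_income
    return {
        "observed_average": observed_average,
        "relative_gap": relative_gap,
        "contradiction": abs(relative_gap) > tolerance,
    }
```

A result like this could be passed to the reasoning agent as structured context, so the narrative explains a flagged gap instead of having to discover it.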
Predictive Risk Simulation Engine:
- Monte Carlo Simulator — Runs probabilistic forward projections of repayment capacity
- Stress Test Generator — Models performance under adverse income and expense scenarios
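A minimal Monte Carlo sketch of repayment capacity is shown below. It assumes monthly income is normally distributed and independent across months, and it treats a stress test as a proportional cut to mean income. The function name, the parameters, and the "any shortfall month within the horizon" risk measure are illustrative choices, not the engine's actual model.

```python
import random


def simulate_shortfall_probability(mean_income, income_std, fixed_expenses, emi,
                                   months=12, iterations=1000, income_shock=0.0, seed=None):
    """Estimate the probability of at least one month where income < expenses + EMI.

    `income_shock` scales mean income down (0.2 means a 20% drop) for stress tests.
    """
    rng = random.Random(seed)
    stressed_mean = mean_income * (1.0 - income_shock)
    required = fixed_expenses + emi
    paths_with_shortfall = 0
    for _ in range(iterations):
        for _ in range(months):
            income = max(0.0, rng.gauss(stressed_mean, income_std))
            if income < required:
                paths_with_shortfall += 1
                break  # one shortfall month is enough to count this path
    return paths_with_shortfall / iterations


if __name__ == "__main__":
    base = simulate_shortfall_probability(60000, 8000, 30000, 15000, seed=42)
    stressed = simulate_shortfall_probability(60000, 8000, 30000, 15000, income_shock=0.2, seed=42)
    print(f"baseline: {base:.3f}  stressed: {stressed:.3f}")
```

The default of 1000 iterations follows the `MONTE_CARLO_ITERATIONS` default; more iterations reduce sampling noise at the cost of latency.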
Anomaly Detection Engine:
- Fraud Detection Model — Real-time transaction-level fraud scoring
- Behaviour Deviation Model — Flags sudden lifestyle or spending pattern shifts
- Seam Signal Analyser — Detects stitched or fabricated financial histories
- Synthetic Identity Detector — Identifies patterns consistent with synthetic identity fraud
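The models behind these detectors are not specified here. As a simple baseline for the behaviour-deviation idea, the sketch below flags a new value using the modified z-score (median and median absolute deviation), which is less sensitive to past outliers than a mean-based score. The function name and the 3.5 cutoff are assumptions for this example.

```python
from statistics import median


def deviation_flag(history, latest, threshold=3.5):
    """Return True if `latest` deviates strongly from `history` by modified z-score."""
    if not history:
        raise ValueError("history must not be empty")
    centre = median(history)
    mad = median([abs(value - centre) for value in history])
    if mad == 0:
        # No variation in history: any different value counts as a deviation.
        return latest != centre
    modified_z = 0.6745 * (latest - centre) / mad
    return abs(modified_z) > threshold
```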
The autonomous decisioning layer that translates intelligence signals into credit decisions and proactive interventions.
Cognitive Credit Engine:
- Loan Eligibility Calculator — Determines eligibility based on risk scores and Digital Twin state
- Interest Rate Adjuster — Personalises rates based on behavioural risk profile
- Behaviour Override Logic — Allows manual or rule-based overrides with full audit logging
Proactive Intervention Agent:
- EMI Risk Alert — Early warning when repayment risk is detected
- Overspending Alert Generator — Notifies users of discretionary spend anomalies
- Financial Advice Generator — Personalised, context-aware nudges and recommendations
- Micro Loan Offer Generator — Tailored short-term credit offers triggered by need signals
Full observability and governance layer for compliance, model monitoring, and what-if scenario analysis.
Components:
- Credit Decision Log — Immutable record of every decision with full feature attribution
- Risk Projection Graph — Visual risk trajectory over time per user
- Digital Twin Timeline Viewer — Step-through replay of how a user's twin evolved
- Intervention History — Log of all proactive actions taken and outcomes
- Anomaly Heatmap — Spatial and temporal view of detected anomalies across the user base
- What-if Simulation Panel — Analyst tool to replay decisions with modified inputs
- Audit Report Generator — Automated compliance report generation
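One common way to make a decision log tamper-evident is to chain entries by hash, so that editing any earlier entry invalidates everything after it. The sketch below assumes that approach; the `AuditLog` class is hypothetical, and the README does not say how immutability is actually enforced.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class AuditLog:
    """In-memory, append-only log where each entry commits to the previous one."""

    def __init__(self):
        self._entries = []

    def append(self, payload: dict) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        body = json.dumps(payload, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        self._entries.append({"prev": prev_hash, "body": body, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain and report whether every entry is intact."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            expected = hashlib.sha256((prev_hash + entry["body"]).encode("utf-8")).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```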
| Component | Purpose |
|---|---|
| Event Queue (Kafka/Redis) | Durable, ordered stream backbone |
| Digital Twin State Store | Per-user versioned behavioural model |
| LLM Reasoning Agent | Narrative generation and contradiction detection |
| Monte Carlo Simulator | Probabilistic repayment capacity forecasting |
| Cognitive Credit Engine | Final credit decisioning with override logic |
| Audit Report Generator | Regulatory compliance and explainability |
Multi-Source Financial Signals
        │
        ▼
Tier 1: Signal Ingestion Engine
(normalise → deduplicate → enqueue)
        │
        ▼
Tier 2: Event Stream Processor
(classify → enrich → aggregate windows)
        │
        ▼
Tier 3 & 4: Behaviour Engine & Digital Twin
(feature engineering → Digital Twin update)
        │
        ├──────────────────────────────┐
        ▼                              ▼
Tier 5/6: LLM Reasoning        Tier 9: Anomaly Detection
+ Risk Simulation              (fraud, deviation, synthetic ID)
        │                              │
        └──────────────┬───────────────┘
                       ▼
     Tier 7 & 8: Decision & Action Engine
       (credit decision → intervention)
                       │
                       ▼
     Tier 10: Audit & Simulation Dashboard
| Service | Role |
|---|---|
| Database | Persistent storage for Digital Twin state, decision logs, and audit trails |
| LLM API | Powers the Reasoning Agent for narrative generation, summarisation, and contradiction detection |
The LLM API integration point sits between the Digital Twin layer and the Reasoning Agent. All LLM calls are logged to the audit layer for full explainability.
- Docker & Docker Compose
- Node.js >= 20
- Python >= 3.11
- Kafka (or managed equivalent)
- Redis
- PostgreSQL (or compatible)
git clone https://github.com/your-org/finsight-ai.git
cd finsight-ai
cp .env.example .env
docker-compose up -d
# Signal Ingestion Engine
cd services/ingestion && npm install && npm start
# Event Stream Processor
cd services/stream-processor && npm install && npm start
# Behaviour Engine
cd services/behaviour-engine && pip install -r requirements.txt && python main.py
# LLM Reasoning Agent
cd services/reasoning-agent && pip install -r requirements.txt && python main.py
# Decision Engine
cd services/decision-engine && npm install && npm start
| Variable | Default | Description |
|---|---|---|
| KAFKA_BROKER_URL | localhost:9092 | Kafka broker connection string |
| REDIS_URL | redis://localhost:6379 | Redis connection for event queue |
| DATABASE_URL | postgres://localhost:5432/finsight | Primary database |
| LLM_API_KEY | (none) | API key for LLM provider |
| LLM_MODEL | gpt-4o | Model to use for reasoning agent |
| TWIN_EMBEDDING_DIM | 32 | Dimensionality of DNA embedding |
| WINDOW_DAYS | 7,30,90 | Sliding window sizes for aggregator |
| MONTE_CARLO_ITERATIONS | 1000 | Simulation iterations per risk assessment |
| AUDIT_LOG_ENABLED | true | Enable immutable audit logging |
- Event Streaming: Apache Kafka, Redis Streams
- Feature Engineering: Python, Pandas, NumPy
- LLM Integration: OpenAI / Anthropic API (pluggable)
- Fraud & Anomaly Models: Scikit-learn, XGBoost, custom neural models
- Decision Engine: Node.js / TypeScript
- State Store: PostgreSQL with JSONB for Digital Twin versioning
- Dashboard: React, Recharts
- Infrastructure: Docker, Kubernetes-ready
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Commit with conventional commits: `git commit -m 'feat: add stress test generator'`
- Push: `git push origin feature/your-feature`
- Open a Pull Request against `main`
Please read CONTRIBUTING.md for code style, testing requirements, and the PR review process.
MIT © FinSight AI Team
