Ojasp21/finTwin

FinSight AI — Cognitive Credit Intelligence Platform

An end-to-end, multi-tier AI system for real-time financial signal ingestion, behavioural modelling, predictive risk simulation, and autonomous credit decisioning — powered by a living Digital Twin of every borrower.


Architecture

[Image: System Architecture diagram]

Figure 1: Full 10-tier architecture showing the data flow from raw financial signals through event processing, behavioural modelling, LLM reasoning, anomaly detection, credit decisioning, and the audit dashboard. External dependencies (Database, LLM API) connect at the reasoning layer.


Overview

FinSight AI is a production-grade, multi-tier cognitive credit intelligence platform that ingests raw financial signals from multiple sources, transforms them into a rich behavioural profile (the Digital Twin), and uses AI-driven reasoning to make real-time lending decisions and proactive interventions.

The system is designed around the principle of continuous financial awareness — every UPI transaction, SMS alert, bank statement, and EMI record feeds a living model of the user's financial health, enabling decisions that go far beyond a static credit score.

Core capabilities:

  • Real-time multi-source financial signal ingestion and normalisation
  • Behavioural feature engineering (income stability, EMI burden, spending volatility, discretionary ratio)
  • Living Digital Twin per user with versioned financial DNA embeddings
  • LLM-powered risk narrative generation and contradiction detection
  • Monte Carlo simulation and stress testing for predictive risk
  • Fraud detection, anomaly detection, and synthetic identity detection
  • Autonomous credit decisioning with cognitive override logic
  • Proactive intervention via EMI risk alerts, financial advice, and micro-loan offers
  • Full audit trail with simulation replay and what-if analysis

System Tiers

Tier 1 — Signal Ingestion Engine

Responsible for collecting, deduplicating, and normalising raw financial data from all input sources into a unified canonical format.

Input sources:

  • UPI Logs
  • SMS Transaction Alerts
  • Bank Transactions API
  • Voice Transcript Input
  • Open Banking Feed
  • EMI Schedule Records

Components:

  • Data Normaliser — Standardises heterogeneous formats into a common schema
  • Canonical Schema Generator — Produces typed, validated financial event objects
  • Deduplication Module — Removes duplicate signals across overlapping sources
  • Event Queue (Kafka/Redis) — Durable, ordered queue for downstream processing
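The normalise and deduplicate steps above can be sketched as follows. This is a minimal illustration, not the project's schema: the `FinancialEvent` fields, the one-minute matching bucket, and the `deduplicate` helper are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FinancialEvent:
    """Hypothetical canonical event; field names are illustrative only."""
    user_id: str
    amount_paise: int      # integer minor units avoid float rounding
    direction: str         # "debit" or "credit"
    occurred_at: datetime  # assumed to be stored in UTC
    counterparty: str
    source: str            # e.g. "upi", "sms", "bank_api"

    def fingerprint(self) -> str:
        # The source is excluded on purpose, so the same payment seen
        # via UPI and via SMS produces the same key.
        minute = self.occurred_at.replace(second=0, microsecond=0)
        key = "|".join([
            self.user_id,
            str(self.amount_paise),
            self.direction,
            minute.isoformat(),
            self.counterparty.strip().lower(),
        ])
        return hashlib.sha256(key.encode()).hexdigest()


def deduplicate(events):
    """Keep the first event seen for each fingerprint, preserving order."""
    seen = set()
    unique = []
    for event in events:
        fp = event.fingerprint()
        if fp not in seen:
            seen.add(fp)
            unique.append(event)
    return unique


ts = datetime(2024, 1, 5, 10, 30, 12, tzinfo=timezone.utc)
upi = FinancialEvent("u1", 49900, "debit", ts, "Acme Store", "upi")
sms = FinancialEvent("u1", 49900, "debit", ts.replace(second=40), " acme store ", "sms")
print(len(deduplicate([upi, sms])))  # the SMS copy is dropped
```

Bucketing by minute is a simplification: two reports of one payment that straddle a minute boundary would not match, so a real deduplicator would likely compare timestamps within a tolerance instead.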

Tier 2 — Event Stream Processor

Consumes the unified event queue and enriches, classifies, and aggregates events into time-windowed summaries.

Components:

  • Financial Event Classifier — Tags events by category (income, expense, EMI, transfer, etc.)
  • Merchant NLP Model — Extracts merchant intent and category from raw descriptions
  • Event Enrichment Module — Attaches metadata, geolocation signals, and category tags
  • Sliding Window Aggregator — Produces rolling summaries at 7-day, 30-day, and 90-day windows for downstream feature engines
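As a rough sketch of the Sliding Window Aggregator, the function below sums event amounts over trailing 7-, 30-, and 90-day windows. The `(timestamp, amount)` tuple shape and the `window_totals` name are assumptions for illustration; a streaming implementation would update the windows incrementally rather than rescan all events.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = (7, 30, 90)


def window_totals(events, now, windows=WINDOW_DAYS):
    """Sum (timestamp, amount) pairs that fall inside each trailing window."""
    totals = {days: 0.0 for days in windows}
    for ts, amount in events:
        age = now - ts
        for days in windows:
            # An event counts if it is not in the future and is newer
            # than the start of the window.
            if timedelta(0) <= age < timedelta(days=days):
                totals[days] += amount
    return totals


now = datetime(2024, 3, 31)
events = [
    (now - timedelta(days=1), 100.0),
    (now - timedelta(days=10), 200.0),
    (now - timedelta(days=60), 400.0),
    (now - timedelta(days=120), 800.0),  # older than every window
]
totals = window_totals(events, now)
print(totals)
```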

Tier 3 & 4 — Behaviour Engine & Digital Twin

The core of the platform. Transforms enriched events into a rich behavioural feature set and maintains a persistent, versioned Digital Twin per user.

Behavioural Feature Engine:

  • Income Stability Score
  • EMI Burden Ratio
  • Spending Volatility Calculator
  • Savings Rate
  • Discretionary Ratio
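The README does not define these features precisely, so the formulas below are one plausible reading, offered as a sketch. In particular, treating income stability as one minus the coefficient of variation, and using the same coefficient for spending volatility, are assumptions.

```python
from statistics import mean, pstdev


def emi_burden_ratio(monthly_emi, monthly_income):
    """Share of income committed to EMIs (assumed definition)."""
    return monthly_emi / monthly_income if monthly_income > 0 else float("inf")


def savings_rate(income, expenses):
    """Fraction of income left after expenses (assumed definition)."""
    return (income - expenses) / income if income > 0 else 0.0


def discretionary_ratio(discretionary_spend, total_spend):
    """Discretionary spend as a fraction of all spend (assumed definition)."""
    return discretionary_spend / total_spend if total_spend > 0 else 0.0


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0


def income_stability_score(monthly_incomes):
    """1.0 for perfectly steady income, falling towards 0.0 as it varies."""
    return max(0.0, 1.0 - coefficient_of_variation(monthly_incomes))


print(emi_burden_ratio(15000, 50000))
print(savings_rate(50000, 40000))
print(income_stability_score([50000, 50000, 50000]))
print(coefficient_of_variation([10, 30]))  # usable as a spending volatility measure
```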

Peer Cohort Benchmark Engine:

  • Positions each user relative to behavioural peer cohorts for contextualised risk scoring
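One simple way to position a user against a cohort is a z-score of a single feature. The sketch below assumes that approach; the `peer_deviation` name and the choice of z-score are illustrative, and the actual benchmark engine may use a different method.

```python
from statistics import mean, pstdev


def peer_deviation(user_value, cohort_values):
    """Number of cohort standard deviations the user sits from the cohort mean."""
    mu = mean(cohort_values)
    sigma = pstdev(cohort_values)
    if sigma == 0:
        return 0.0  # a cohort with no spread gives no basis for deviation
    return (user_value - mu) / sigma


# EMI burden ratio of one user against a small hypothetical cohort
print(peer_deviation(0.5, [0.2, 0.4]))
```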

Digital Twin State Store:

  • Risk Trend Time Series
  • Liquidity Health Score
  • Financial Persona classification
  • Peer Deviation Score
  • Credit Dependency Score
  • DNA Embedding (32-dimensional financial fingerprint)
  • Twin Version History — full audit trail of state changes over time
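The versioned state store can be pictured as an append-only list of snapshots per user. The in-memory `DigitalTwin` class below is a hypothetical sketch of that idea only; the Tech Stack section names PostgreSQL with JSONB as the actual store, and the field names here are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DigitalTwin:
    user_id: str
    _versions: List[Dict[str, Any]] = field(default_factory=list)

    def update(self, **changes: Any) -> int:
        """Append a new snapshot built from the latest one; return its version."""
        state = dict(self._versions[-1]) if self._versions else {}
        state.update(changes)
        self._versions.append(state)
        return len(self._versions)

    def current(self) -> Dict[str, Any]:
        """Copy of the latest snapshot, or an empty dict before any update."""
        return dict(self._versions[-1]) if self._versions else {}

    def at(self, version: int) -> Dict[str, Any]:
        """Copy of the snapshot at a 1-based version number."""
        if not 1 <= version <= len(self._versions):
            raise IndexError(f"no version {version}")
        return dict(self._versions[version - 1])


twin = DigitalTwin("u1")
v1 = twin.update(liquidity_health=0.8, persona="saver")
v2 = twin.update(liquidity_health=0.6)
print(twin.at(v1), twin.current())
```

Earlier snapshots are never modified, which is what makes a step-through replay of the twin's history possible.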

Tier 5, 6 & 9 — Reasoning, Risk & Anomaly

Three parallel engines that operate on the Digital Twin data to produce intelligence signals for the Decision Engine.

LLM Reasoning Agent:

  • Risk Narrative — Human-readable explanation of the user's current risk posture
  • Behaviour Summary — Condensed digest of recent behavioural shifts
  • Intent Signals — Inferred financial intent (e.g., planning a large purchase, financial stress)
  • Contradiction Detection Module — Flags inconsistencies between stated income and observed cashflow
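Contradiction detection between stated income and observed cashflow could begin with a deterministic check before any LLM is involved. The sketch below assumes that design; the function name, the 25% tolerance, and the output fields are all hypothetical.

```python
def income_contradiction(stated_monthly_income, observed_monthly_credits, tolerance=0.25):
    """Return a finding if stated income differs from observed credits by more
    than `tolerance` (relative to stated income), else None."""
    if stated_monthly_income <= 0 or not observed_monthly_credits:
        return None  # nothing to compare against
    observed = sum(observed_monthly_credits) / len(observed_monthly_credits)
    gap = (stated_monthly_income - observed) / stated_monthly_income
    if abs(gap) <= tolerance:
        return None
    return {
        "stated": stated_monthly_income,
        "observed_mean": observed,
        "relative_gap": round(gap, 3),
    }


print(income_contradiction(100000, [60000, 60000]))  # flagged
print(income_contradiction(100000, [90000, 95000]))  # within tolerance
```

A finding like this could then be passed to the Reasoning Agent as structured context for the risk narrative.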

Predictive Risk Simulation Engine:

  • Monte Carlo Simulator — Runs probabilistic forward projections of repayment capacity
  • Stress Test Generator — Models performance under adverse income and expense scenarios
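To make the simulation idea concrete, here is a minimal Monte Carlo sketch. It assumes monthly income is normally distributed and counts a path as a shortfall if income in any month fails to cover fixed expenses plus EMI. The distribution, the parameter names, and the `income_shock` stress knob are assumptions, not the project's model.

```python
import random


def shortfall_probability(mean_income, income_sd, fixed_expenses, emi,
                          months=12, iterations=1000, income_shock=0.0, seed=0):
    """Estimate the chance of at least one month where income < expenses + EMI.

    `income_shock` is the fractional cut applied to mean income, so 0.2
    models a 20% income drop as a stress scenario.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    stressed_mean = mean_income * (1.0 - income_shock)
    obligations = fixed_expenses + emi
    shortfalls = 0
    for _ in range(iterations):
        for _ in range(months):
            income = max(0.0, rng.gauss(stressed_mean, income_sd))
            if income < obligations:
                shortfalls += 1
                break  # one bad month is enough to count this path
    return shortfalls / iterations


baseline = shortfall_probability(50000, 8000, 20000, 15000)
stressed = shortfall_probability(50000, 8000, 20000, 15000, income_shock=0.2)
print(baseline, stressed)
```

The `iterations` default mirrors the `MONTE_CARLO_ITERATIONS` setting listed under Configuration.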

Anomaly Detection Engine:

  • Fraud Detection Model — Real-time transaction-level fraud scoring
  • Behaviour Deviation Model — Flags sudden lifestyle or spending pattern shifts
  • Seam Signal Analyser — Detects stitched or fabricated financial histories
  • Synthetic Identity Detector — Identifies patterns consistent with synthetic identity fraud
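As an illustration of the Behaviour Deviation Model, the sketch below flags a spend value using the modified z-score based on the median absolute deviation (MAD). This is a standard robust outlier test swapped in for illustration; the README does not state which technique the model uses, and the `is_spend_anomaly` name is hypothetical.

```python
from statistics import median


def is_spend_anomaly(history, latest, threshold=3.5):
    """Flag `latest` if its modified z-score against `history` exceeds `threshold`."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # No spread in the history: any different value is a deviation.
        return latest != med
    modified_z = 0.6745 * (latest - med) / mad
    return abs(modified_z) > threshold


daily_spend = [100, 110, 90, 105, 95]
print(is_spend_anomaly(daily_spend, 400))  # far outside the usual range
print(is_spend_anomaly(daily_spend, 108))  # ordinary variation
```

Using the median rather than the mean keeps one earlier spike from masking the next one.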

Tier 7 & 8 — Decision & Action Engine

The autonomous decisioning layer that translates intelligence signals into credit decisions and proactive interventions.

Cognitive Credit Engine:

  • Loan Eligibility Calculator — Determines eligibility based on risk scores and Digital Twin state
  • Interest Rate Adjuster — Personalises rates based on behavioural risk profile
  • Behaviour Override Logic — Allows manual or rule-based overrides with full audit logging
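The eligibility and rate-adjustment steps might combine along the lines of the sketch below. Every threshold and the linear rate formula are placeholders chosen for the example, not values from the project; the README lists Node.js / TypeScript for the Decision Engine, and Python is used here only to stay consistent with the other sketches.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CreditDecision:
    approved: bool
    annual_rate_pct: Optional[float]
    reasons: Tuple[str, ...]  # recorded so the decision can be audited


def decide(risk_score, emi_burden, fraud_flag,
           base_rate_pct=12.0, max_risk=0.7, max_emi_burden=0.5):
    """Decline on any hard limit; otherwise price the loan by risk score."""
    reasons = []
    if fraud_flag:
        reasons.append("fraud_flag")
    if risk_score > max_risk:
        reasons.append("risk_score_above_limit")
    if emi_burden > max_emi_burden:
        reasons.append("emi_burden_above_limit")
    if reasons:
        return CreditDecision(False, None, tuple(reasons))
    # Hypothetical pricing: up to 8 percentage points added for risk.
    rate = base_rate_pct + 8.0 * risk_score
    return CreditDecision(True, round(rate, 2), ("within_limits",))


print(decide(0.25, 0.3, False))
print(decide(0.9, 0.6, False))
```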

Proactive Intervention Agent:

  • EMI Risk Alert — Early warning when repayment risk is detected
  • Overspending Alert Generator — Notifies users of discretionary spend anomalies
  • Financial Advice Generator — Personalised, context-aware nudges and recommendations
  • Micro Loan Offer Generator — Tailored short-term credit offers triggered by need signals

Tier 10 — Audit & Simulation Dashboard

Full observability and governance layer for compliance, model monitoring, and what-if scenario analysis.

Components:

  • Credit Decision Log — Immutable record of every decision with full feature attribution
  • Risk Projection Graph — Visual risk trajectory over time per user
  • Digital Twin Timeline Viewer — Step-through replay of how a user's twin evolved
  • Intervention History — Log of all proactive actions taken and outcomes
  • Anomaly Heatmap — Spatial and temporal view of detected anomalies across the user base
  • What-if Simulation Panel — Analyst tool to replay decisions with modified inputs
  • Audit Report Generator — Automated compliance report generation
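One common way to make a decision log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The `DecisionLog` class below is a hypothetical sketch of that technique; the README does not say how immutability is enforced in the actual system.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


class DecisionLog:
    def __init__(self):
        self._entries = []

    def append(self, record):
        """Store a JSON-serialisable record linked to the previous entry."""
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {"record": record, "prev": prev, "hash": _digest(prev, record)}
        self._entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; False means an entry was altered or reordered."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
                return False
            prev = entry["hash"]
        return True


log = DecisionLog()
log.append({"user_id": "u1", "approved": True, "rate": 14.0})
log.append({"user_id": "u2", "approved": False, "rate": None})
print(log.verify())
```

A hash chain detects changes but does not prevent them, so storage-level controls would still be needed for a true immutability guarantee.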

Key Components

| Component | Purpose |
| --- | --- |
| Event Queue (Kafka/Redis) | Durable, ordered stream backbone |
| Digital Twin State Store | Per-user versioned behavioural model |
| LLM Reasoning Agent | Narrative generation and contradiction detection |
| Monte Carlo Simulator | Probabilistic repayment capacity forecasting |
| Cognitive Credit Engine | Final credit decisioning with override logic |
| Audit Report Generator | Regulatory compliance and explainability |

Data Flow

Multi-Source Financial Signals
        │
        ▼
Tier 1: Signal Ingestion Engine
  (normalise → deduplicate → enqueue)
        │
        ▼
Tier 2: Event Stream Processor
  (classify → enrich → aggregate windows)
        │
        ▼
Tier 3 & 4: Behaviour Engine & Digital Twin
  (feature engineering → Digital Twin update)
        │
        ├──────────────────────────────┐
        ▼                              ▼
Tier 5/6: LLM Reasoning         Tier 9: Anomaly Detection
  + Risk Simulation                (fraud, deviation, synthetic ID)
        │                              │
        └──────────────┬───────────────┘
                       ▼
            Tier 7 & 8: Decision & Action Engine
              (credit decision → intervention)
                       │
                       ▼
            Tier 10: Audit & Simulation Dashboard

External Services

| Service | Role |
| --- | --- |
| Database | Persistent storage for Digital Twin state, decision logs, and audit trails |
| LLM API | Powers the Reasoning Agent for narrative generation, summarisation, and contradiction detection |

The LLM API integration point sits between the Digital Twin layer and the Reasoning Agent. All LLM calls are logged to the audit layer for full explainability.


Getting Started

Prerequisites

  • Docker & Docker Compose
  • Node.js >= 20
  • Python >= 3.11
  • Kafka (or managed equivalent)
  • Redis
  • PostgreSQL (or compatible)

Installation

git clone https://github.com/your-org/finsight-ai.git
cd finsight-ai
cp .env.example .env
docker-compose up -d

Run services individually

# Run each service in its own terminal, starting from the repository root.
# The parentheses run each command in a subshell, so the working directory
# is unchanged afterwards.

# Signal Ingestion Engine
(cd services/ingestion && npm install && npm start)

# Event Stream Processor
(cd services/stream-processor && npm install && npm start)

# Behaviour Engine
(cd services/behaviour-engine && pip install -r requirements.txt && python main.py)

# LLM Reasoning Agent
(cd services/reasoning-agent && pip install -r requirements.txt && python main.py)

# Decision Engine
(cd services/decision-engine && npm install && npm start)

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| KAFKA_BROKER_URL | localhost:9092 | Kafka broker connection string |
| REDIS_URL | redis://localhost:6379 | Redis connection for event queue |
| DATABASE_URL | postgres://localhost:5432/finsight | Primary database |
| LLM_API_KEY | | API key for LLM provider |
| LLM_MODEL | gpt-4o | Model to use for reasoning agent |
| TWIN_EMBEDDING_DIM | 32 | Dimensionality of DNA embedding |
| WINDOW_DAYS | 7,30,90 | Sliding window sizes for aggregator |
| MONTE_CARLO_ITERATIONS | 1000 | Simulation iterations per risk assessment |
| AUDIT_LOG_ENABLED | true | Enable immutable audit logging |
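A Python service could read these variables as in the sketch below. The `Settings` class and `load_settings` helper are hypothetical names; only the variable names and defaults come from the configuration listed above, and only a subset is shown.

```python
import os
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Settings:
    kafka_broker_url: str
    llm_model: str
    twin_embedding_dim: int
    window_days: Tuple[int, ...]
    monte_carlo_iterations: int
    audit_log_enabled: bool


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        kafka_broker_url=env.get("KAFKA_BROKER_URL", "localhost:9092"),
        llm_model=env.get("LLM_MODEL", "gpt-4o"),
        twin_embedding_dim=int(env.get("TWIN_EMBEDDING_DIM", "32")),
        # "7,30,90" becomes (7, 30, 90); int() tolerates surrounding spaces
        window_days=tuple(int(d) for d in env.get("WINDOW_DAYS", "7,30,90").split(",")),
        monte_carlo_iterations=int(env.get("MONTE_CARLO_ITERATIONS", "1000")),
        audit_log_enabled=env.get("AUDIT_LOG_ENABLED", "true").strip().lower() == "true",
    )


defaults = load_settings({})
print(defaults)
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test without touching the process environment.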

Tech Stack

  • Event Streaming: Apache Kafka, Redis Streams
  • Feature Engineering: Python, Pandas, NumPy
  • LLM Integration: OpenAI / Anthropic API (pluggable)
  • Fraud & Anomaly Models: Scikit-learn, XGBoost, custom neural models
  • Decision Engine: Node.js / TypeScript
  • State Store: PostgreSQL with JSONB for Digital Twin versioning
  • Dashboard: React, Recharts
  • Infrastructure: Docker, Kubernetes-ready

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit with conventional commits: git commit -m 'feat: add stress test generator'
  4. Push: git push origin feature/your-feature
  5. Open a Pull Request against main

Please read CONTRIBUTING.md for code style, testing requirements, and the PR review process.


License

MIT © FinSight AI Team
