PhishNChips - Distributed Phishing Intelligence Platform

License: MIT | Python 3.11 | Elasticsearch 8.11

Real-time global phishing detection powered by distributed databases, ML risk scoring, and cross-region replication.

Inspiration

Phishing attacks evolve faster than traditional security systems can respond, and centralized databases create single points of failure and geographic blind spots. We built PhishNChips to demonstrate how distributed databases can provide global, real-time threat intelligence.

What Makes This Different?

Unlike traditional phishing detectors that rely on centralized storage, PhishNChips provides:

  • No single point of failure
  • Automatic cross-region replication
  • Distributed query execution
  • Horizontally scalable architecture
  • Resilient to node failure

PhishNChips demonstrates real distributed systems resilience, not just ML-based classification.

What it does

PhishNChips creates a geo-distributed network where:

  • Browser extensions detect suspicious URLs using ML and heuristics
  • A 3-node Elasticsearch cluster (simulating US/EU/Asia regions locally via Docker) stores and replicates threat data
  • Real-time dashboards visualize global phishing trends
  • Automated testing proves fault tolerance and scalability

Demo Video

PhishNChips Demo

20-second demo showing the distributed phishing detection system in action

How it is built

Tech Stack:

  • Backend: FastAPI (Python) with ML-based risk scoring
  • Database: Elasticsearch 8.11 (3-node distributed cluster)
  • Frontend: Browser extension + Kibana dashboards
  • Infrastructure: Docker Compose for local simulation

Key Features:

  • Distributed 3-Node Cluster - Simulates global deployment locally
  • ML Risk Scoring - Neural network + heuristic analysis
  • Real-time Visualization - Kibana dashboards for threat monitoring
  • Automated Testing - Fault tolerance and scalability validation
  • Cross-region Replication - Global threat dissemination
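
To make the "heuristic analysis" half of the risk scoring concrete, here is a minimal sketch of a URL heuristic scorer. It is not the project's actual scorer: the feature names, keyword list, weights, and the 75-character threshold are all assumptions chosen for illustration, and the neural network half is not shown.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list and weights, chosen only for illustration.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")
WEIGHTS = {
    "host_is_ip": 0.30,
    "has_at_sign": 0.25,
    "many_subdomains": 0.15,
    "suspicious_word": 0.20,
    "not_https": 0.15,
    "very_long": 0.10,
}


def heuristic_features(url: str) -> dict[str, bool]:
    """Extract simple boolean signals from a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "host_is_ip": host_is_ip,
        "has_at_sign": "@" in parsed.netloc,
        "many_subdomains": not host_is_ip and host.count(".") >= 3,
        "suspicious_word": any(w in url.lower() for w in SUSPICIOUS_WORDS),
        "not_https": parsed.scheme != "https",
        "very_long": len(url) > 75,
    }


def heuristic_score(url: str) -> float:
    """Sum the weights of the signals that fired, capped at 1.0."""
    features = heuristic_features(url)
    total = sum((WEIGHTS[name] for name, hit in features.items() if hit), 0.0)
    return min(1.0, total)
```

In a combined setup, a score like this could be blended with a model's output before the report is sent to the backend; how PhishNChips actually combines the two is not specified here.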

Quick Start

# Clone and setup
git clone https://github.com/deepti-96/PhishNChips-Distributed-Phishing-Intelligence-Network.git
cd PhishNChips-Distributed-Phishing-Intelligence-Network

# Start the distributed cluster
make quick-start

# Access points
# Kibana Dashboard: http://localhost:5601
# API Docs: http://localhost:8000/docs
# Elasticsearch: http://localhost:9200
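
Once the cluster is up, a quick reachability check of the three access points can save some guesswork. The following is a small sketch using only the Python standard library; the helper names `probe` and `check_endpoints` are hypothetical and not part of the repository.

```python
from typing import Callable
from urllib.error import HTTPError
from urllib.request import urlopen

# Access points as listed in the Quick Start.
ENDPOINTS = {
    "Kibana Dashboard": "http://localhost:5601",
    "API Docs": "http://localhost:8000/docs",
    "Elasticsearch": "http://localhost:9200",
}


def probe(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if something answers HTTP at the given URL."""
    try:
        with urlopen(url, timeout=timeout_s):
            return True
    except HTTPError:
        # The server answered, even if with an error status code.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False


def check_endpoints(
    endpoints: dict[str, str] = ENDPOINTS,
    probe_fn: Callable[[str], bool] = probe,
) -> dict[str, bool]:
    """Probe every endpoint and report which ones are reachable."""
    return {name: probe_fn(url) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_endpoints().items():
        print(f"{name}: {'up' if ok else 'down'}")
```

Kibana and Elasticsearch can take a while to start, so a "down" result shortly after `make quick-start` does not necessarily mean the setup failed.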

Architecture

Browser Extensions (US, EU, ASIA)
        ↓ HTTP POST
    FastAPI Service
        ↓ Index
Elasticsearch Cluster (3-node distributed cluster)
   ↓ Replication
Kibana Dashboard
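
The "HTTP POST" and "Index" steps in the diagram can be sketched as follows. This is an illustration under assumptions, not the project's API: the `ThreatReport` fields and the index name `phishing-reports` are hypothetical, and the sketch assumes Elasticsearch is reachable over plain HTTP without authentication, as the Quick Start URL suggests. The `POST /<index>/_doc` endpoint is the standard Elasticsearch document index API.

```python
import json
from dataclasses import asdict, dataclass
from urllib.request import Request


@dataclass
class ThreatReport:
    """Hypothetical shape of a report sent by a browser extension."""
    url: str
    region: str        # e.g. "US", "EU", "ASIA"
    risk_score: float  # 0.0 (benign) to 1.0 (very likely phishing)
    reported_at: str   # ISO 8601 timestamp


def build_index_request(
    report: ThreatReport,
    es_url: str = "http://localhost:9200",
    index: str = "phishing-reports",  # hypothetical index name
) -> Request:
    """Build (but do not send) the request that indexes one report."""
    body = json.dumps(asdict(report)).encode("utf-8")
    return Request(
        f"{es_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; in the real system this step sits behind the FastAPI service rather than being called by the extension directly.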

Distributed Concepts Demonstrated:

  • Sharding – Data partitioned across nodes
  • Replication (RF=2) – Survives single-node failure
  • Distributed Queries – Cross-node aggregations
  • Automatic Rebalancing – Replica shard reassignment after node failure
  • Horizontal Scalability – Dynamic node addition
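
As a rough illustration of how sharding, replication, and distributed queries map onto Elasticsearch request bodies, consider the sketch below. It assumes that "RF=2" means two copies of each shard in total (one primary plus one replica, i.e. `number_of_replicas: 1`); the shard count of 3 and the `region` field are also assumptions, not values taken from the repository.

```python
def index_settings(primary_shards: int = 3, copies_per_shard: int = 2) -> dict:
    """Body for creating an index (PUT /<index>).

    Elasticsearch counts replicas in addition to the primary, so a
    replication factor of 2 (assumed meaning: two copies in total)
    becomes number_of_replicas = 1.
    """
    if primary_shards < 1 or copies_per_shard < 1:
        raise ValueError("need at least one shard and one copy")
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": copies_per_shard - 1,
        },
        # Hypothetical mapping: a keyword field is needed for terms aggregations.
        "mappings": {"properties": {"region": {"type": "keyword"}}},
    }


def reports_by_region_query(max_regions: int = 10) -> dict:
    """Body for a search (POST /<index>/_search) that counts reports per region.

    Each shard computes partial counts and the coordinating node merges
    them, which is what makes this a cross-node aggregation.
    """
    return {
        "size": 0,  # return only aggregation results, no documents
        "aggs": {"by_region": {"terms": {"field": "region", "size": max_regions}}},
    }
```

Because Elasticsearch never places a primary and its replica on the same node, one replica per shard is enough for every shard to remain available when a single node is lost.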

Testing & Validation

  • Fault Tolerance: Node termination triggers shard reallocation; cluster recovers in <60s without data loss
  • Scalability: Validated with 100K–1M indexed records
  • Performance: <200ms ingestion latency under test load
  • ML Accuracy: 90%+ on a labeled phishing dataset
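
A recovery-time check like the fault tolerance claim above could be measured along these lines. This is a sketch, not the project's test suite: the function names are hypothetical, and it assumes the cluster is reachable without authentication at the Quick Start URL. `GET /_cluster/health` is the standard Elasticsearch health API, whose `status` field is `green`, `yellow`, or `red`.

```python
import json
import time
from typing import Callable, Optional
from urllib.request import urlopen


def fetch_status(es_url: str = "http://localhost:9200") -> str:
    """Read the cluster status from the Elasticsearch health API."""
    with urlopen(f"{es_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)["status"]


def seconds_until_green(
    get_status: Callable[[], str] = fetch_status,
    timeout_s: float = 60.0,
    poll_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until the cluster reports green.

    Returns the elapsed seconds, or None if the timeout passes first.
    """
    start = clock()
    while True:
        if clock() - start > timeout_s:
            return None
        try:
            status = get_status()
        except OSError:
            # The node being queried may itself be the one that was stopped.
            status = "unreachable"
        if status == "green":
            return clock() - start
        sleep(poll_s)
```

A typical run would stop one node (for example with `docker stop`), call `seconds_until_green()` against a surviving node, and compare the result with the 60-second target. The status, clock, and sleep functions are injectable so the loop can be exercised without a live cluster.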

Challenges we ran into

  • Simulating geo-distribution on a single machine
  • Balancing ML model accuracy vs. real-time performance
  • Implementing proper cross-region data consistency
  • Debugging distributed system failures

Accomplishments

  • Working distributed phishing detection network
  • Real-time threat visualization across regions
  • Automated fault tolerance testing
  • ML-powered risk scoring with 90%+ accuracy
  • Production-ready API with comprehensive docs

What we learned

  • Distributed systems design patterns
  • Elasticsearch cluster management
  • ML model deployment in production
  • Importance of automated testing for complex systems
  • Balancing consistency vs. availability in distributed databases

What's next

  • Deploy to cloud (AWS/Azure multi-region)
  • Add more ML models for advanced threat detection
  • Integrate with existing security tools
  • Real-time alerting system
  • Mobile app companion
