A sophisticated AI-powered system for detecting spam messages, financial fraud, and phishing attempts in real time.
Designed to combat modern scams, including Jamtara-style financial fraud schemes.
PhishShield is a comprehensive fraud detection system that combines machine learning with rule-based detection to identify harmful messages with high accuracy. The system specifically targets modern fraud schemes like those seen in Jamtara-style scams, providing real-time protection against:
- Financial Fraud (banking, UPI, loan scams)
- Phishing Attempts (credential theft, fake offers)
- Traditional Spam (unwanted marketing, malicious content)
- SMS Fraud (OTP theft, fake alerts)
- Neural Network: 97% accuracy on spam detection
- Rule-Based Engine: 30+ financial fraud patterns
- Pattern Recognition: URLs, money amounts, fraud keywords
- Real-Time Analysis: Instant fraud scoring
| Feature | Description | Coverage |
|---|---|---|
| Financial Keywords | Banking, UPI, card-related terms | 25+ patterns |
| Phishing Indicators | Urgency, fake offers, social engineering | 20+ patterns |
| URL Detection | Malicious domains, shortened links | 7+ patterns |
| Time Pressure | Urgency manipulation tactics | Real-time detection |
| Money Mentions | Currency amounts in messages | Multi-currency |
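The indicator categories in the table can be illustrated with a few regular expressions. This is only a minimal sketch: the keyword list, the regexes, and the function name `find_indicators` are assumptions made for illustration, not PhishShield's actual rule set, which is larger and may work differently.

```python
import re

# Hypothetical pattern sets for illustration; the real lists are larger.
KEYWORD_RE = re.compile(
    r"\b(?:debit card|credit card|upi|otp|account|loan|reactivate)\b",
    re.IGNORECASE,
)
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
# Currency symbol or code followed by an amount, e.g. "₹5,00,000" or "Rs. 99".
MONEY_RE = re.compile(
    r"(?:₹|\$|\b(?:rs\.?|inr))\s?\d[\d,]*(?:\.\d+)?",
    re.IGNORECASE,
)
TIME_PRESSURE_RE = re.compile(
    r"\b(?:immediately|urgent(?:ly)?|within \d+\s?(?:hrs?|hours?|minutes?|mins?))\b",
    re.IGNORECASE,
)


def find_indicators(message: str) -> dict:
    """Return the matches found for each indicator category."""
    return {
        "financial_keywords": [m.lower() for m in KEYWORD_RE.findall(message)],
        "urls": URL_RE.findall(message),
        "money_mentions": MONEY_RE.findall(message),
        "time_pressure": [m.lower() for m in TIME_PRESSURE_RE.findall(message)],
    }


if __name__ == "__main__":
    print(find_indicators("Pre-approved loan of ₹5,00,000. Apply: http://scam-site.com"))
```

Word boundaries (`\b`) keep short keywords such as `otp` from matching inside unrelated words, which matters when the rule hits feed a risk score.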
- Web-Based Dashboard: Streamlit-powered interface
- Binary Classification: Simple SPAM vs LEGITIMATE results
- Detailed Analysis: Risk breakdown and explanations
- Example Library: Pre-loaded test cases
- Safety Guidelines: Built-in fraud protection tips
- Python 3.8+ (Recommended: Python 3.9+; the pinned dependencies, such as pandas 2.0 and NumPy 1.24, do not support Python 3.7)
- pip package manager
- Git (for cloning the repository)
# Clone the repository
git clone https://github.com/divinixx/PhishShield.git
cd PhishShield
# Run automated setup (Windows)
run_phishshield.bat
# Or use Python setup script
python setup_and_run.py

# 1. Clone the repository
git clone https://github.com/divinixx/PhishShield.git
cd PhishShield
# 2. Install dependencies
pip install -r requirements.txt
# 3. Train the model (if not already trained)
python spam.py
# 4. Launch the application
streamlit run app.py

streamlit>=1.25.0
torch>=2.0.1
scikit-learn>=1.3.0
nltk>=3.8.1
pandas>=2.0.3
numpy>=1.24.3
joblib>=1.3.2

- Start the application:
  streamlit run app.py
- Open your browser to http://localhost:8501
- Analyze messages:
  - Enter any message in the text area
  - Click "Analyze Message"
  - View the binary classification result
  - Check the detailed fraud analysis
# Test fraud detection with sample messages
python test_fraud_detection.py
# Verify system components
python -c "from app import load_model_components; print('All components loaded successfully!')"

Input: "Dear customer, your debit card will be blocked within 2 hrs. Call 9876543210 to reactivate."
Output:
SPAM DETECTED (90% confidence)
Fraud Risk Score: 100%
Financial Keywords: debit card, reactivate
Suspicious Content: Phone number detected
Time Pressure: within 2 hrs

Input: "Hi! Are we still meeting for lunch tomorrow at 12pm?"
Output:
LEGITIMATE MESSAGE (95% confidence)
Fraud Risk Score: 0%
No suspicious patterns detected
Financial Fraud Examples
| Message Type | Example | Detection |
|---|---|---|
| Banking Fraud | "Your account is blocked. Call 9876543210 immediately!" | SPAM |
| Loan Scam | "Pre-approved loan of ₹5,00,000. Apply: http://scam-site.com" | SPAM |
| Prize Scam | "You won iPhone! Pay ₹99 shipping: http://fake-apple.com" | SPAM |
| OTP Theft | "Share OTP 123456 to verify your account immediately" | SPAM |

Legitimate Message Examples
| Message Type | Example | Detection |
|---|---|---|
| Personal | "Hi! Are we meeting for lunch tomorrow?" | LEGITIMATE |
| Business | "Team meeting at 3 PM in conference room B" | LEGITIMATE |
| Appointments | "Doctor visit reminder: Friday at 3pm" | LEGITIMATE |
| Delivery | "Package will be delivered between 10 AM - 2 PM" | LEGITIMATE |
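Labeled examples like the ones in these tables lend themselves to a small regression check. The sketch below is an assumption about how such a check could look; `find_failures` and `toy_classify` are hypothetical names, and `toy_classify` is a stand-in used only to make the example runnable, not PhishShield's model or the contents of `test_fraud_detection.py`.

```python
from typing import Callable, List, Tuple

# Labeled examples copied from the tables above.
CASES: List[Tuple[str, str]] = [
    ("Your account is blocked. Call 9876543210 immediately!", "SPAM"),
    ("Share OTP 123456 to verify your account immediately", "SPAM"),
    ("Hi! Are we meeting for lunch tomorrow?", "LEGITIMATE"),
    ("Team meeting at 3 PM in conference room B", "LEGITIMATE"),
]


def find_failures(
    classify: Callable[[str], str], cases: List[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (message, expected, actual) for every misclassified case."""
    failures = []
    for message, expected in cases:
        actual = classify(message)
        if actual != expected:
            failures.append((message, expected, actual))
    return failures


def toy_classify(message: str) -> str:
    """Stand-in classifier for demonstration only."""
    suspicious = ("blocked", "otp", "immediately")
    lowered = message.lower()
    return "SPAM" if any(word in lowered for word in suspicious) else "LEGITIMATE"


if __name__ == "__main__":
    for message, expected, actual in find_failures(toy_classify, CASES):
        print(f"expected {expected}, got {actual}: {message}")
```

Any callable that maps a message to `"SPAM"` or `"LEGITIMATE"` can be passed in place of `toy_classify`, so the same cases can be reused as the detector changes.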
                Input Message
                      ↓
+-------------------+   +--------------------+
| Rule-Based        |   | ML Classification  |
| Fraud Analysis    |   | (Neural Network)   |
|                   |   |                    |
| • Financial       |   | • TF-IDF           |
| • Phishing        |   | • 5000 features    |
| • URLs            |   | • 128→64→2         |
| • Keywords        |   | • PyTorch          |
+-------------------+   +--------------------+
          ↓                       ↓
+--------------------------------------------+
|             Confidence Fusion              |
|          (Hybrid Decision Engine)          |
+--------------------------------------------+
                      ↓
   Final Classification: SPAM / LEGITIMATE
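The "Confidence Fusion" step in the diagram combines the rule-based result with the neural network's output. How PhishShield weighs the two is not documented here, so the following is only one plausible sketch: a weighted average of two scores in [0, 1] compared against a threshold. The function name `fuse`, the 0.6 weight, and the 0.5 threshold are all assumptions.

```python
from typing import Tuple


def fuse(
    ml_spam_prob: float,
    rule_risk: float,
    ml_weight: float = 0.6,   # assumed weight, not taken from the project
    threshold: float = 0.5,   # assumed decision threshold
) -> Tuple[str, float]:
    """Blend the ML spam probability with the rule-based risk score.

    Both inputs are expected in [0, 1]. Returns the label and the fused score.
    """
    for name, value in (("ml_spam_prob", ml_spam_prob), ("rule_risk", rule_risk)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    score = ml_weight * ml_spam_prob + (1.0 - ml_weight) * rule_risk
    label = "SPAM" if score >= threshold else "LEGITIMATE"
    return label, score


if __name__ == "__main__":
    print(fuse(0.9, 1.0))   # strong agreement between both detectors
    print(fuse(0.05, 0.0))  # neither detector sees anything suspicious
```

With these assumed defaults, a message the network scores low (0.2) can still be flagged when every rule fires (1.0), which is the kind of behavior a hybrid design is usually meant to provide.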
- Backend: Python 3.9+
- ML Framework: PyTorch 2.0+
- Feature Engineering: Scikit-learn, NLTK
- Web Interface: Streamlit
- Data Processing: Pandas, NumPy
- Model Persistence: Joblib
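For orientation, a network with the 5000-feature input and 128→64→2 layer sizes described for the ML classifier might be defined in PyTorch roughly as follows. The class name `SpamClassifier`, the ReLU activations, and the absence of dropout are assumptions; the actual definition in `spam.py` may differ.

```python
import torch
from torch import nn


class SpamClassifier(nn.Module):
    """Feed-forward classifier over TF-IDF vectors (illustrative sketch)."""

    def __init__(self, n_features: int = 5000, n_classes: int = 2) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.ReLU(),  # activation choice is an assumption
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),  # raw logits, one per class
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


if __name__ == "__main__":
    model = SpamClassifier()
    model.eval()
    with torch.no_grad():
        logits = model(torch.zeros(1, 5000))   # one all-zero TF-IDF vector
        probs = torch.softmax(logits, dim=1)   # class probabilities
    print(probs.shape)
```

Returning logits and applying `softmax` only at inference time is the usual pairing with `nn.CrossEntropyLoss` during training.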
PhishShield/
├── app.py                    # Main Streamlit application
├── spam.py                   # Model training script
├── spam.csv                  # Training dataset
├── requirements.txt          # Python dependencies
├── run_phishshield.bat       # Windows launcher
├── setup_and_run.py          # Automated setup script
├── test_fraud_detection.py   # Testing utilities
├── README.md                 # Project documentation
└── Model Files/
    ├── spam_classifier.pth   # Trained neural network
    ├── tfidf_vectorizer.pkl  # TF-IDF vectorizer
    └── label_encoder.pkl     # Label encoder
| Metric | Score | Description |
|---|---|---|
| Overall Accuracy | 97% | General spam detection accuracy |
| Financial Fraud Detection | 95%+ | Specialized fraud pattern detection |
| False Positive Rate | <3% | Legitimate messages marked as spam |
| Processing Speed | <100ms | Average analysis time per message |
| Model Size | ~2.5MB | Compact for deployment |
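The model size in the table can be sanity-checked from the layer sizes given for the network (5000 inputs, then 128→64→2). The calculation below assumes plain fully connected layers with biases stored as 32-bit floats, which is an assumption about the implementation; it counts only the network weights, not the saved TF-IDF vectorizer.

```python
LAYER_SIZES = [5000, 128, 64, 2]  # TF-IDF features -> hidden -> hidden -> classes
BYTES_PER_PARAM = 4               # assumption: float32 parameters


def count_parameters(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


params = count_parameters(LAYER_SIZES)
size_mb = params * BYTES_PER_PARAM / 1_000_000
print(f"{params:,} parameters, about {size_mb:.2f} MB")
# prints "648,514 parameters, about 2.59 MB"
```

Almost all of the parameters sit in the first layer (5000 × 128), so the vocabulary size chosen for TF-IDF dominates the size of the saved network.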
- Multi-layer Detection: ML + Rule-based validation
- Real-time Scoring: Instant risk assessment
- Pattern Recognition: Advanced fraud indicators
- Educational Alerts: Built-in safety guidelines
- Never share: PIN, OTP, or passwords via SMS
- Always verify: Requests through official channels
- Report fraud: Suspicious messages to authorities
- Stay informed: Keep updated on latest scam tactics
- Mobile App: React Native mobile application
- Multi-language: Hindi, Tamil, Telugu support
- Voice Analysis: Audio message fraud detection
- Call Integration: Real-time call analysis
- Advanced AI: Transformer-based models
- Analytics Dashboard: Fraud trend analysis
- API Integration: RESTful API for third-party apps
- Team Management: Multi-user support
- Reporting: Advanced analytics and insights
- Enterprise Security: Enhanced data protection
- High Performance: Scalable cloud deployment
This is a personal project created by divinixx for educational and research purposes.
Star this repository if PhishShield helped protect you from fraud!
Protecting users from fraud, one message at a time.