| Detail | Link |
|---|---|
| 💼 LinkedIn | linkedin.com/in/nisarg-chasmawala |
| 🐙 GitHub | github.com/nishu2402 |
- 👾 Authors
- 🧠 Project Summary
- 💡 Core Idea
- 📦 Dataset
- ⚙️ Complete Pipeline
- 🔧 Data Preprocessing Pipeline
- 🤖 Machine Learning Models
- 🧬 Deep Learning Models
- 📐 Metrics
- 🏆 Complete Performance
- 🥇 Why XGBoost Wins
- 🗂️ Repository Structure
- 🚀 Installation & Usage
- 🔮 Future Roadmap
- ⚠️ Disclaimer
This project implements an AI-driven anomaly-based Network Intrusion Detection System (NIDS) targeting Distributed Denial-of-Service (DDoS) attacks — classifying network flows as BENIGN or DDoS using both classical machine learning and deep learning architectures.
The complete CIC-DDoS2019 dataset of 225,745 network flow records with 84 predictive features was used with an 80/20 stratified split. Four ML classifiers and two DL architectures were trained and evaluated — producing 6 total models, with XGBoost achieving Accuracy=0.9999 and Precision=1.0000 (zero false positives).
| Metric | Value |
|---|---|
| 🗃️ Total Records | 225,745 network flow records |
| 🎯 Target Variable | Label — BENIGN (0) / DDoS (1) |
| 🔢 Predictive Features | 84 (85 total columns − Label) |
| 🔀 Train / Test Split | 80% / 20% (stratified) |
| 🤖 Total Models | 6 (4 ML + 2 DL) |
| 🏆 Best Accuracy | 0.9999 (XGBoost & Random Forest) |
| 🎯 Best Precision | 1.0000 (XGBoost — zero false positives) |
| 📈 Best ROC-AUC | 1.0000 (XGBoost & Random Forest) |
| ✅ Missing Values | Zero after preprocessing |
| ℹ️ Dataset Source | Kaggle — CIC-DDoS2019 |
DDoS attacks flood servers with traffic from botnets, making services unavailable to legitimate users. Traditional signature-based detection systems are:
- ⏱️ Reactive — only detect known attack signatures discovered after the fact
- 🔄 High maintenance — require constant signature database updates
- ❌ Zero-day blind — cannot detect novel or mutated attack patterns
- 📐 Dimensionally limited — struggle with 84-feature high-dimensional flow data
Our solution: Train ML and DL classifiers on 225,745 CIC-DDoS2019 network flow records to distinguish DDoS from benign traffic via learned anomaly patterns — achieving Accuracy=0.9999, Precision=1.0000, and ROC-AUC=1.0000 with XGBoost.
```text
Network Flow (84 features) ──▶ Preprocessing ──▶ ML / DL Classifier ──▶ BENIGN / DDoS
  Packet lengths               Remove inf/NaN     XGBoost ★             Accuracy=0.9999
  Flow duration                Drop ID columns    Random Forest         Precision=1.0000
  Flag counts                  Label encode       MLP / 1D-CNN          ROC-AUC=1.0000
  IAT statistics               StandardScaler     SVM / LR              Zero false positives
  Byte rates
```
| Property | Details |
|---|---|
| Name | CIC-DDoS2019 |
| File Used | Friday-WorkingHours-Afternoon-DDos.pcap_ISCX.csv |
| Source | Kaggle — CIC-DDoS2019 |
| Total Records | 225,745 network flow records |
| Total Columns | 85 (84 predictive features + 1 label) |
| Target | Label — BENIGN (→0) / DDoS (→1) |
| Missing Values | None after preprocessing |
| Train / Test Split | 80% / 20% stratified · seed=42 |
| Class | Label | Count | Notes |
|---|---|---|---|
| DDoS | 1 | Majority | Volumetric attack flows from CIC botnet simulation |
| BENIGN | 0 | Minority | Normal working-hours traffic baseline |
| Category | Example Features |
|---|---|
| Packet Length Stats | Pkt Len Min/Max/Mean/Std, Pkt Size Avg |
| Flow Duration | Flow Duration, Flow IAT Mean/Std/Max/Min |
| Byte Rates | Flow Bytes/s, Flow Pkts/s, Fwd/Bwd Pkts/s |
| TCP Flag Counts | FIN/SYN/RST/PSH/ACK/URG Flag Cnt |
| Header Features | Fwd Header Len, Bwd Header Len |
| Bulk/Segment Stats | Avg Fwd Segment Size, Subflow Fwd Bytes |
| Dropped ID Columns | Flow ID, Src IP, Dst IP, Timestamp — removed pre-training |
```text
CIC-DDoS2019 CSV (225,745 records × 85 columns)
              │
              ▼
┌─────────────────────────────┐
│  Data Preprocessing         │ → remove inf · handle NaN · drop duplicates
│  Drop ID columns            │ → Flow ID · Src IP · Dst IP · Timestamp removed
│  Label Encoding             │ → BENIGN→0, DDoS→1
│  80 / 20 Stratified Split   │ → 180,596 train · 45,149 test · stratified label ratio
│  Feature Normalisation      │ → StandardScaler (fitted on train only)
└─────────────┬───────────────┘
              │
        ┌─────┴──────┐
        ▼            ▼
┌────────────┐  ┌────────────┐
│ 4 ML Models│  │ 2 DL Models│
│ LR · SVM   │  │ MLP        │
│ RF · XGB ★ │  │ 1D-CNN ★   │
└─────┬──────┘  └──────┬─────┘
      │                │
      └───────┬────────┘
              ▼
Evaluation: Accuracy · Precision · Recall · F1 · ROC-AUC
            Confusion matrices + ROC curves per model
            All models persisted via joblib (.pkl) / Keras (.h5)
```
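As a concrete sketch, the preprocessing stage of the pipeline above might look like this with pandas and scikit-learn. The toy DataFrame and its column values are illustrative only, not the real CIC-DDoS2019 schema:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy stand-in for the CIC-DDoS2019 frame (values illustrative).
df = pd.DataFrame({
    "Flow ID":      ["a", "b", "c", "d", "a", "f", "g", "h"],
    "Flow Bytes/s": [1.0, np.inf, 3.0, 4.0, 1.0, 6.0, 7.0, 8.0],
    "Flow Duration": [10, 0, 30, 40, 10, 60, 70, 80],
    "Label": ["BENIGN", "DDoS", "DDoS", "BENIGN",
              "BENIGN", "DDoS", "DDoS", "BENIGN"],
})

# Replace ±inf with NaN, then drop incomplete rows.
df = df.replace([np.inf, -np.inf], np.nan).dropna()
# Remove exact duplicate flow records.
df = df.drop_duplicates()
# Drop identifier columns that carry no predictive signal.
df = df.drop(columns=["Flow ID"])
# Encode the binary target.
df["Label"] = df["Label"].map({"BENIGN": 0, "DDoS": 1})

# Stratified 80/20 split, then scale using train statistics only.
X, y = df.drop(columns=["Label"]), df["Label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
print(X_train_s.shape, X_test_s.shape)
```

The same step order matters in practice: duplicates are dropped before the split so identical flows cannot land on both sides of the train/test boundary.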
Seven sequential preprocessing steps applied before any model training:
**Step 1: Replace infinities.** Replace all `np.inf` / `-np.inf` entries with `NaN` before imputation. CIC-DDoS2019 contains division-by-zero artifacts in rate-based features (e.g., `Flow Bytes/s` when `Flow Duration = 0`).

**Step 2: Handle missing values.** Drop or impute the resulting `NaN` entries. After infinite-value removal and imputation, the dataset is 100% complete.

**Step 3: Remove duplicates.** Exact duplicate flow records are removed to prevent data leakage and overfitting on repeated entries.

**Step 4: Drop ID columns.** Four columns are removed; they carry no predictive signal and would cause data leakage:

```python
DROP_COLS = ["Flow ID", " Source IP", " Destination IP", " Timestamp"]
```

**Step 5: Label encoding.** Binary target transformation:

```python
label_map = {"BENIGN": 0, "DDoS": 1}
df["Label"] = df["Label"].map(label_map)
```

**Step 6: Stratified split.** 80/20 split with `stratify=y` and `random_state=42`, preserving the BENIGN/DDoS class ratio in both partitions.

**Step 7: Feature scaling.** `StandardScaler` is fitted exclusively on the training data and then applied to both train and test sets, preventing test-set statistics from leaking into the scaling parameters:

```python
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit + transform
X_test_scaled = scaler.transform(X_test)        # transform only
```

Linear binary classifier computing P(DDoS) = σ(Σ βᵢxᵢ). Each coefficient βᵢ reflects the marginal DDoS signal of a single network flow feature. Fully interpretable: the highest-weight features directly reveal which traffic patterns predict an attack.
- **Algorithm:** L2-regularised maximum likelihood · `solver='lbfgs'` · `max_iter=1000`
- **Strength:** Interpretable · probabilistic output · fast inference
- **Limitation:** Linear decision boundary; cannot capture complex feature interactions (e.g., `SYN_flag_high AND short_flow_duration AND high_pkt_rate`)
- **Result:** Accuracy=0.9989 · F1=0.9990 · ROC-AUC=0.9998
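A minimal scikit-learn sketch of this classifier, using the hyperparameters listed above. The synthetic data stands in for the scaled 84-feature flow matrix:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the scaled 84-feature flow matrix.
X, y = make_classification(n_samples=2000, n_features=84,
                           n_informative=10, random_state=42)

clf = LogisticRegression(solver="lbfgs", max_iter=1000)  # L2 penalty by default
clf.fit(X, y)

proba = clf.predict_proba(X[:1])                   # [P(BENIGN), P(DDoS)]
top = np.argsort(np.abs(clf.coef_[0]))[::-1][:5]   # most influential features
print(proba.shape, top)
```

Because each feature has exactly one coefficient, ranking `|coef_|` gives the interpretability the bullet above describes: the top-weighted features are the traffic statistics most predictive of attack.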
Finds the maximum-margin hyperplane separating benign and DDoS flows in the 84-dimensional scaled feature space. The margin maximisation inherently regularises the classifier — robust against sparse/noisy features.
- **Algorithm:** `LinearSVC` · `C=1.0` · `max_iter=5000`
- **Strength:** Strong theoretical generalisation guarantees · effective in high-dimensional spaces
- **Limitation:** No probabilistic output (requires Platt scaling for ROC-AUC) · computationally expensive on 180k training records
- **Result:** Accuracy=0.9992 · F1=0.9993 · ROC-AUC=0.9999
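The Platt-scaling step mentioned above can be layered on with scikit-learn's `CalibratedClassifierCV`, which wraps the SVM's margin scores into calibrated probabilities so ROC-AUC can be computed. A sketch on synthetic stand-in data:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic stand-in for the scaled flow features.
X, y = make_classification(n_samples=1000, n_features=84, random_state=42)

svm = LinearSVC(C=1.0, max_iter=5000)
# Sigmoid (Platt-style) calibration turns margin scores into probabilities.
cal = CalibratedClassifierCV(svm, cv=3)
cal.fit(X, y)

p = cal.predict_proba(X[:5])  # calibrated [P(BENIGN), P(DDoS)] per row
print(p.shape)
```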
Ensemble of 100 Decision Trees, each trained on a bootstrap sample with random feature subsets at each split. Aggregates by majority vote. Captures complex multi-way interactions between flow rate, packet length, and TCP flag features.
- **Algorithm:** Bagging · `n_estimators=100` · `max_features='sqrt'` · `random_state=42`
- **Strength:** Near-XGBoost performance · built-in feature importance · parallel training
- Result: Accuracy=0.9999 · Precision=0.9999 · F1=0.9999 · ROC-AUC=1.0000 ⭐
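A sketch of the forest with the hyperparameters listed above, including the built-in feature-importance ranking (synthetic data stands in for the real flow matrix):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the 84-feature flow matrix.
X, y = make_classification(n_samples=1000, n_features=84, random_state=42)

rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=42, n_jobs=-1)  # parallel training
rf.fit(X, y)

# Mean-decrease-in-impurity importances, one per feature, summing to 1.
importances = rf.feature_importances_
print(importances.shape, rf.score(X, y))
```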
Gradient boosted trees with L1+L2 regularisation and column subsampling. Each tree corrects the residual classification error of the current ensemble via second-order gradient descent. Handles tabular flow data natively.
- **Update Rule:** `Fₘ(x) = Fₘ₋₁(x) + η · hₘ(x)` where `hₘ` minimises the regularised second-order objective
- **Hyperparameters:** `n_estimators=100` · `learning_rate=0.1` · `max_depth=6` · `subsample=0.8`
- **Strength:** Precision=1.0000, i.e. zero false positives. Best overall model.
- Result: Accuracy=0.9999 · Precision=1.0000 · F1=0.9999 · ROC-AUC=1.0000 ⭐
Both DL models reshape the 84-feature input for sequence/spatial processing. Trained on the full 180,596-record training set with early stopping.
Fully connected feed-forward neural network with three hidden layers. Learns complex non-linear mappings from 84 network flow features to DDoS probability.
```text
Input (84) → Dense(128, ReLU) → Dropout(0.3)
           → Dense(64, ReLU)  → Dropout(0.2)
           → Dense(32, ReLU)
           → Dense(1, Sigmoid)   ← binary classification output
```
- **Optimiser:** Adam · `lr=0.001` · Loss: binary cross-entropy
- **Training:** `batch_size=512` · `max_epochs=50` · `EarlyStopping(patience=5)`
- **Result:** Accuracy=0.9993 · Precision=0.9996 · Recall=0.9991 · F1=0.9994 · ROC-AUC=0.9999
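The architecture above translates to Keras roughly as follows (a sketch assuming TensorFlow 2.x; layer widths, dropout rates, and optimiser settings mirror the bullets, but the actual notebook may differ in minor details):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Feed-forward MLP: 84 → 128 → 64 → 32 → 1.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(84,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # P(DDoS)
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])

# Early stopping as described: halt after 5 epochs without improvement.
early = tf.keras.callbacks.EarlyStopping(patience=5,
                                         restore_best_weights=True)
print(model.count_params())
```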
Applies 1D convolutional filters across the 84 features treated as a sequence — extracting local feature correlation patterns (e.g., co-occurring high SYN + low ACK + short duration = DDoS signature).
```text
Input (84, 1) → Conv1D(64, kernel=3, ReLU) → MaxPool1D(2)
              → Conv1D(32, kernel=3, ReLU) → GlobalMaxPool1D
              → Dense(64, ReLU) → Dropout(0.3)
              → Dense(1, Sigmoid)
```
- **Optimiser:** Adam · `lr=0.001` · Loss: binary cross-entropy
- **Training:** `batch_size=512` · `max_epochs=50` · `EarlyStopping(patience=5)`
- **Result:** Accuracy=0.9994 · Precision=0.9996 · Recall=0.9992 · F1=0.9994 · ROC-AUC=0.9999 ⭐
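A Keras sketch of the 1D-CNN above (again assuming TensorFlow 2.x; the actual notebook may differ in minor details). The key point is the reshape: the 84 scaled features are fed as a length-84 sequence with one channel:

```python
import tensorflow as tf
from tensorflow.keras import layers

# 1D-CNN over the 84 features treated as a sequence.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(84, 1)),               # (features, channels)
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),                 # strongest activation per filter
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),       # P(DDoS)
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
print(model.count_params(), model.output_shape)
```

Note the parameter efficiency relative to the MLP: shared convolutional kernels keep the weight count low while still scanning every 3-feature window for co-occurrence patterns.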
All 6 models evaluated on the identical 45,149-record stratified held-out test set using five complementary metrics:
| Metric | Formula | Interpretation |
|---|---|---|
| Accuracy | `(TP+TN)/(TP+TN+FP+FN)` | Overall correct classification rate. Higher is better. |
| Precision | `TP/(TP+FP)` | Of flows flagged as DDoS, the share that are actually DDoS. Precision=1.0000 means zero false positives: no legitimate traffic incorrectly blocked. Higher is better. |
| Recall | `TP/(TP+FN)` | Of actual DDoS flows, the share correctly detected. High recall means no attacks bypass the system. Higher is better. |
| F1 Score | `2·(P·R)/(P+R)` | Harmonic mean of precision and recall. Primary metric for the imbalanced BENIGN/DDoS distribution. Higher is better. |
| ROC-AUC | Area under ROC curve | Probability the model ranks a random DDoS flow above a random benign flow. AUC=1.0000 means perfect separation. Higher is better. |
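All five metrics are available directly in scikit-learn. A small worked example on toy labels (not project results), where the positive class is DDoS:

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true  = [0, 0, 1, 1, 1, 0, 1, 0]                      # ground truth
y_pred  = [0, 0, 1, 1, 0, 0, 1, 0]                      # hard predictions
y_score = [0.1, 0.2, 0.9, 0.8, 0.4, 0.5, 0.95, 0.05]   # predicted P(DDoS)

# TP=3, FP=0, FN=1, TN=4 for this toy example.
print("Accuracy :", accuracy_score(y_true, y_pred))     # 7/8 = 0.875
print("Precision:", precision_score(y_true, y_pred))    # 3/3 = 1.0
print("Recall   :", recall_score(y_true, y_pred))       # 3/4 = 0.75
print("F1       :", f1_score(y_true, y_pred))
print("ROC-AUC  :", roc_auc_score(y_true, y_score))     # uses scores, not labels
```

Note that ROC-AUC is computed from the continuous scores, while the other four metrics use the thresholded hard predictions.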
All models trained on 180,596 flows (80%) and evaluated on 45,149 stratified held-out test flows (20%).
- ⭐ **XGBoost:** Precision=1.0000, zero false positives across the entire 45,149-flow test set. Recommended for deployment.
- ⭐ **1D-CNN:** Best DL model; outperforms the MLP on every metric, showing that local feature-correlation extraction pays off on network flow sequences.
Tree-Based Dominance: XGBoost and Random Forest achieve the highest overall performance because network flow data is tabular and highly non-linear. Ensemble tree methods draw tight, multi-dimensional decision boundaries around automated botnet behaviour patterns.
Flawless Precision: XGBoost's Precision=1.0000 means zero false positive alerts in testing — critical in production NIDS deployment where alert fatigue is a primary analyst burden. Legitimate traffic is never incorrectly flagged.
Near-Perfect DL: Both MLP and 1D-CNN exceed Accuracy=0.9993. The 1D-CNN's convolutional filters extract local co-occurrence patterns across feature groups (e.g., SYN_flag + short_IAT + high_byte_rate) that the MLP's fully connected layers treat independently.
**Why not Logistic Regression?**
- Accuracy=0.9989, the lowest result overall.
- A linear decision boundary cannot capture interaction effects: a high SYN count alone ≠ DDoS, but SYN + short IAT + high packet rate = DDoS.
- Treats all 84 features independently and so misses combinatorial attack signatures.

**Why not Linear SVM?**
- Strong at Accuracy=0.9992 and F1=0.9993, but still below the tree ensembles.
- The maximum-margin hyperplane is linear; the DDoS decision boundary is non-linear.
- Computationally expensive: 180k training records × 84 features.
- No native probability output for calibrated risk thresholding.

**Why not Random Forest over XGBoost?**
- RF achieves identical Accuracy and F1 to XGBoost, making it extremely competitive.
- But RF Precision=0.9999 vs XGBoost Precision=1.0000.
- That one false positive matters: in a production NIDS, a single blocked legitimate flow can disrupt services. XGBoost's zero-FP result is decisive.
- XGBoost's gradient residual correction is more targeted than RF's bootstrap averaging on high-signal CIC-DDoS2019 features.
**Why XGBoost wins (ML):**
- ✅ Precision = 1.0000: zero false-positive alerts (critical for NIDS)
- ✅ Accuracy = 0.9999: tied best with RF
- ✅ ROC-AUC = 1.0000: perfect class separation
- ✅ F1 = 0.9999: tied best with RF
- ✅ Handles tabular flow data natively; no scaling required for tree splits
- ✅ L1+L2 regularisation prevents overfitting to specific botnet IPs/ports
- ✅ Gradient correction: each tree eliminates residual misclassifications

**Why 1D-CNN wins (DL):**
- ✅ Best DL Accuracy (0.9994 vs MLP 0.9993)
- ✅ Best DL F1 (tied with the MLP at 0.9994, but with better Recall)
- ✅ Local pattern extraction: Conv1D filters detect co-occurrence attack signatures
- ✅ Parameter efficiency: fewer weights than the MLP's fully connected layers
- ✅ Feature locality: the SYN+ACK+PSH flag cluster is captured in one kernel pass
```text
intelligent-ddos-detection-system/
│
├── 📓 DDoS_Attack_Detection_(ML_DL).ipynb   ← Main notebook — full pipeline, EDA, training, evaluation
├── 🐍 ddos_attack_detection_(ml_dl).py      ← Complete standalone Python script
│
└── 💾 saved_models/                         ← All 6 trained models + scaler
    ├── lr_model.pkl                         ← Logistic Regression · Acc=0.9989
    ├── svm_model.pkl                        ← Linear SVM · Acc=0.9992
    ├── rf_model.pkl                         ← Random Forest · Acc=0.9999 · ROC=1.0000
    ├── xgb_model.pkl                        ← XGBoost ★ · Prec=1.0000 · Best ML
    ├── mlp_model.h5                         ← MLP (DNN) · Acc=0.9993
    ├── cnn_model.h5                         ← 1D-CNN ★ · Best DL
    └── scaler.pkl                           ← StandardScaler (fitted on train only)
```
**Prerequisites:** Python 3.8+ and `pip`.

```bash
git clone https://github.com/nishu2402/intelligent-ddos-detection-system.git
cd intelligent-ddos-detection-system
pip install pandas numpy matplotlib seaborn scikit-learn xgboost tensorflow joblib
```

Or install everything at once:

```bash
pip install -r requirements.txt
```

📋 Full `requirements.txt`:

```text
pandas>=1.5.0
numpy>=1.23.0
matplotlib>=3.6.0
seaborn>=0.12.0
scikit-learn>=1.3.0
xgboost>=1.7.0
tensorflow>=2.10.0
joblib>=1.2.0
jupyter>=1.0.0
```
Download `Friday-WorkingHours-Afternoon-DDos.pcap_ISCX.csv` from Kaggle (CIC-DDoS2019) and place it in the project root directory.
Run the notebook:

```bash
jupyter notebook DDoS_Attack_Detection_(ML_DL).ipynb
```

Or run the standalone script:

```bash
python ddos_attack_detection_(ml_dl).py
```

Load the saved models for inference:

```python
import joblib
import numpy as np
import tensorflow as tf

# ── Load scaler + ML models ───────────────────────────────────────
scaler    = joblib.load("saved_models/scaler.pkl")
xgb_model = joblib.load("saved_models/xgb_model.pkl")  # ← Best ML
rf_model  = joblib.load("saved_models/rf_model.pkl")
lr_model  = joblib.load("saved_models/lr_model.pkl")
svm_model = joblib.load("saved_models/svm_model.pkl")

# ── Load DL models ────────────────────────────────────────────────
cnn_model = tf.keras.models.load_model("saved_models/cnn_model.h5")  # ← Best DL
mlp_model = tf.keras.models.load_model("saved_models/mlp_model.h5")

# ── Inference example ─────────────────────────────────────────────
# X_new: shape (n_samples, 84), preprocessed and scaled flow features
X_new_raw = np.random.rand(1, 84)  # replace with real flow features
X_new_scaled = scaler.transform(X_new_raw)

# XGBoost prediction (recommended)
xgb_pred  = xgb_model.predict(X_new_scaled)
xgb_proba = xgb_model.predict_proba(X_new_scaled)
label = "🚨 DDoS DETECTED" if xgb_pred[0] == 1 else "✅ BENIGN"
print(f"XGBoost Classification: {label}")
print(f"DDoS probability: {xgb_proba[0][1]:.4f}")
# Sample output (depends on the input flow):
#   XGBoost Classification: 🚨 DDoS DETECTED
#   DDoS probability: 0.9997

# 1D-CNN prediction
X_cnn = X_new_scaled.reshape(1, 84, 1)
cnn_prob = cnn_model.predict(X_cnn, verbose=0)[0][0]
print(f"1D-CNN DDoS probability: {cnn_prob:.4f}")
```

| Priority | Improvement | Expected Impact |
|---|---|---|
| 🔴 HIGH | Hyperparameter Tuning — GridSearchCV / Optuna on XGBoost + 1D-CNN | Further reduce the marginal miss rate on edge-case DDoS flows |
| 🔴 HIGH | Real-Time Streaming Detection — integrate with Zeek/Suricata via Kafka | Production-grade live network traffic classification at wire speed |
| 🟠 MED | Adversarial Robustness Testing — evaluate against feature-space evasion attacks | Hardens model against adversarial DDoS flows crafted to mimic benign patterns |
| 🟠 MED | Federated Learning — distributed training across network segments | Enables privacy-preserving NIDS without centralising raw traffic captures |
| 🟠 MED | SHAP Explainability — per-flow feature attribution for analyst triage | Surfaces why a flow was classified as DDoS — critical for incident response |
| 🟡 LOW | Edge Deployment — Docker + Kubernetes containerised inference API | Deployable at network edge nodes with low-latency detection requirements |
| 🟡 LOW | Multi-Attack Classification — extend from binary to multi-class (UDP flood / SYN flood / HTTP flood) | Granular attack-type identification for targeted mitigation responses |
| 🟡 LOW | Transformer Architecture — self-attention over 84-feature flow sequences | Captures global feature dependencies missed by local CNN kernels |
This project was developed as a Master's Research Assignment for the module Applied Machine Learning (CMP7239) at Birmingham City University, Academic Year 2025–26, under the supervision of Dr Mohamed Ihmeida.
- This repository is intended solely for academic and educational purposes.
- The DDoS detection models should not be used to perform or facilitate unauthorised network testing, penetration testing, or attack traffic generation.
- All network flow data is sourced from the publicly available CIC-DDoS2019 dataset — a controlled lab environment capture by the Canadian Institute for Cybersecurity.
- The trained model files (`.pkl`, `.h5`) are provided for reproducibility purposes only. Always validate model behaviour against your own network baseline before any production deployment.
⭐ If this project helped you, please give it a star on GitHub!