Laboratory 1 for the AIC Course

This repository contains all the materials, scripts, and documentation for Laboratory 1 of the AI and Cybersecurity course.

Overview

This lab builds a complete intrusion detection pipeline on a curated subset of the CICIDS2017 dataset using Feed Forward Neural Networks (FFNN) in PyTorch.

The laboratory is structured into six progressive tasks that together cover the intrusion detection pipeline end to end:

Task 1: Data cleaning, stratified splits, outlier inspection, scaling comparison (Standard vs Robust).
Task 2: Shallow FFNN (single hidden layer) with neuron sweep and activation (Linear vs ReLU).
Task 3: Feature bias analysis (Destination Port), port substitution experiment, feature removal impact.
Task 4: Class imbalance mitigation via class-weighted cross-entropy loss.
Task 5: Deep architectures, batch size impact, optimizer comparison (SGD / Momentum / AdamW).
Task 6: Overfitting and regularization (Dropout, BatchNorm, Weight Decay) on deeper models.
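As a rough illustration of the Task 2 setup, a single-hidden-layer network with a switchable activation could look like the sketch below. The class name `ShallowFFNN`, the feature count (16), and the class count (4) are placeholders chosen for the example, not values taken from the notebook.

```python
import torch
from torch import nn


class ShallowFFNN(nn.Module):
    """One hidden layer; the activation is either ReLU or identity (linear)."""

    def __init__(self, n_features: int, n_hidden: int, n_classes: int,
                 use_relu: bool = True):
        super().__init__()
        self.hidden = nn.Linear(n_features, n_hidden)
        # nn.Identity keeps the network purely linear for the "Linear" variant.
        self.act = nn.ReLU() if use_relu else nn.Identity()
        self.out = nn.Linear(n_hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns raw logits; nn.CrossEntropyLoss applies log-softmax itself.
        return self.out(self.act(self.hidden(x)))


torch.manual_seed(42)
# Hypothetical sizes: 16 input features, 4 traffic classes.
model = ShallowFFNN(n_features=16, n_hidden=64, n_classes=4)
logits = model(torch.randn(8, 16))  # batch of 8 random samples
print(logits.shape)
```

A neuron sweep then amounts to building this model in a loop over candidate values of `n_hidden` and comparing validation metrics.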

Repository Structure

Laboratory1/
├── lab/            # Data, notebooks and support material
├── report/         # LaTeX source files for the lab report
├── resources/      # Additional resources (e.g., links, PDFs, images)
└── README.md       # This file

Note

The detailed lab report, including all experimental results and analysis, can be found here.

Lab Objectives & Requirements

Objectives:

Understand preprocessing choices (scaling, outlier retention).
Evaluate architectural depth vs minority class detection.
Quantify bias induced by a single feature (Destination Port).
Mitigate class imbalance using weighted loss.
Compare optimizers and batch sizes for convergence/generalization.
Assess regularization techniques on tabular intrusion data.
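The weighted-loss objective can be sketched as follows. This assumes inverse-frequency ("balanced") weights, which is one common choice and not necessarily the scheme used in the notebook; the toy label array is made up for the example.

```python
import numpy as np
import torch
from torch import nn

# Toy, heavily imbalanced label vector (illustrative only).
y_train = np.array([0] * 90 + [1] * 8 + [2] * 2)

# Inverse-frequency weights: n_samples / (n_classes * count_per_class).
counts = np.bincount(y_train)
weights = len(y_train) / (len(counts) * counts)

# Rare classes get larger weights, so their errors cost more in the loss.
criterion = nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float32)
)

logits = torch.zeros(3, 3)          # uniform predictions over 3 classes
targets = torch.tensor([0, 1, 2])   # one sample per class
loss = criterion(logits, targets)
print(weights, loss.item())
```

With the default `reduction="mean"`, PyTorch divides by the sum of the weights of the targets in the batch, so the loss stays on a scale comparable to the unweighted one.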

Requirements:

Python 3.10+
PyTorch, scikit-learn, numpy, pandas, matplotlib, seaborn
Dataset file: lab/data/dataset_lab_1.csv

Quick Start

Clone:
```
git clone <repo_url>
cd Laboratory1
```
Create environment (example with venv):
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
(If requirements.txt is missing, create it. Minimal list: torch torchvision torchaudio scikit-learn pandas numpy seaborn matplotlib. The jupyter command below additionally needs the notebook package.)

Run notebook:

jupyter notebook lab/notebooks/Lab1_FFNN.ipynb

Results (plots, metrics) are saved under lab/results/images/<task>_plots/.

Data

Place dataset_lab_1.csv in lab/data/.
No automatic download is performed (course-provided subset).
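Once the file is in place, loading it might look like the sketch below. The helper name `load_and_clean` and the cleaning steps (dropping infinite values, missing values, and duplicate rows) are assumptions based on common practice with CICIDS2017-style data; the notebook may clean the data differently.

```python
from pathlib import Path

import numpy as np
import pandas as pd


def load_and_clean(csv_path) -> pd.DataFrame:
    """Read the CSV and drop rows with inf/NaN values and exact duplicates."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"Expected the course dataset at {path}")
    df = pd.read_csv(path)
    # Treat infinite values as missing, then drop every incomplete row.
    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    return df.drop_duplicates().reset_index(drop=True)


DATA_PATH = Path("lab/data/dataset_lab_1.csv")
if DATA_PATH.is_file():  # skip quietly when run outside the repository
    df = load_and_clean(DATA_PATH)
    print(df.shape)
```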

Reproducing Experiments

The random seed is already fixed to 42 in the notebook.
To switch scalers, change the assignment X_train_use = X_train_std to the robust-scaled variant.
To rerun the port bias test, execute the Task 3 cells after the initial training.
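The seed and scaler steps can be sketched roughly as below. The names `X_train_use` and `X_train_std` come from the notebook as described above; `set_seed`, `X_train_rob`, and the synthetic data are hypothetical stand-ins.

```python
import random

import numpy as np
import torch
from sklearn.preprocessing import RobustScaler, StandardScaler


def set_seed(seed: int = 42) -> None:
    """Seed Python, NumPy and PyTorch so that runs are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


set_seed(42)
# Synthetic skewed features standing in for the real training split.
X_train = np.random.lognormal(size=(200, 3))
X_val = np.random.lognormal(size=(50, 3))

# Fit each scaler on the training split only, then reuse it for validation.
std_scaler = StandardScaler().fit(X_train)
rob_scaler = RobustScaler().fit(X_train)

X_train_std = std_scaler.transform(X_train)
X_train_rob = rob_scaler.transform(X_train)  # median/IQR based

# Switching scalers is a one-line change.
X_train_use = X_train_rob  # or X_train_std
X_val_use = rob_scaler.transform(X_val)
```

Fitting on the training split alone avoids leaking validation statistics into preprocessing.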

Results Summary (High-level)

Best shallow model: ReLU with 64 neurons, which gave a balanced macro F1.
A deep 3-layer network [32, 16, 8] trained with AdamW gave a strong trade-off.
Weight decay (1e-4) sufficed; heavy Dropout/BatchNorm harmed minority recall.
The Destination Port feature induced a spurious correlation; removing it reduced PortScan shortcuts.
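For orientation, the deep configuration mentioned above could be assembled along these lines. Only the hidden sizes [32, 16, 8], AdamW, and the weight decay of 1e-4 come from the summary; the helper `build_ffnn`, the learning rate, the input and class counts, and the random batch are assumptions for the sketch.

```python
import torch
from torch import nn


def build_ffnn(n_features: int, hidden_sizes, n_classes: int) -> nn.Sequential:
    """Stack Linear+ReLU blocks, followed by a linear output layer."""
    layers = []
    in_dim = n_features
    for width in hidden_sizes:
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    layers.append(nn.Linear(in_dim, n_classes))
    return nn.Sequential(*layers)


torch.manual_seed(42)
model = build_ffnn(n_features=16, hidden_sizes=[32, 16, 8], n_classes=4)

# lr=1e-3 is an assumed value; weight_decay matches the summary above.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

# One optimisation step on a random batch, just to show the training loop body.
X = torch.randn(64, 16)
y = torch.randint(0, 4, (64,))
optimizer.zero_grad()
loss = criterion(model(X), y)
loss.backward()
optimizer.step()
print(loss.item())
```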

Authors

Andrea Botticella
Elia Innocenti
Simone Romano
