This project demonstrates how different machine learning models perform on an imbalanced binary classification problem, and how to track, compare, and manage experiments using MLflow.
Real-world datasets are often imbalanced, where one class heavily dominates the other. Traditional accuracy metrics can be misleading in such cases.
This project focuses on:
- Handling class imbalance
- Comparing multiple ML models
- Evaluating performance using appropriate metrics
- Tracking experiments using MLflow
- Synthetic dataset generated using `make_classification`
- Samples: 1000
- Class distribution: 90% Class 0, 10% Class 1
- Features: 10 (2 informative, 8 redundant)
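A dataset with these properties can be generated in one call; this is a minimal sketch using scikit-learn's `make_classification` (the random seed is an assumption, not the project's exact setting):

```python
from sklearn.datasets import make_classification

# 1000 samples, 10 features (2 informative, 8 redundant), ~90/10 class split
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=2,
    n_redundant=8,
    weights=[0.9, 0.1],
    random_state=42,
)
```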
Models compared:
- Logistic Regression
- Random Forest Classifier
- XGBoost Classifier
- XGBoost + SMOTETomek (Imbalance Handling)
To properly assess imbalanced data, the following metrics were used:
- Accuracy
- Recall (Class 1 – Minority Class)
- Recall (Class 0 – Majority Class)
- Macro F1-Score
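These metrics can all be computed with scikit-learn; the arrays below are placeholder predictions for illustration only:

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Toy example: 8 majority samples, 2 minority samples
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
recall_minority = recall_score(y_true, y_pred, pos_label=1)  # Class 1 recall
recall_majority = recall_score(y_true, y_pred, pos_label=0)  # Class 0 recall
macro_f1 = f1_score(y_true, y_pred, average="macro")  # unweighted mean of per-class F1
```

Note how accuracy stays high here (0.8) even though minority recall is only 0.5 — exactly the gap this metric set is designed to expose.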
| Model | Accuracy | Recall (Class 1) | Recall (Class 0) | Macro F1 |
|---|---|---|---|---|
| Logistic Regression | 0.9167 | 0.50 | 0.963 | 0.7498 |
| Random Forest | 0.9667 | 0.70 | 0.996 | 0.8947 |
| XGBoost | 0.9767 | 0.80 | 0.996 | 0.9299 |
| XGBoost + SMOTETomek | 0.9567 | 0.8333 | 0.970 | 0.8847 |
- Accuracy alone is misleading for imbalanced datasets.
- XGBoost achieved the best overall performance.
- SMOTETomek improved minority class recall, making it suitable when false negatives are costly.
- MLflow makes experiment comparison transparent and reproducible.
MLflow was used to:
- Log parameters, metrics, and models
- Compare multiple runs visually
- Enable reproducibility and model versioning
Tech stack:
- Python
- Scikit-learn
- XGBoost
- Imbalanced-learn
- MLflow
- NumPy
To reproduce the experiments:

```bash
pip install -r requirements.txt   # install dependencies
mlflow ui                         # launch the MLflow tracking UI
python train.py                   # run all experiments
```