📊 Credit Mix Classification – End-to-End Machine Learning Project

📌 Project Overview

Credit Mix Classification is an end-to-end machine learning project that predicts a customer’s Credit Mix category — Good, Standard, or Bad — using structured financial and behavioral data. The project covers the complete ML lifecycle, including data cleaning, exploratory data analysis (EDA), feature engineering, outlier handling, model-specific preprocessing pipelines, and comparative evaluation of multiple ensemble models.

🗂️ Project Structure

├── CSV_Files/
│ ├── Raw and cleaned datasets used for modeling
│
├── images/
│ ├── EDA plots, bar plots, confusion matrices, and other figures
│
├── EDA_report.html
│ ├── Automated profiling report (ydata-profiling)
│
├── Predicting Customer Credit Mix - End-to-End Machine Learning Workflow.ipynb
│ ├── Main notebook containing the full ML workflow
│
├── README.md
│ ├── Project documentation

📁 Cleaned Dataset Description

Total records: 50,000
Total features: 30
Target variable: Credit_Mix

Target Classes

Good
Standard
Bad

Feature Types

Numerical: 'Age', 'Annual_Income', 'Monthly_Inhand_Salary', 'Num_Bank_Accounts', 'Num_Credit_Card', 'Interest_Rate', 'Num_of_Loan', 'Debt_Consolidation_Loan', 'Home_Equity_Loan', 'Student_Loan', 'Payday_Loan', 'Personal_Loan', 'Auto_Loan', 'Mortgage_Loan', 'Credit-Builder_Loan', 'Num_of_Loan_Types', 'Delay_from_due_date', 'Num_of_Delayed_Payment', 'Changed_Credit_Limit', 'Num_Credit_Inquiries', 'Outstanding_Debt', 'Credit_Utilization_Ratio', 'Credit_History_Months', 'Total_EMI_per_month', 'Amount_invested_monthly', 'Monthly_Balance'.
Categorical: 'Occupation', 'Payment_of_Min_Amount', 'Payment_Behaviour_lavel', 'Payment_Behaviour_size'.

🔍 Exploratory Data Analysis (EDA)

EDA was performed using both manual visualizations and automated profiling.

📄 EDA Report: EDA_report.html

Key Analysis Steps

Distribution analysis
IQR-based outlier analysis
Feature correlation analysis
Class imbalance inspection
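
The IQR-based outlier step listed above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names `iqr_bounds` and `iqr_outlier_mask`, the conventional multiplier `k=1.5`, and the toy income values are all assumptions made for the example.

```python
import numpy as np


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean array that is True where a value lies outside the fences."""
    values = np.asarray(values, dtype=float)
    lower, upper = iqr_bounds(values, k)
    return (values < lower) | (values > upper)


# Toy values (hypothetical, not taken from the project's dataset).
incomes = np.array([30_000, 32_000, 35_000, 36_000, 38_000, 40_000, 41_000, 2_500_000])
mask = iqr_outlier_mask(incomes)
print(incomes[mask])  # only the extreme value 2_500_000 is flagged
```

Rows flagged this way can then be capped, removed, or left in place and handled with a robust scaler, depending on the feature.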

🧹 Data Preprocessing & Feature Engineering

Missing value handling
Outlier detection and treatment
Robust scaling for extreme numerical outliers
Ordinal encoding for categorical features
Model-specific preprocessing pipelines
Class imbalance handling using class weights
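
One way these steps could fit together in scikit-learn is sketched below. It is an illustration under stated assumptions rather than the notebook's pipeline: the column subset, the toy rows, the encoder's unknown-category handling, and the Random Forest hyperparameters are all chosen for the example only.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, RobustScaler

# A small subset of the dataset's columns, picked for illustration.
numeric_cols = ["Annual_Income", "Outstanding_Debt"]
categorical_cols = ["Occupation", "Payment_of_Min_Amount"]

preprocessor = ColumnTransformer(
    transformers=[
        # RobustScaler centers on the median and scales by the IQR,
        # so extreme incomes do not dominate the scaling.
        ("num", RobustScaler(), numeric_cols),
        # Unseen categories at prediction time are mapped to -1 (an assumption).
        ("cat", OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
         categorical_cols),
    ]
)

model = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        # class_weight="balanced" reweights classes inversely to their frequency.
        ("clf", RandomForestClassifier(n_estimators=50, class_weight="balanced",
                                       random_state=42)),
    ]
)

# Toy rows (hypothetical values, not from the project's CSV files).
toy = pd.DataFrame({
    "Annual_Income": [25_000.0, 48_000.0, 52_000.0, 61_000.0, 75_000.0, 1_200_000.0],
    "Outstanding_Debt": [3_200.0, 1_500.0, 900.0, 2_100.0, 400.0, 150.0],
    "Occupation": ["Teacher", "Engineer", "Doctor", "Teacher", "Engineer", "Doctor"],
    "Payment_of_Min_Amount": ["Yes", "No", "No", "Yes", "No", "No"],
})
labels = pd.Series(["Bad", "Standard", "Good", "Standard", "Good", "Good"], name="Credit_Mix")

model.fit(toy, labels)
preds = model.predict(toy)
transformed = model.named_steps["preprocess"].transform(toy)
print(transformed.shape, preds)
```

Wrapping the preprocessing and the classifier in one `Pipeline` keeps the scaler and encoder fitted on training data only, which avoids leakage when the same object is later used with a train/test split or cross-validation.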

🤖 Machine Learning Models

The following ensemble models were trained and evaluated:

Random Forest
XGBoost
LightGBM

📈 Model Evaluation

Models were evaluated using multiple performance metrics:

Training Accuracy
Test Accuracy
Precision (Weighted)
Recall (Weighted)
F1-Score (Weighted)
ROC-AUC
Confusion Matrix
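
The weighted metrics and the confusion matrix can be computed with scikit-learn as in the sketch below. The label vectors are toy values invented for the example, not the project's predictions, and the class ordering passed to `labels` is an assumption.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Fixed class order so the confusion matrix rows/columns are predictable.
class_order = ["Bad", "Standard", "Good"]

# Toy ground truth and predictions (hypothetical).
y_true = ["Good", "Good", "Standard", "Standard", "Standard", "Bad", "Bad", "Bad"]
y_pred = ["Good", "Standard", "Standard", "Standard", "Bad", "Bad", "Bad", "Bad"]

accuracy = accuracy_score(y_true, y_pred)

# average="weighted" averages per-class scores, weighted by class support.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=class_order, average="weighted", zero_division=0
)

# Rows are true classes, columns are predicted classes, both in class_order.
cm = confusion_matrix(y_true, y_pred, labels=class_order)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
print(cm)
```

Weighted recall always equals plain accuracy, so the two columns agreeing in a results table is expected rather than a sign of a bug.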

📊 Performance Summary (Test Set)

Model	Accuracy	F1-Score	ROC-AUC
Random Forest	0.966933	0.966884	0.997442
XGBoost	0.970933	0.970923	0.998082
LightGBM	0.964933	0.964884	0.997703

🧠 Key Insights

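XGBoost achieved the best test results on all three metrics (accuracy 0.971, F1-score 0.971, ROC-AUC 0.998). Random Forest slightly outperformed LightGBM on accuracy and F1-score, while LightGBM had the marginally higher ROC-AUC.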
RobustScaler significantly improved performance on income-related features.
Class-weight handling effectively addressed class imbalance.
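
For reference, "balanced" class weights in scikit-learn follow the formula `n_samples / (n_classes * class_count)`. The sketch below shows that arithmetic on a toy label vector; the class counts are invented for the example and do not reflect the project's actual class distribution.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced labels (hypothetical counts): 6 Standard, 3 Good, 1 Bad.
y = np.array(["Standard"] * 6 + ["Good"] * 3 + ["Bad"])
classes = np.array(["Bad", "Good", "Standard"])

# Each weight is n_samples / (n_classes * count_of_that_class).
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
class_weights = dict(zip(classes.tolist(), weights.tolist()))

for name, weight in class_weights.items():
    print(f"{name}: {weight:.4f}")
```

The rarest class receives the largest weight, so misclassifying it costs more during training. A dictionary like `class_weights` can be passed to estimators that accept a `class_weight` argument.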

🛠️ Tools & Technologies

Programming Language: Python
Data Processing: Pandas, NumPy
Machine Learning: Scikit-learn, XGBoost, LightGBM
Visualization: Matplotlib, Seaborn
EDA: ydata-profiling

🚀 How to Run the Project

Clone the repository
Install required dependencies
Open the notebook
Run cells sequentially

👤 Author

Shahriar Hussain
Machine Learning & Data Science Practitioner
