A supervised machine learning project predicting whether a customer will subscribe to a term deposit based on demographic and campaign data.
Developed an ML model to assist marketing teams in identifying customers most likely to respond positively to campaigns.
- Data Exploration
  - Performed EDA on both categorical and numerical variables.
  - Used the resulting insights to guide data cleaning and preprocessing.
  - Visualized the response variable to understand the class imbalance.
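
The exploration steps above could look roughly like the sketch below. The DataFrame is a toy stand-in, and the column names (`age`, `job`, `subscribed`) are assumptions made for illustration, not the project's actual schema.

```python
import pandas as pd

# Toy stand-in for the campaign data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "age": [25, 41, 37, 58, 33, 46, 29, 52],
    "job": ["admin", "technician", "admin", "retired",
            "services", "technician", "student", "retired"],
    "subscribed": ["no", "no", "no", "yes", "no", "no", "no", "yes"],
})

# Summary statistics for numerical columns, frequency counts for a categorical one.
numeric_summary = df.describe()
job_counts = df["job"].value_counts()

# Share of each class in the response variable; a skewed split signals imbalance.
class_share = df["subscribed"].value_counts(normalize=True)
print(class_share)
```

With matplotlib installed, `class_share.plot(kind="bar")` would turn the same series into a bar chart of the class split.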
- Data Preprocessing
  - Cleaned and encoded customers' demographic attributes (age, job, education, marital status, etc.).
  - Handled outliers and cleaned numerical data.
  - Handled missing values and categorical variables using pandas and NumPy.
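
As a minimal sketch of these preprocessing steps, the snippet below imputes missing values, clips outliers, and one-hot encodes a categorical column. The median/mode imputation and the 1.5 × IQR clipping rule are common choices assumed here; the project's actual rules are not stated, and the data is made up.

```python
import numpy as np
import pandas as pd

# Toy data with a missing number, a missing category, and an extreme age (illustrative only).
df = pd.DataFrame({
    "age": [25.0, 41.0, np.nan, 58.0, 33.0, 120.0],
    "marital": ["single", "married", "married", None, "single", "married"],
})

# Fill missing numbers with the median and missing categories with the most frequent value.
df["age"] = df["age"].fillna(df["age"].median())
df["marital"] = df["marital"].fillna(df["marital"].mode()[0])

# Clip outliers to the 1.5 * IQR fences (an assumed rule, not necessarily the project's).
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
df["age"] = df["age"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

# One-hot encode the categorical column.
encoded = pd.get_dummies(df, columns=["marital"])
print(encoded)
```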
- Dealing with Class Imbalance
  - Applied SMOTE (Synthetic Minority Over-sampling Technique) to balance the negative and positive responses in the data.
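
To show the idea behind SMOTE, here is a simplified NumPy sketch that creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. It is not the implementation used in the project (the library is not stated); in practice the `SMOTE` class in imbalanced-learn, used via `fit_resample`, is the usual route. The function name `smote_sketch` and all data are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances among minority samples.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        base = rng.integers(len(X_min))         # pick a random minority sample
        nb = neighbours[base, rng.integers(k)]  # pick one of its k nearest neighbours
        gap = rng.random()                      # position along the connecting segment
        synthetic[i] = X_min[base] + gap * (X_min[nb] - X_min[base])
    return synthetic

# Toy imbalanced data: 20 negatives and 5 positives (illustrative values only).
rng = np.random.default_rng(42)
X_neg = rng.normal(0.0, 1.0, size=(20, 2))
X_pos = rng.normal(3.0, 1.0, size=(5, 2))

# Generate enough synthetic positives to match the number of negatives.
X_new = smote_sketch(X_pos, n_new=len(X_neg) - len(X_pos))
X_pos_balanced = np.vstack([X_pos, X_new])
print(X_pos_balanced.shape)
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies.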
- Modeling
  - Implemented Logistic Regression, Random Forest, and AdaBoost classifiers.
  - Compared the models' evaluation metrics and selected the Random Forest Classifier, which performed best.
  - Tuned hyperparameters using GridSearchCV.
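
The comparison and tuning workflow might be sketched with scikit-learn as follows. The synthetic dataset, the choice of ROC AUC as the scoring metric, and the small parameter grid are all assumptions for illustration; the project's actual search space is not stated.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic stand-in for the preprocessed campaign data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "ada_boost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Mean cross-validated ROC AUC per model, computed on the training split only.
cv_auc = {
    name: cross_val_score(model, X_train, y_train, cv=3, scoring="roc_auc").mean()
    for name, model in models.items()
}
print(cv_auc)

# Small, hypothetical grid for the Random Forest.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="roc_auc"
)
search.fit(X_train, y_train)
best_model = search.best_estimator_
print(search.best_params_)
```

Keeping `X_test` and `y_test` out of both the comparison and the grid search leaves them available for an unbiased final evaluation.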
- Evaluation
  - Achieved:
    - AUC: 0.93
    - Precision: 0.90
    - Recall: 0.89
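
For reference, metrics like these can be computed with scikit-learn as in the sketch below. The labels and probabilities are hand-made for illustration and are unrelated to the results reported above; the 0.5 decision threshold is likewise an assumption.

```python
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Hand-made true labels and predicted probabilities (illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.45, 0.7, 0.9, 0.6]

# Threshold probabilities at 0.5 to get hard class predictions.
y_pred = [int(p >= 0.5) for p in y_prob]

auc = roc_auc_score(y_true, y_prob)          # ranking quality, independent of the threshold
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
print(auc, precision, recall)
```

AUC is computed from the probabilities, while precision and recall depend on the chosen threshold, so moving the threshold trades one against the other.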