
amanniranjan1/Telecom_Churn_Prediction_Analysis


📊 Telecom Customer Churn Prediction & Analysis

🔍 Project Overview

Customer churn is the loss of customers who stop using a company’s services. This project builds an end-to-end churn prediction system with Machine Learning (Python) and Power BI to help businesses identify high-risk customers and reduce churn.

The project demonstrates the complete workflow of a Data Analyst / Junior Data Scientist, from data preprocessing and model building to business-focused visualization.

🎯 Project Objectives

Predict whether a customer is likely to churn

Perform data preprocessing and feature engineering

Handle class imbalance using SMOTE

Train and compare multiple machine learning models

Evaluate models using business-relevant metrics

Visualize churn insights and high-risk customers using Power BI

🗂️ Dataset

Telco Customer Churn Dataset

Each row represents a customer

Target variable: Churn (Yes / No)

Features include:

Customer demographics (gender, senior citizen, dependents)

Services used (internet service, streaming, security)

Account information (tenure, contract type, charges)

🛠️ Tools & Technologies

Python

Pandas & NumPy

Scikit-learn

Matplotlib

Imbalanced-learn (SMOTE)

Power BI

Google Colab

GitHub

🔄 Project Workflow

1️⃣ Data Loading & Understanding

Load dataset

Inspect data structure and target variable

Identify categorical and numerical features
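The loading and inspection step could look roughly like the sketch below. It assumes pandas and uses a small hand-made sample in place of the real CSV. The column names (`tenure`, `Contract`, `MonthlyCharges`, and so on) follow the public Telco Customer Churn dataset but are assumptions here, since this README does not list them.

```python
import pandas as pd

# Tiny stand-in for the real dataset (assumption). In the notebook this
# would be a pd.read_csv(...) call on the Telco churn CSV.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Male", "Female"],
    "SeniorCitizen": [0, 0, 1, 0],
    "tenure": [1, 34, 2, 45],
    "Contract": ["Month-to-month", "One year", "Month-to-month", "Two year"],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "Churn": ["No", "No", "Yes", "No"],
})

# Inspect the structure and the target variable.
print(df.shape)
print(df.dtypes)
print(df["Churn"].value_counts())

# Split feature columns by type, leaving the target out.
# Note: a 0/1 flag such as SeniorCitizen is stored as a number even
# though it is conceptually categorical.
features = df.drop(columns=["Churn"])
numerical_cols = features.select_dtypes(include="number").columns.tolist()
categorical_cols = features.select_dtypes(exclude="number").columns.tolist()

print(categorical_cols)
print(numerical_cols)
```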

2️⃣ Exploratory Data Analysis (EDA)

Analyze churn distribution

Study churn behavior by contract, tenure, and payment method
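As an illustration of these EDA steps, the sketch below computes an overall churn rate and churn rates by contract type and tenure band. The six rows are made-up values for demonstration only, not results from the dataset. The column names and the tenure band edges are assumptions.

```python
import pandas as pd

# Made-up sample rows (assumption), not real project data.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "One year"],
    "tenure": [1, 5, 24, 60, 3, 40],
    "Churn": ["Yes", "Yes", "No", "No", "No", "No"],
})

# A 0/1 flag makes every group mean directly readable as a churn rate.
df["churned"] = (df["Churn"] == "Yes").astype(int)

overall_rate = df["churned"].mean()
rate_by_contract = df.groupby("Contract")["churned"].mean()

# Bucket tenure (in months) into coarse bands; the edges are illustrative.
df["tenure_band"] = pd.cut(df["tenure"], bins=[0, 12, 36, 72],
                           labels=["0-12", "13-36", "37-72"])
rate_by_tenure = df.groupby("tenure_band", observed=True)["churned"].mean()

print(overall_rate)
print(rate_by_contract)
print(rate_by_tenure)
```

The same grouping applies to payment method by swapping the column passed to `groupby`.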

3️⃣ Data Preprocessing

Handle missing values

Encode categorical variables

Scale numerical features

Split data into train and test sets

Handle class imbalance using SMOTE
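A minimal sketch of the preprocessing steps, on synthetic data with assumed column names. Two choices here are assumptions rather than the notebook's confirmed method. First, the split happens before scaling and encoding so that the transformers are fitted on training data only. Second, `smote_like_oversample` is a hypothetical, simplified stand-in that shows the idea behind SMOTE (interpolating between minority-class neighbours). The project itself uses imbalanced-learn, where the equivalent call is `SMOTE(...).fit_resample(X, y)` from `imblearn.over_sampling`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (assumption): about 25% churners.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=n),
    "MonthlyCharges": rng.uniform(20, 110, size=n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], size=n),
})
y = (rng.random(n) < 0.25).astype(int)

# Stratified split keeps the churn ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.2, stratify=y, random_state=42
)

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), ["tenure", "MonthlyCharges"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
    ],
    sparse_threshold=0,  # always return a dense array
)
X_train_enc = preprocess.fit_transform(X_train)  # fit on training data only
X_test_enc = preprocess.transform(X_test)


def smote_like_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE-style oversampling (hypothetical helper).

    Assumes label 1 is the minority class. New points are placed at a
    random position on the segment between a minority sample and one of
    its k nearest minority neighbours, until both classes are equal.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == 1]
    n_new = int((y == 0).sum() - (y == 1).sum())
    # Pairwise distances among minority samples; exclude self-matches.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, neighbours.shape[1], size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.ones(n_new, dtype=y.dtype)])


# Oversample the training set only; the test set keeps its real distribution.
X_bal, y_bal = smote_like_oversample(X_train_enc, y_train)
print(np.bincount(y_train), np.bincount(y_bal))
```

Resampling only the training split keeps synthetic points out of the evaluation data, so test metrics reflect the original class balance.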

4️⃣ Model Training

Logistic Regression

Random Forest Classifier
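Training the two models might be sketched as follows, using scikit-learn on synthetic data from `make_classification`. The hyperparameters (`max_iter`, `n_estimators`, `random_state`) are illustrative assumptions, not the values used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the preprocessed churn data (assumption).
X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           weights=[0.75, 0.25], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

churn_probability = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Column 1 of predict_proba is the probability of the positive (churn) class.
    churn_probability[name] = model.predict_proba(X_test)[:, 1]

for name, proba in churn_probability.items():
    print(name, proba[:5].round(3))
```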

5️⃣ Model Evaluation

Confusion Matrix

Precision, Recall, F1-score

ROC–AUC Score

ROC Curve

Precision–Recall Curve
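To make these metrics concrete, here is a small worked example in plain Python on eight made-up predictions. The helper names are hypothetical. In practice `sklearn.metrics` provides `confusion_matrix`, `precision_score`, `recall_score`, `f1_score`, and `roc_auc_score` for the same quantities.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp), treating 1 as the churn class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tn, fp, fn, tp


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the churn class."""
    _, fp, fn, tp = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Chance that a random churner is scored above a random non-churner
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted churn probabilities (assumption).
y_true = [1, 0, 1, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.3, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 decision threshold

print(confusion_counts(y_true, y_pred))     # (3, 1, 1, 3)
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
print(roc_auc(y_true, scores))              # 0.9375
```

Recall on the churn class is often the metric to watch here, because a missed churner (false negative) is a customer nobody tries to retain.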

6️⃣ Model Selection & Saving

Best model selected based on ROC–AUC

Model serialized as a .pkl file
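Selection and saving could be sketched like this. The data is synthetic, the file name `churn_model.pkl` is hypothetical, and a temporary directory is used so the example leaves nothing behind. `pickle` is assumed as the serializer because the README mentions a .pkl file.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed churn data (assumption).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every candidate by ROC-AUC on the held-out test set.
auc_scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    auc_scores[name] = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

best_name = max(auc_scores, key=auc_scores.get)
best_model = candidates[best_name]
print(best_name, round(auc_scores[best_name], 3))

# Serialize the winner, then load it back to confirm the round trip.
model_path = os.path.join(tempfile.mkdtemp(), "churn_model.pkl")  # hypothetical name
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

A pickled model should be loaded with the same scikit-learn version that produced it, since pickles are not guaranteed to be compatible across versions.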

7️⃣ Power BI Dashboard

Visualize churn rate and churn drivers

Identify high-risk customers using churn probability

Interactive analysis using slicers

📊 Power BI Dashboard Features

Customer Churn Rate (KPI Card)

Churn by Contract Type

Churn by Payment Method

Churn by Tenure

High-Risk Customers Table (Churn Probability ≥ 0.7)

Interactive slicers:

Contract

Internet Service

Payment Method

Gender
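The High-Risk Customers table can be prepared on the Python side before the data reaches Power BI. The sketch below assumes the model's predictions are exported as CSV (a format Power BI can import). The column names, customer IDs, and probabilities are made up for illustration.

```python
import pandas as pd

# Made-up scored customers (assumption); churn_probability would come
# from the selected model's predict_proba output.
scored = pd.DataFrame({
    "customerID": ["C001", "C002", "C003", "C004"],
    "Contract": ["Month-to-month", "Two year", "Month-to-month", "One year"],
    "churn_probability": [0.82, 0.15, 0.70, 0.69],
})

HIGH_RISK_THRESHOLD = 0.7  # matches the dashboard's "≥ 0.7" rule

# Keep customers at or above the threshold, highest risk first.
high_risk = (
    scored[scored["churn_probability"] >= HIGH_RISK_THRESHOLD]
    .sort_values("churn_probability", ascending=False)
    .reset_index(drop=True)
)

# With no path argument, to_csv returns the CSV text; pass a file path
# instead to write a file for Power BI to load.
csv_text = high_risk.to_csv(index=False)
print(csv_text)
```

Exporting the full `scored` table rather than only the filtered rows would let the threshold be adjusted inside Power BI instead.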

✅ Conclusion

The project successfully predicts customer churn by learning patterns from historical telecom customer data. Among the trained models, Random Forest outperformed Logistic Regression, achieving a higher ROC–AUC score and better recall for churned customers. The analysis shows that month-to-month contracts, low tenure, and certain payment methods are strong indicators of churn.

By combining machine learning predictions with Power BI visualizations, the project enables businesses to identify high-risk customers early and take proactive retention actions.

💾 Model File Note

Due to GitHub file size limitations, the trained model file (.pkl) is not included in this repository. The model can be recreated by running the notebook end-to-end.

👤 Author

Aman Niranjan

