This project conducts an Exploratory Data Analysis (EDA) on Netflix customer churn, actively utilizing machine learning for predictive insights to understand what drives customer churn and provide actionable recommendations for retention strategies.
- Source: Kaggle-Netflix Customer Churn dataset
- Size: 5,000 customers
- Features: User demographics, user behavior, and product information
Customer churn is primarily driven by user engagement metrics, not demographics or subscription type.
avg_watch_time_per_day- Explains ~40% of churn variancewatch_hours- Total viewing timelast_login_days- Recency of platform usage
Combined Impact: These three features explain ~86% of the variance in customer churn behavior.
- High Impact: Engagement metrics (watch time, login frequency)
- Low Impact: Demographics (age, gender, region) and subscription type
- Insight: Netflix retention is behavior-driven, not product-tier driven
- Random Forest: ~92% accuracy in predicting customer churn
- Logistic Regression: ~89% accuracy in predicting customer churn with better interpretability
- ROC Analysis: Both models show strong discriminative power (AUC > 0.9)
- Categorical Encoding: One-hot encoding for categorical features
- Feature Scaling: StandardScaler for continuous variables
- Feature Engineering: Age bucketing and correlation analysis
- Stable performance across different data splits
- Minimal overfitting detected
- Strong generalization capability
| Model | Accuracy | AUC | Key Strengths |
|---|---|---|---|
| Random Forest | ~92% | 0.98 | Feature importance insights |
| Logistic Regression | ~89% | 0.96 | Interpretable coefficients |
-
Improve Content Quality
- Invest in high-quality original content
- Acquire popular shows and movies
- Focus on content that drives binge-watching behavior
-
Enhance User Experience
- Optimize user interface for content discovery
- Improve platform navigation and visual appeal
- Reduce friction in content consumption
-
Advanced Recommendation System
- Implement personalized recommendation algorithms
- Use viewing history and preferences for better content matching
- Increase content discovery and engagement
-
Key Performance Indicators (KPIs)
- Monitor
avg_watch_time_per_daytrends - Track
watch_hourspatterns - Alert on
last_login_daysincreases
- Monitor
-
Predictive Dashboard
- Real-time churn risk scoring
- User engagement trend analysis
- Early intervention triggers
-
Content Personalization
- Personalized content recommendations
- Content alerts for favorite genres/actors
- Exclusive previews for at-risk users
-
Re-Engagement & Retention
- Re-engagement campaigns for inactive users
- Special offers for users showing declining engagement
-
Communication & Notifications
- Personalized notifications based on viewing history
-
Engagement Over Everything: User engagement metrics are the strongest predictors of churn, significantly outweighing demographic or subscription factors.
-
Predictable Patterns: Customer churn can be predicted with ~90% accuracy using engagement-based features.
-
Actionable Intelligence: The analysis provides clear, data-driven recommendations focusing on content quality and user engagement rather than pricing strategies.
-
Business Impact: Netflix should prioritize user engagement initiatives over demographic targeting or subscription tier modifications for churn reduction.
This analysis demonstrates that in the streaming industry, content engagement is the ultimate retention driver.