Goal: Learn the workflow FAANG expects after you have a model: diagnose errors, find slices that fail, run feature ablations, and iterate safely.
Outcome: Students can:
- build a baseline churn classifier,
- compute slice metrics and identify failure cohorts,
- perform error analysis (FP vs FN),
- run feature ablation and interpret results,
- tune the decision threshold for business cost.
- Fork this repository.
- Open `debug_student_lab.ipynb` in Google Colab.
- Complete all TODO sections.
- Restart runtime → Run All cells.
- Push changes and submit a Pull Request.
- ✅ Start with a baseline, then debug systematically
- ✅ Slice analysis must be on a held-out set
- ✅ Keep preprocessing leakage-safe (Pipeline/ColumnTransformer)
- ✅ Separate ranking metrics from threshold metrics
Expected path:
data/churn/churn.csv
Common schema (Telco churn style):
- target: `Churn` (Yes/No) or `churn` (0/1)
If the file is missing, the notebook uses a small synthetic churn-like dataset so it still runs.
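
The fallback described above could look roughly like the sketch below. `make_synthetic_churn` and `load_churn` are hypothetical names, and the synthetic columns are only an assumption about what a "churn-like" dataset might contain; the notebook's own generator may differ.

```python
from pathlib import Path

import numpy as np
import pandas as pd

DATA_PATH = Path("data/churn/churn.csv")


def make_synthetic_churn(n_rows: int = 500, seed: int = 0) -> pd.DataFrame:
    """Hypothetical stand-in dataset; not the notebook's actual generator."""
    rng = np.random.default_rng(seed)
    tenure = rng.integers(0, 72, size=n_rows)
    monthly = rng.uniform(20, 120, size=n_rows).round(2)
    contract = rng.choice(["Month-to-month", "One year", "Two year"], size=n_rows)
    # Make churn more likely for short tenure and month-to-month contracts.
    logit = -1.0 - 0.04 * tenure + 0.01 * monthly + 1.2 * (contract == "Month-to-month")
    prob = 1.0 / (1.0 + np.exp(-logit))
    churned = rng.random(n_rows) < prob
    return pd.DataFrame({
        "tenure": tenure,
        "MonthlyCharges": monthly,
        "Contract": contract,
        "Churn": np.where(churned, "Yes", "No"),
    })


def load_churn(path: Path = DATA_PATH) -> pd.DataFrame:
    """Read the CSV if present, otherwise fall back to synthetic data."""
    df = pd.read_csv(path) if path.exists() else make_synthetic_churn()
    # Normalise the Yes/No spelling of the target to a single 0/1 `churn` column.
    if "Churn" in df.columns:
        df["churn"] = (df["Churn"].astype(str).str.strip().str.lower() == "yes").astype(int)
        df = df.drop(columns=["Churn"])
    return df
```

Normalising the target once at load time means every later step can assume a 0/1 `churn` column regardless of which schema variant the file uses.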
- numeric + categorical preprocessing
- LogisticRegression baseline
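
A minimal sketch of such a baseline, assuming scikit-learn and a tiny toy frame in place of `churn.csv`. The column names (`tenure`, `MonthlyCharges`, `Contract`) follow the Telco style but are assumptions here, and the toy churn rule exists only so the example runs.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data (assumption): replace with the real churn DataFrame.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "tenure": rng.integers(0, 72, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
churn_prob = np.where(df["Contract"] == "Month-to-month", 0.5, 0.15)
df["churn"] = (rng.random(n) < churn_prob).astype(int)

X, y = df.drop(columns="churn"), df["churn"]
# Split BEFORE fitting any preprocessing so the holdout stays untouched.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = [c for c in X.columns if c not in numeric_cols]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Scaler statistics and one-hot categories are learned from X_train only.
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"holdout ROC AUC: {auc:.3f}")
```

Because preprocessing lives inside the `Pipeline`, refitting on a different split or a reduced feature set repeats every step consistently, which is what makes later comparisons trustworthy.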
Checkpoint Questions:
- Why is a pipeline required for trustworthy debugging?
Interview Angle:
- Which is worse: FP or FN? (Depends on business cost.)
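
One way to start FP vs FN analysis is to tag every held-out row with its outcome type and then profile each group. The sketch below uses hand-made labels and a hypothetical `label_errors` helper; the `tenure` values are illustrative only.

```python
import numpy as np
import pandas as pd


def label_errors(y_true, y_pred) -> pd.Series:
    """Tag each row as TP, TN, FP, or FN (hypothetical helper)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    kinds = np.where(
        y_true,
        np.where(y_pred, "TP", "FN"),  # actual churners
        np.where(y_pred, "FP", "TN"),  # actual non-churners
    )
    return pd.Series(kinds, name="error_type")


# Illustrative holdout labels and predictions (not real results).
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0]
errors = label_errors(y_true, y_pred)
counts = errors.value_counts().to_dict()

# Join the tags back onto features to see what missed churners look like.
holdout = pd.DataFrame({"tenure": [2, 40, 60, 5, 50, 30]})
holdout["error_type"] = errors.values
tenure_by_error = holdout.groupby("error_type")["tenure"].mean()
print(counts)
print(tenure_by_error)
```

Comparing the FN group against the TP group shows which churners the model misses, which is usually more actionable than the confusion matrix alone.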
Examples:
- `Contract`
- `InternetService`
- `PaymentMethod`
- tenure bucket
FAANG Gotcha:
- “Overall accuracy looks fine” can hide severe cohort failures.
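
The gotcha above can be made concrete with a per-slice metric table. This is a sketch on a hand-made holdout frame; `slice_metrics`, the `y_true`/`y_pred` column names, and the tenure bin edges are assumptions, not the notebook's API.

```python
import pandas as pd


def slice_metrics(df, slice_col, y_true="y_true", y_pred="y_pred"):
    """Per-slice size, churn rate, accuracy, precision, and recall."""
    rows = []
    for value, g in df.groupby(slice_col, observed=True):
        tp = int(((g[y_true] == 1) & (g[y_pred] == 1)).sum())
        fp = int(((g[y_true] == 0) & (g[y_pred] == 1)).sum())
        fn = int(((g[y_true] == 1) & (g[y_pred] == 0)).sum())
        rows.append({
            slice_col: value,
            "n": len(g),
            "churn_rate": g[y_true].mean(),
            "accuracy": (g[y_true] == g[y_pred]).mean(),
            # NaN when the slice has no predicted / actual positives.
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
        })
    # Worst recall first, so failing cohorts surface at the top.
    return pd.DataFrame(rows).sort_values("recall").reset_index(drop=True)


# Illustrative held-out predictions (not real results).
holdout = pd.DataFrame({
    "Contract": ["Month-to-month"] * 4 + ["Two year"] * 4,
    "tenure": [1, 3, 10, 30, 50, 60, 70, 20],
    "y_true": [1, 1, 1, 0, 0, 0, 1, 0],
    "y_pred": [1, 0, 0, 0, 0, 0, 1, 0],
})
holdout["tenure_bucket"] = pd.cut(
    holdout["tenure"],
    bins=[0, 12, 24, 48, 72],
    labels=["0-12", "13-24", "25-48", "49-72"],
    include_lowest=True,
)

overall_accuracy = (holdout["y_true"] == holdout["y_pred"]).mean()
by_contract = slice_metrics(holdout, "Contract")
by_tenure = slice_metrics(holdout, "tenure_bucket")
print(f"overall accuracy: {overall_accuracy:.2f}")
print(by_contract)
print(by_tenure)
```

In this toy frame the overall accuracy is 0.75, yet the month-to-month slice catches only one of its three churners. Always report `n` next to each metric, since small slices produce noisy numbers.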
- remove all service columns
- remove price columns
- remove contract columns
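
A group-wise ablation like the one listed above can be sketched as: drop one column group, refit the whole pipeline, and compare holdout AUC with the full-feature baseline. The group-to-column mapping and the toy data (where churn depends only on `Contract`) are assumptions made so the example runs end to end.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column groups; adjust to the columns in your churn.csv.
FEATURE_GROUPS = {
    "service": ["InternetService", "OnlineSecurity"],
    "price": ["MonthlyCharges", "TotalCharges"],
    "contract": ["Contract"],
}


def holdout_auc(X_train, X_test, y_train, y_test) -> float:
    """Refit the full pipeline on the given columns and score the holdout."""
    num = X_train.select_dtypes(include="number").columns.tolist()
    cat = [c for c in X_train.columns if c not in num]
    prep = ColumnTransformer([
        ("num", StandardScaler(), num),
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat),
    ])
    model = make_pipeline(prep, LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])


# Toy data in which churn depends only on Contract (assumption of this sketch).
rng = np.random.default_rng(0)
n = 2000
X = pd.DataFrame({
    "tenure": rng.integers(1, 72, n),
    "MonthlyCharges": rng.uniform(20, 120, n),
    "InternetService": rng.choice(["DSL", "Fiber optic", "No"], n),
    "OnlineSecurity": rng.choice(["Yes", "No"], n),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
X["TotalCharges"] = X["tenure"] * X["MonthlyCharges"]
y = (rng.random(n) < np.where(X["Contract"] == "Month-to-month", 0.55, 0.10)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

baseline = holdout_auc(X_tr, X_te, y_tr, y_te)
rows = []
for name, cols in FEATURE_GROUPS.items():
    auc = holdout_auc(X_tr.drop(columns=cols), X_te.drop(columns=cols), y_tr, y_te)
    rows.append({"removed": name, "auc": auc, "delta_vs_baseline": auc - baseline})

# Most damaging removal first.
ablation = pd.DataFrame(rows).sort_values("delta_vs_baseline").reset_index(drop=True)
print(f"baseline AUC = {baseline:.3f}")
print(ablation)
```

When interpreting the table, a small delta does not prove a group is useless: correlated features elsewhere may carry the same signal, so a near-zero drop can mean redundancy rather than irrelevance.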
Example: a false negative (missed churner) costs 5x as much as a false positive.
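
With that 5:1 cost ratio, threshold selection can be sketched as a sweep over candidate thresholds that minimises total cost on a validation set. `total_cost` and `best_threshold` are hypothetical helpers, the labels and scores are illustrative, and correct predictions are assumed to cost nothing.

```python
import numpy as np


def total_cost(y_true, proba, threshold, cost_fp=1.0, cost_fn=5.0):
    """Total misclassification cost at one threshold (correct predictions cost 0)."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(proba) >= threshold).astype(int)
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return float(cost_fp * fp + cost_fn * fn)


def best_threshold(y_true, proba, thresholds=None, **costs):
    """Return (threshold, cost) for the cheapest threshold on the grid."""
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)  # 0.05, 0.10, ..., 0.95
    costs_per_t = [total_cost(y_true, proba, t, **costs) for t in thresholds]
    i = int(np.argmin(costs_per_t))
    return float(thresholds[i]), costs_per_t[i]


# Illustrative validation labels and predicted churn probabilities.
y_val = [0, 0, 0, 0, 1, 1, 0, 1]
p_val = [0.12, 0.22, 0.31, 0.37, 0.42, 0.61, 0.73, 0.88]

cost_at_default = total_cost(y_val, p_val, threshold=0.5)
chosen_t, chosen_cost = best_threshold(y_val, p_val, cost_fp=1.0, cost_fn=5.0)
print(f"cost at 0.50: {cost_at_default}, best threshold: {chosen_t:.2f}, cost: {chosen_cost}")
```

On this toy data the default 0.5 threshold costs 6 (one FP plus one FN), while lowering it to 0.40 costs 1. Pick the threshold on a validation split and report the final numbers on the holdout, otherwise the choice itself leaks. If the model's probabilities are well calibrated, the cost-minimising threshold is approximately `cost_fp / (cost_fp + cost_fn)`, which is about 0.17 here.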
- Baseline + holdout metrics
- At least 3 slice tables with findings
- Feature ablation summary
- Threshold choice explained
| Skill | Evaluated |
|---|---|
| Debugging workflow | ✅ |
| Slice analysis quality | ✅ |
| Feature reasoning | ✅ |
| Threshold/cost reasoning | ✅ |
| Communication clarity | ✅ |
- Add calibration + ECE
- Add SHAP/permutation importance (conceptual or sklearn permutation)
- Add drift simulation by shifting tenure distribution
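
For the calibration stretch goal, a simple binned ECE can be sketched in NumPy as follows. This is the binary variant that compares the mean predicted churn probability with the observed churn rate in equal-width bins; the function name and the bin count are assumptions.

```python
import numpy as np


def expected_calibration_error(y_true, proba, n_bins=10):
    """Binned ECE for a binary classifier (positive-class probabilities)."""
    y_true = np.asarray(y_true, dtype=float)
    proba = np.asarray(proba, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    bin_ids = np.digitize(proba, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - proba[mask].mean())
            ece += mask.mean() * gap  # weight by the share of rows in the bin
    return float(ece)


# Illustrative values: each bin is off by 0.1, so ECE is 0.1.
ece_example = expected_calibration_error([0, 0, 1, 1], [0.1, 0.1, 0.9, 0.9])
# Predicting 0.5 for a 50/50 group is perfectly calibrated.
ece_perfect = expected_calibration_error([0, 1], [0.5, 0.5])
print(ece_example, ece_perfect)
```

ECE depends on the binning scheme, so keep `n_bins` fixed when comparing models and pair the number with a reliability plot where possible.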