A machine learning tool that compares traditional rules-based lead scoring with predictive models to demonstrate the ROI of data-driven marketing operations.
Most B2B companies use traditional lead scoring - arbitrary point values assigned to demographic and behavioral attributes:
- Enterprise company = +25 points
- Email click = +5 points
- MQL threshold = 50 points
The issues:
- Arbitrary rules - Point values based on intuition, not data
- Poor accuracy - 35-40% of "qualified" leads never convert
- Wasted sales time - Reps spend hours on leads that won't close
- Missed opportunities - Good leads slip through with low scores
ML models analyze historical conversion patterns to predict which leads actually buy:
| Metric | Traditional Scoring | ML Model | Improvement |
|---|---|---|---|
| Accuracy | 35.3% | 79.0% | +43.7 pts |
| False Positives | 963 bad leads | 77 bad leads | -886 (-92%) |
| Sales Time Wasted | 1,926 hours | 154 hours | 1,772 hours saved |
| Cost Savings | - | $132,900/year | - |
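The wasted-time figures above reduce to simple arithmetic. A minimal sketch, assuming roughly 2 hours of sales effort per false-positive lead at a $75/hour loaded cost (hypothetical rates chosen to be consistent with the table):

```python
HOURS_PER_LEAD = 2    # assumed sales effort spent per false-positive lead
COST_PER_HOUR = 75    # assumed fully loaded hourly cost of a sales rep

def wasted_cost(false_positives):
    """Hours and dollars spent chasing leads that never convert."""
    hours = false_positives * HOURS_PER_LEAD
    return hours, hours * COST_PER_HOUR

trad_hours, trad_cost = wasted_cost(963)  # traditional scoring
ml_hours, ml_cost = wasted_cost(77)       # ML model

print(trad_hours - ml_hours)  # hours saved per year
print(trad_cost - ml_cost)    # dollars saved per year
```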
Creates 5,000 synthetic B2B leads with:
- Firmographics: Company size, industry, source
- Engagement: Email opens, content downloads, webinar attendance
- Conversion outcome: Did they become a customer?
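A minimal sketch of this kind of generator (field names and distributions are illustrative, not the exact ones in `lead_scoring_builder.py`):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 5000

leads = pd.DataFrame({
    'lead_id': [f'L{i:06d}' for i in range(n)],
    'company_size': rng.choice(['SMB', 'Mid-Market', 'Enterprise'], n),
    'industry': rng.choice(['Tech', 'Retail', 'Finance', 'Healthcare'], n),
    'email_opens': rng.poisson(3, n),
    'page_views': rng.poisson(5, n),
    'webinar_attended': rng.integers(0, 2, n),
})

# Conversion is loosely driven by engagement, so the ML model has real signal to find
p = 1 / (1 + np.exp(-(0.2 * leads['email_opens'] + 0.15 * leads['page_views'] - 2.5)))
leads['converted'] = (rng.random(n) < p).astype(int)
```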
Implements a typical rules-based scoring model:

```python
score = (company_size_points +
         industry_points +
         email_opens * 2 +
         demo_request * 30)
```

Then trains two ML models for comparison:
- Logistic Regression - Simple, interpretable baseline
- Random Forest - More sophisticated pattern detection
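Training the two models is standard scikit-learn. A sketch using stand-in data (the real pipeline trains on the generated lead features):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Stand-in feature matrix and labels; the pipeline builds these from the lead DataFrame
X, y = make_classification(n_samples=5000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

print(forest.score(X_test, y_test))  # holdout accuracy
```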
Side-by-side analysis of:
- Accuracy, precision, recall
- ROC curves
- Business impact ($ savings)
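The comparison metrics come from `sklearn.metrics`. A sketch with a stand-in model and data (the pipeline evaluates both models on a holdout set):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

# Stand-in data and model; the real pipeline uses the lead features
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

pred = model.predict(X_te)
proba = model.predict_proba(X_te)[:, 1]  # positive-class probability for ROC AUC

metrics = {
    'accuracy': accuracy_score(y_te, pred),
    'precision': precision_score(y_te, pred),
    'recall': recall_score(y_te, pred),
    'roc_auc': roc_auc_score(y_te, proba),
}
```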
Shows which features matter most (often surprising):
Top Predictive Features:
1. page_views (engagement depth)
2. email_opens (interest level)
3. email_clicks (intent)
4. company_size (fit)
5. days_since_first_touch (timing)
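Importances read straight off the fitted Random Forest. A sketch with the feature names above attached to stand-in data:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

names = ['page_views', 'email_opens', 'email_clicks',
         'company_size', 'days_since_first_touch']

# Stand-in data; the pipeline fits on the actual lead features
X, y = make_classification(n_samples=1000, n_features=5,
                           n_informative=3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

importance = (pd.Series(model.feature_importances_, index=names)
                .sort_values(ascending=False))
print(importance)  # highest-importance features first
```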
| Model | Accuracy | Precision | Recall | ROC AUC |
|---|---|---|---|---|
| Traditional Scoring | 0.353 | 0.289 | 0.950 | 0.572 |
| Logistic Regression | 0.771 | 0.679 | 0.783 | 0.852 |
| Random Forest | 0.790 | 0.710 | 0.788 | 0.871 |
| Metric | Traditional | ML Model | Improvement |
|---|---|---|---|
| False Positives (Bad Leads to Sales) | 963 | 77 | 886 fewer (-92.0%) |
| Wasted Sales Hours | 1,926 | 154 | 1,772 hours saved |
| Wasted Sales Cost | $144,450 | $11,550 | $132,900 saved |
| Total Financial Impact | -$3,430,650 | -$113,550 | $3,317,100 improvement |
Every lead gets both scores for comparison:
| lead_id | company_size | industry | traditional_score | ml_probability | converted |
|---|---|---|---|---|---|
| L000123 | Enterprise | Tech | 78 | 0.85 | 1 |
| L000456 | SMB | Retail | 52 | 0.12 | 0 |
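The side-by-side view is just the model's positive-class probability appended next to the rule-based score. A sketch with stand-in features and placeholder rule scores:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Stand-in features and labels; the real pipeline uses the generated leads
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = (X[:, 0] > 0.5).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)

scored = pd.DataFrame({
    'lead_id': [f'L{i:06d}' for i in range(len(X))],
    'traditional_score': rng.integers(0, 100, len(X)),  # placeholder rule score
    'ml_probability': model.predict_proba(X)[:, 1].round(2),
})
scored.to_csv('scored_leads.csv', index=False)
```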
Scenario: Sales complains 60% of MQLs are junk.
Action:
- Run this analysis on your historical data
- Compare traditional vs ML accuracy
- Present ROI to leadership ($130K+ savings)
- Deploy ML scoring in HubSpot/Marketo
Outcome: Reduce false-positive MQLs by 90%, save 1,700+ sales hours/year.
Scenario: CMO asks "which lead sources actually convert?"
Action:
- Train ML model on historical data
- Check feature importance chart
- See which sources predict conversion
- Reallocate budget away from low-value sources
Outcome: Data-driven budget allocation, not guesswork.
Scenario: Sales won't follow up on leads because "marketing's scores are wrong."
Action:
- Show ML model accuracy (79%) vs traditional (35%)
- Demonstrate 92% reduction in bad leads
- Calculate sales time savings (1,700+ hours)
- Get buy-in for implementation
Outcome: Sales actually trusts the scoring model.
- Python 3.8+
- pip
```bash
pip install -r requirements.txt
cd scripts
python lead_scoring_builder.py
```

This will:
- Generate 5,000 synthetic leads
- Apply traditional scoring
- Train ML models (Logistic Regression + Random Forest)
- Evaluate and compare performance
- Calculate business impact
- Export results and visualizations
Runtime: ~30 seconds on a standard laptop.
```
/data/
└── leads.csv                        # Raw lead data

/output/
├── scored_leads.csv                 # All leads with both scores
├── model_performance_metrics.csv    # Accuracy, precision, recall
├── business_impact_comparison.csv   # $ savings analysis
├── roc_curve_comparison.png         # Visual performance comparison
├── feature_importance.png           # What actually predicts conversion
└── score_distributions.png          # Score distributions by outcome

/models/
├── random_forest_model.pkl          # Trained model (reusable)
└── logistic_regression_model.pkl    # Trained model (reusable)
```
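The `.pkl` files are saved with joblib and can be reloaded to score new leads without retraining. A sketch that round-trips a stand-in model (the pipeline saves its own to `/models/`):

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in model; feature columns must match training order when reused
X, y = make_classification(n_samples=500, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
joblib.dump(model, 'random_forest_model.pkl')

# Later: reload and score new leads without retraining
reloaded = joblib.load('random_forest_model.pkl')
proba = reloaded.predict_proba(X[:5])[:, 1]  # conversion probabilities
```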
- roc_curve_comparison.png - Shows the ML model dramatically outperforms traditional scoring (AUC: 0.871 vs 0.572)
- feature_importance.png - Reveals which attributes actually predict conversion (often surprising - e.g., page views matter more than company size)
- score_distributions.png - Shows traditional scoring poorly separates converters from non-converters
Pull from HubSpot:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from hubspot import HubSpot

api_client = HubSpot(access_token='your_token')

# Fetch contacts with lifecycle data
contacts = api_client.crm.contacts.basic_api.get_page(
    properties=['lifecyclestage', 'company', 'hs_analytics_source']
)

# Convert to DataFrame
leads_df = pd.DataFrame([
    {
        'company_size': contact.properties.get('company_size'),
        'industry': contact.properties.get('industry'),
        'email_opens': contact.properties.get('hs_email_opens'),
        'converted': 1 if contact.properties.get('lifecyclestage') == 'customer' else 0
    }
    for contact in contacts.results
])

# Train model on YOUR data
X, y, features = prepare_features(leads_df)
model = RandomForestClassifier()
model.fit(X, y)
```

Or pull from Salesforce:

```python
from simple_salesforce import Salesforce

sf = Salesforce(username='user', password='pass', security_token='token')

# Query leads
leads = sf.query_all("""
    SELECT Id, Company, Industry, Email_Opens__c, IsConverted
    FROM Lead
    WHERE CreatedDate >= LAST_YEAR
""")

# Train model on Salesforce data
```

Based on 5,000 synthetic leads:
- Traditional Accuracy: 35.3% (barely better than random)
- ML Accuracy: 79.0% (2.2x better)
- ROC AUC: 0.871 (excellent discrimination)
- 886 fewer false positives (bad leads filtered out)
- 1,772 sales hours saved per year
- $132,900 cost savings in wasted sales time
- $3.3M total impact from better lead qualification
- Engagement depth (page views) - not just one visit
- Email engagement - opens AND clicks matter
- Company fit - size and industry combined
- Timing - days since first touch
- Intent signals - pricing page views, demo requests
| Tool | Purpose |
|---|---|
| Python | Core analysis |
| Pandas | Data manipulation |
| scikit-learn | Machine learning models |
| Matplotlib | Visualizations |
| Joblib | Model persistence |
1. Deploy to Production:
   - Integrate with HubSpot/Marketo API
   - Schedule weekly retraining
   - Set up Slack alerts for model performance
2. Advanced Features:
   - Multi-class scoring (MQL, SQL, Opportunity tiers)
   - Real-time scoring via API endpoint
   - A/B test ML vs traditional in production
3. Expand Scope:
   - Predict deal size (regression)
   - Predict time to close
   - Churn prediction for customers