
Flit ML: Complete Project Implementation Plan #1

@whitehackr


Flit ML Project: Complete Phase Implementation Plan

Overview

Build ML components for the flit ecosystem, focusing on predictive models for financial risk assessment for a BNPL (Buy Now, Pay Later) product. This is the first ML project at Flit, requiring a research-first approach followed by production infrastructure.

Phase -1: Data Infrastructure (whitehackr/flit-data-platform#9)

Objectives

  • Generate and store 3 months of synthetic BNPL data
  • Set up data warehouse for ML research
  • Build data pipeline for ongoing data collection

Deliverables

  • Data Generation Pipeline

    • Airflow DAG to collect data from simtom API
    • Generate realistic 3-month historical dataset
    • Data quality validation and monitoring
  • Data Warehouse Setup

    • BigQuery dataset with a well-defined schema
    • Organized tables (transactions, users, risk_events, etc.)
    • Partitioned for fast analytical queries
  • Data Pipeline

    • Automated daily data collection from simtom
    • Data cleaning and validation with Great Expectations
    • Backup and versioning strategy
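As a sketch of the partitioned-warehouse deliverable, the transactions table could be defined with DDL along these lines. The dataset, table, and column names here are illustrative assumptions, not the final schema:

```python
# Sketch of a partitioned BigQuery transactions table (DDL held as a string).
# All names below (flit_ml dataset, column names) are placeholder assumptions.
TRANSACTIONS_DDL = """
CREATE TABLE IF NOT EXISTS `flit_ml.transactions` (
  transaction_id STRING NOT NULL,
  user_id        STRING NOT NULL,
  amount         NUMERIC,
  currency       STRING,
  installments   INT64,
  risk_flags     ARRAY<STRING>,
  created_at     TIMESTAMP NOT NULL
)
PARTITION BY DATE(created_at)  -- prunes partitions for date-bounded queries
CLUSTER BY user_id             -- co-locates rows for per-user feature queries
"""

# The DDL would then be executed with the google-cloud-bigquery client, e.g.:
#   from google.cloud import bigquery
#   bigquery.Client().query(TRANSACTIONS_DDL).result()
```

Partitioning on the event timestamp is what makes the "fast analytical queries" goal realistic once three months of data accumulate.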

Technical Tasks

  • Simtom Enhancement: Create PR to add date range parameters (/stream/bnpl?start_date=2024-06-01&end_date=2024-09-01)
  • Database Setup: Design BigQuery schema, create tables
  • Airflow Setup: DAG for data collection, scheduling
  • Data Generation: Script to generate 3 months of historical data
  • Validation: Great Expectations suite for data quality
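The date-range enhancement from the first task could be consumed along these lines; the base URL is a placeholder assumption, and only the `/stream/bnpl` path and parameter names come from the plan above:

```python
from urllib.parse import urlencode

SIMTOM_BASE = "https://simtom.example.com"  # placeholder host; real base URL TBD

def build_stream_url(start_date: str, end_date: str) -> str:
    """Build the /stream/bnpl URL with the proposed date-range parameters."""
    query = urlencode({"start_date": start_date, "end_date": end_date})
    return f"{SIMTOM_BASE}/stream/bnpl?{query}"

# An Airflow PythonOperator task could then fetch the historical window, e.g.:
#   url = build_stream_url("2024-06-01", "2024-09-01")
#   records = httpx.get(url, timeout=30).json()
```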

Tooling Required

  • Database: google-cloud-bigquery, pandas-gbq
  • Orchestration: apache-airflow
  • Validation: great-expectations
  • Simtom Enhancement: Date range API feature

Phase 0: Research & Discovery

Objectives

  • Understand BNPL data patterns and business problem
  • Define ML problem clearly (classification vs regression, target variable)
  • Establish baseline performance metrics
  • Identify best performing model architectures

Deliverables

  • Data Understanding Report

    • Data schema documentation
    • Statistical summary of all features
    • Data quality assessment (missing values, outliers)
  • Problem Definition Document

    • Clear target variable definition (default probability? risk score?)
    • Success metrics (precision, recall, AUC, business metrics)
    • Model performance thresholds for production
  • Model Experimentation Results

    • Baseline model performance (simple logistic regression)
    • Comparison of 5-7 different algorithms
    • Feature importance analysis
    • Recommended models for production
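To make the target-variable question concrete, one candidate definition is a binary default label derived from repayment behaviour. The column names and the 30-day threshold below are assumptions for illustration, not the warehouse schema:

```python
import pandas as pd

# Toy repayment records; transaction_id / days_past_due are assumed columns.
repayments = pd.DataFrame({
    "transaction_id": ["t1", "t2", "t3"],
    "days_past_due": [0, 45, 12],
})

# One candidate target: default = any installment more than 30 days past due.
DPD_THRESHOLD = 30
repayments["is_default"] = (repayments["days_past_due"] > DPD_THRESHOLD).astype(int)
```

Whether the threshold sits at 30, 60, or 90 days past due changes the class balance materially, so it belongs in the Problem Definition Document.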

Technical Tasks (some of these notebooks could be combined, depending on logical workflow and I/O overhead)

  • 01_data_exploration.ipynb: Connect to BigQuery, basic data inspection
  • 02_eda_analysis.ipynb: Deep statistical analysis, visualizations
  • 03_feature_engineering.ipynb: Create derived features, handle categorical data
  • 04_baseline_models.ipynb: Simple models (logistic regression, decision tree)
  • 05_advanced_models.ipynb: XGBoost, Random Forest, Neural Networks
  • 06_model_comparison.ipynb: Cross-validation, performance comparison
  • 07_final_recommendations.ipynb: Model selection with business justification
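The baseline notebook could start from something as small as this; the data here is synthetic stand-in noise until Phase -1 delivers real features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for BNPL features until the warehouse data is available.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
baseline = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, baseline.predict_proba(X_te)[:, 1])
print(f"baseline AUC: {auc:.3f}")
```

A simple, well-understood baseline like this sets the bar that the advanced-model notebooks must beat to justify their complexity.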

Tooling Required

  • Jupyter Notebook: Interactive development
  • Additional packages: matplotlib, seaborn, plotly, mlflow
  • Model libraries: lightgbm, catboost
  • Data processing: pandas, numpy (already included)

Phase 1: Production Infrastructure

Objectives

  • Build production-ready ML serving infrastructure
  • Implement model versioning and deployment pipeline
  • Create monitoring and observability

Deliverables

  • Model Serving API

    • FastAPI service with prediction endpoints
    • Model loading and caching
    • Input validation and error handling
  • Model Registry System

    • Model versioning and storage
    • A/B testing capabilities
    • Model rollback functionality
  • Monitoring Dashboard

    • Prediction latency metrics
    • Model performance monitoring
    • Data drift detection

Technical Tasks

  • Core Architecture: Base classes, model registry, plugin system
  • API Development: FastAPI endpoints, async request handling
  • Model Deployment: Model loading, caching, version management
  • Data Pipeline: Real-time feature engineering, validation
  • Monitoring: Metrics collection, alerting, dashboards
  • Testing: Unit tests, integration tests, load tests
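A minimal sketch of the model loading / caching / versioning task, assuming a `models/<name>/<version>.joblib` on-disk layout (that layout, and the function names, are illustrative, not a decided design):

```python
from functools import lru_cache
from pathlib import Path

import joblib
from sklearn.linear_model import LogisticRegression

MODEL_DIR = Path("models")  # assumed layout: models/<name>/<version>.joblib

def save_model(model, name: str, version: str) -> Path:
    """Persist a fitted model under an explicit version."""
    path = MODEL_DIR / name / f"{version}.joblib"
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)
    return path

@lru_cache(maxsize=8)
def load_model(name: str, version: str):
    """Load a model once per (name, version); later calls hit the cache."""
    return joblib.load(MODEL_DIR / name / f"{version}.joblib")
```

Pinning loads to an explicit version string is also what makes rollback trivial: the API just switches back to serving the previous version.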

Tooling Required

  • FastAPI: Already included
  • Model Storage: joblib, pickle, or mlflow model registry
  • Monitoring: prometheus, grafana or simple logging
  • Caching: redis (optional for model caching)
  • Container: docker for deployment

Phase 2: Real-time Processing

Objectives

  • Handle streaming data from simtom
  • Implement real-time feature engineering
  • Build batch prediction capabilities

Deliverables

  • Streaming Data Pipeline

    • Real-time data ingestion from simtom
    • Feature engineering on streaming data
    • Batch processing for historical data
  • Real-time Prediction Service

    • Low-latency prediction API (<100ms)
    • Async processing for high throughput
    • Queue management for spike handling

Technical Tasks

  • Stream Processing: Async data consumption from simtom API
  • Feature Store: Real-time feature computation and storage
  • Batch Processing: Historical data processing for model retraining
  • Queue Management: Handle prediction request spikes
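The queue-management task can be sketched with a bounded asyncio queue and a worker pool; the doubling below is a placeholder for a real `model.predict` call:

```python
import asyncio

async def worker(queue: asyncio.Queue, results: list) -> None:
    """Drain prediction requests until a None sentinel arrives."""
    while True:
        item = await queue.get()
        if item is None:          # sentinel: shut this worker down
            queue.task_done()
            break
        results.append(item * 2)  # placeholder for model.predict(item)
        queue.task_done()

async def handle_spike(requests: list, n_workers: int = 4) -> list:
    """Buffer a burst of requests in a bounded queue and fan out to workers."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)
    results: list = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(n_workers)]
    for req in requests:
        await queue.put(req)      # blocks when full, applying backpressure
    for _ in workers:
        await queue.put(None)
    await asyncio.gather(*workers)
    return results
```

The bounded `maxsize` is the spike-handling piece: when the queue fills, producers block instead of overwhelming the prediction service.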

Tooling Required

  • Streaming: asyncio, httpx (already included)
  • Message Queue: celery + redis or simple async queues
  • Feature Store: redis or in-memory with persistence
  • Batch Processing: pandas for data processing

Phase 3: Model Operations (MLOps)

Objectives

  • Automated model retraining pipeline
  • A/B testing framework
  • Model performance monitoring

Deliverables

  • Automated Training Pipeline

    • Scheduled model retraining
    • Data validation before training
    • Automated model evaluation and deployment
  • A/B Testing Framework

    • Traffic splitting between model versions
    • Statistical significance testing
    • Automated winner selection
  • Performance Monitoring

    • Model drift detection
    • Performance degradation alerts
    • Business metrics tracking
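Before wiring up evidently, the drift-detection deliverable can be sanity-checked with a hand-rolled population stability index (PSI) over a single feature; this is a simple sketch, not a replacement for a proper drift suite:

```python
import numpy as np

def population_stability_index(expected, actual, bins: int = 10) -> float:
    """PSI between a reference (training) sample and a live sample.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 major shift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor empty buckets to avoid log(0).
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))
```

Crossing the 0.25 threshold on a key feature would be a natural trigger for the performance-degradation alerts listed above.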

Technical Tasks

  • Training Automation: Scheduled training jobs, data validation
  • A/B Testing: Traffic routing, experiment management
  • Monitoring: Data drift detection, performance tracking
  • Alerting: Automated alerts for model issues
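For the traffic-routing task, a stateless hash-based assignment is one common sketch: each user is deterministically and stickily bucketed without any stored state (function and parameter names here are illustrative):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically route a user to 'control' or 'treatment'.
    Hashing (experiment, user) keeps assignment sticky per user and
    independent across experiments, with no assignment table to maintain."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map hash prefix to [0, 1]
    return "treatment" if bucket < treatment_share else "control"
```

Because the split is a pure function of (experiment, user), the serving API can route requests to model versions without a round trip to an experiment store.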

Tooling Required

  • Scheduling: celery beat or cron jobs
  • Experiment Management: Custom A/B testing or mlflow
  • Monitoring: evidently for data drift, custom metrics
  • Alerting: slack webhooks or email notifications

Phase 4: Advanced Features

Objectives

  • Model interpretability and explainability
  • Advanced model architectures
  • Integration with flit ecosystem

Deliverables

  • Model Explainability

    • SHAP values for predictions
    • Feature importance explanations
    • Model decision boundaries
  • Advanced Models

    • Ensemble methods
    • Deep learning models (if beneficial)
    • Time-series models for temporal patterns
  • Ecosystem Integration

    • Integration with flit-data-platform
    • Connection to production flit services
    • Business metrics dashboard

Technical Tasks

  • Explainability: SHAP implementation, visualization
  • Advanced Models: Ensemble methods, neural networks
  • Integration: APIs for flit ecosystem, data connectors
  • Business Metrics: Revenue impact tracking, risk assessment
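Alongside SHAP, a cheap first pass at global feature importance is scikit-learn's permutation importance; this is a simpler complement to SHAP, not a substitute for per-prediction explanations, and the data below is synthetic:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic data where only feature 0 carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] > 0).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
# Shuffle each feature in turn and measure the drop in score;
# features whose shuffling hurts most are the most important.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
```

Agreement between permutation importances and SHAP summary plots is a useful cross-check before presenting explanations to risk stakeholders.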

Tooling Required

  • Explainability: shap, lime, eli5
  • Deep Learning: pytorch or tensorflow (if needed)
  • Visualization: streamlit for dashboards
  • Integration: Custom APIs, database connectors

Phase 5: Production Deployment (1-2 weeks)

Objectives

  • Deploy to production environment
  • Load testing and performance optimization
  • Documentation and handover

Deliverables

  • Production Deployment

    • Railway deployment configuration
    • Environment management
    • Load balancing and scaling
  • Documentation

    • API documentation
    • Model documentation
    • Operational runbooks
  • Performance Validation

    • Load testing results
    • Performance benchmarks
    • Monitoring setup verification

Technical Tasks

  • Deployment: Railway configuration, environment setup
  • Testing: Load testing, stress testing
  • Documentation: API docs, model cards, operational guides
  • Handover: Knowledge transfer, operational procedures

Tooling Required

  • Deployment: railway CLI, docker
  • Load Testing: locust or wrk
  • Documentation: mkdocs or simple markdown
  • Monitoring: Production monitoring setup

Technology Stack Summary

Core ML Stack

  • Python: 3.11+ (already set)
  • ML Libraries: scikit-learn, xgboost, pandas, numpy (already included)
  • API: FastAPI + uvicorn (already included)
  • Validation: Pydantic (already included)

Additional Requirements by Phase

  • Phase -1: google-cloud-bigquery, pandas-gbq, apache-airflow, great-expectations
  • Phase 0: jupyter, matplotlib, seaborn, plotly, mlflow
  • Phase 1: redis (optional), prometheus (optional)
  • Phase 2: celery (optional), message queue
  • Phase 3: evidently, experiment tracking
  • Phase 4: shap, streamlit, pytorch (optional)
  • Phase 5: locust, mkdocs

Infrastructure

  • Development: Poetry + virtual env (already set)
  • Database: BigQuery
  • Deployment: Railway
  • Storage: GCP Cloud Storage
  • Monitoring: Simple logging initially, then proper monitoring

Dependencies

  1. Simtom API Enhancement: Need to contribute date range feature to simtom project before Phase -1
  2. BigQuery Setup: GCP project and BigQuery dataset creation
  3. Airflow Environment: Local Airflow setup or cloud-managed Airflow

Success Criteria

  • Phase -1: 3 months of quality BNPL data in BigQuery
  • Phase 0: Clear model recommendations with >75% baseline accuracy
  • Phase 1: Production API with <100ms latency, 99%+ uptime
  • Phase 2: Handle 100+ predictions/second
  • Phase 3: Automated retraining and A/B testing
  • Phase 4: Model explainability and advanced features
  • Phase 5: Full production deployment on Railway

This issue will be updated as we progress through each phase. Each phase will have its own sub-issues for detailed tracking.
