Supply Chain Control Tower

Capstone analytics-engineering + ML + operations-research + DevOps platform. Two years of synthetic ERP supply chain data → dbt marts in DuckDB → Prophet demand forecasts (tracked in MLflow) → OR-Tools reorder-point optimization → 6-page Streamlit dashboard, all orchestrated by Airflow on Docker and shipped through GitHub Actions CI.

This is project 4 of a four-project portfolio. Each project layers a new production-grade capability on top of the previous one:

#	Project	Stack	What it adds
1	SaaS Growth	DuckDB raw + Streamlit + Sphinx	Baseline analytics
2	FinTech Credit Risk	+ dbt	Modern transformation layer
3	Ops Control Tower	+ Airflow + Docker	Production orchestration
4	Supply Chain (this)	+ Prophet + MLflow + OR-Tools + GitHub Actions CI + dbt snapshots	Full ML / OR / DevOps capstone

What it does

Generates 2 years of ERP-style supply-chain data — 500 SKUs, 20 vendors, 5 warehouses, ~1.1M rows of daily demand, 30K orders, 4K POs, 30K shipments — with realistic causal correlations (longer vendor lead times → higher arrival variance, seasonal SKUs → weekly + Black Friday spikes, etc.).
Transforms the raw parquet files into 9 staging views, 3 intermediate models, 6 marts, and 2 SCD-2 snapshots (vendor_contract_snapshot, sku_pricing_snapshot) with dbt-core + dbt-duckdb, including ~100 schema tests and 2 singular SQL tests.
Forecasts 30-day SKU × warehouse demand with Prophet (weekly + yearly seasonality, US holiday regressors), logging every fit to a dedicated MLflow tracking server with MAPE / WAPE / bias holdout metrics.
Optimizes reorder points with Google OR-Tools (GLOP solver, convex-cost LP over a candidate grid) subject to a per-ABC service-level floor; benchmarks the result against a textbook safety-stock formula and surfaces the cost savings on the dashboard.
Scores vendors with a four-component composite (lead-time overshoot, late-arrival rate, defect rate, lead-time variance) for the scorecard dashboard.
Visualizes everything in a 6-page Streamlit app — Executive Overview, Inventory Health, Forecast Accuracy, Replenishment, Vendor Scorecard, Warehouse Performance.
Orchestrates the daily pipeline with Airflow on Docker Compose. The DAG runs generate → dbt build (pre-forecast) → forecast → dbt build (post-forecast) → optimize → vendor scoring → alerts.
Lints, tests, and validates every push and PR via GitHub Actions (ruff + pytest + dbt parse), with Dependabot keeping deps current.

Architecture

flowchart LR
    subgraph ci [GitHub Actions CI]
        Lint["ruff"]
        Tests["pytest"]
        DbtParse["dbt parse"]
    end
    subgraph airflow [Airflow + MLflow in Docker]
        DAG["supply_chain_pipeline DAG @daily"]
        MLflowSrv["MLflow tracking server :5000"]
    end
    subgraph rawData [data/raw/ ERP parquet]
        Raw["8 tables"]
    end
    subgraph forecastDir [data/forecast/]
        Fcst["fact_demand_forecast.parquet"]
        Reco["reorder_recommendations.parquet"]
        Vend["vendor_scores.parquet"]
    end
    subgraph dbtProj [dbt project]
        Staging["9 stg_* + tests"]
        Snap["2 SCD-2 snapshots"]
        Marts["6 marts"]
    end
    subgraph mlOr [Python ML + OR layer]
        Prophet["forecast.py: Prophet"]
        OR["optimization.py: OR-Tools"]
        Vendor["vendor_scoring.py"]
    end
    subgraph dashboard [Streamlit App]
        Six["6 pages"]
    end

    DAG --> Raw --> Staging --> Snap --> Marts
    DAG --> Prophet --> Fcst --> Staging
    Prophet --> MLflowSrv
    Marts --> OR --> Reco
    Marts --> Vendor --> Vend
    Marts --> dashboard
    Reco --> dashboard
    Vend --> dashboard
    Fcst --> dashboard
    ci -.- dbtProj

Quickstart

Pre-requisites: pyenv, pyenv-virtualenv, docker, docker compose.

# 1. Set up the Python 3.13 env
pyenv install -s 3.13.3
pyenv virtualenv 3.13.3 supply_env
pyenv local supply_env

# 2. Bootstrap (installs core, dev, dbt, streamlit, docs deps + dbt packages)
make all-env

# 3. Start Airflow + MLflow tracking server in Docker
cp .env.example .env
make airflow-up
# Airflow UI: http://localhost:8080  (admin/admin)
# MLflow UI:  http://localhost:5000

# 4. Run the pipeline locally (host-side, faster for iteration)
make generate           # ERP parquet -> data/raw/
make dbt-build-pre      # 9 staging + 3 intermediate + 4 forecast-independent marts
export MLFLOW_TRACKING_URI=http://localhost:5000
make forecast           # Prophet + MLflow on top-50 SKUs
make dbt-build-post     # rebuild mart_forecast_accuracy + mart_reorder_recommendations
make optimize           # OR-Tools reorder-point optimization
make analyze            # vendor scoring + alerts log

# 5. Browse the dashboard
make app
# http://localhost:8501

# 6. Or trigger the DAG end-to-end
make airflow-trigger

Project layout

supply-chain-optimization/
├── airflow/                   custom Dockerfile + DAG
├── dbt_project/               9 staging + 3 intermediate + 6 marts + 2 snapshots
├── python/src/
│   ├── generate_data.py       8 ERP parquet tables, causally correlated
│   ├── forecast.py            Prophet + MLflow
│   ├── optimization.py        OR-Tools (GLOP)
│   ├── safety_stock.py        Textbook benchmark formulas
│   ├── vendor_scoring.py      Composite scorer
│   ├── analysis.py            Top-level orchestrator
│   └── metrics.py             KPI helpers
├── streamlit_app/             app.py + 6 pages + utils/data_loader.py
├── tests/                     104 Python tests (7 files)
├── docs/                      Sphinx (Furo theme) — 9 .rst pages
├── .github/                   ci.yml + dependabot.yml
├── docker-compose.yml         Airflow + Postgres + MLflow tracking server
├── pyproject.toml             prophet, mlflow, ortools + dbt + streamlit + docs extras
└── Makefile                   make help

Tests

make test (or pytest) runs 104 tests across 7 files:

test_generate_data.py — 25 schema + causal-correlation assertions on the generated parquet (e.g. seasonal SKUs must show wider weekly demand swings than non-seasonal ones).
test_dbt_pipeline.py — 24 integration tests against the materialized DuckDB warehouse (mart shapes, snapshot existence, cross-mart consistency).
test_forecast.py — Prophet wrapper returns the right horizon length; MLflow URI fail-fast behavior is verified; an end-to-end smoke test fits Prophet on synthetic data with a temp file-store URI.
test_optimization.py — OR-Tools optimizer respects the service-level floor; higher penalties → higher reorder points; tested over multiple service levels.
test_safety_stock.py — textbook formula matches known z-score calculations; combined-variance formula collapses to textbook when lead-time is certain.
test_vendor_scoring.py — composite score bounded in [0, 100]; ranking is stable for tied vendors; missing-column validation.
test_metrics.py — MAPE / WAPE / bias / fill-rate / OTIF helpers.

Plus ~100 dbt schema tests + 2 singular SQL tests run by dbt build.

Why this stack

Supply chain is the canonical home of two industry-distinctive analytics skills: time-series demand forecasting and mathematical optimization for inventory decisions. This project leans into both, plus adds CI/CD as the production-engineering polish.

Layer	Tool	Why
Source schema	ERP-style (SAP / NetSuite-inspired)	What real supply-chain analysts work with
Warehouse	DuckDB	Fast local analytics, zero infra
Transformation	dbt-core + dbt-duckdb	Industry-standard transformation
dbt snapshots	SCD-2 vendor + SKU history	Tracks contract drift over time
Orchestration	Apache Airflow (Docker Compose)	Same pattern as project 3, now with ML/OR steps
Forecasting	Prophet	Industry standard for SKU-level demand
Experiment tracking	MLflow	Find degrading models across hundreds of series
Optimization	Google OR-Tools (GLOP)	Cost-minimizing reorder points
Dashboard	Streamlit + Plotly	Same as previous projects
CI/CD	GitHub Actions	Lint + tests + dbt parse on every push
Docs	Sphinx + Furo	Same as previous projects

Documentation

The Sphinx site (Furo theme) covers:

business_problem.rst — pain points, target personas, success metrics
data_dictionary.rst — every source + transformed table, including the runtime forecast artifact at data/forecast/fact_demand_forecast.parquet
metric_dictionary.rst — every KPI with formula + business meaning
forecasting_methodology.rst — Prophet config, MLflow workflow, MLFLOW_TRACKING_URI env contract, fail-fast behavior
optimization_methodology.rst — LP formulation, decision variables, constraints, OR-Tools solver choice, vs textbook benchmark
orchestration.rst — Airflow setup, custom Dockerfile, the two-phase dbt build pattern (pre-forecast then post-forecast)
cicd.rst — GitHub Actions workflow walkthrough, Dependabot config
executive_summary.rst — one-page narrative

make docs builds + serves with live reload.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Supply Chain Control Tower

What it does

Architecture

Quickstart

Project layout

Tests

Why this stack

Documentation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github		.github
airflow		airflow
data/sample		data/sample
dbt_project		dbt_project
docs		docs
python		python
streamlit_app		streamlit_app
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.python-version		.python-version
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

Supply Chain Control Tower

What it does

Architecture

Quickstart

Project layout

Tests

Why this stack

Documentation

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages