Skip to content

LozanoLsa/oee-intelligence

Repository files navigation

OEE Intelligence — Implementation Guide

Full-stack manufacturing analytics built in phases.
Start with a dashboard. Add a REST API. Deploy a Telegram bot. Enable natural language queries.
Each phase is independent — stop at any phase and still have a fully working system.

Python Streamlit scikit-learn FastAPI License


What This Project Builds

A manufacturing OEE analytics system that grows phase by phase:

Phase 1–6  →  Streamlit dashboard  (browser, localhost)
Phase 7    →  FastAPI backend      (REST API, localhost:8000)
Phase 8    →  Telegram bot         (button menus, mobile)
Phase 9    →  Natural language     (free-text queries, fully local AI)
Phase 10   →  Live SCADA           (real-time, coming soon)

Every phase reads from the same 10 CSV files. You can use the sample data included in this repository, or replace it with your own plant data — same column names, real numbers.


OEE Formula — Enforced at Every Row

OEE = Availability × Performance × Quality

Availability  =  Operating Time / Planned Time
Performance   =  (Actual Qty × Ideal Cycle Time) / Operating Time
Quality       =  Good Qty / Actual Qty

Plant Structure (sample data)

┌──────────────────────────────────────────────────────────┐
│               FINISHED GOODS WAREHOUSE                   │
├───────────────┬───────────────┬──────────────────────────┤
│  Assembly     │  Assembly     │  Assembly                │
│  Alpha        │  Beta         │  Gamma                   │
├───────────────┼───────────────┼──────────────────────────┤
│  Painting     │  Painting     │  Painting                │
├───────────────┼───────────────┼──────────────────────────┤
│  Machining    │  Machining    │  Machining               │
├───────────────┴───────────────┴──────────────────────────┤
│                RAW MATERIALS WAREHOUSE                   │
└──────────────────────────────────────────────────────────┘
    VSM Alpha        VSM Beta         VSM Gamma

35 machines · 3 VSMs · 3 areas · 3 shifts · 3 years of history (2022–2024)


Project Structure

OEE-Dashboard/
│
├── app.py                        ← Phases 1–6  Streamlit dashboard
├── oee_data_dictionary.md        ← Phase 0     Full schema for all 10 tables
├── BOT_GUIDE.md                  ← Phases 8–9  Telegram bot user guide
├── requirements.txt              ← All phases  Python dependencies
├── .env.example                  ← Phase 8     Token template (copy → .env)
│
├── .streamlit/
│   └── config.toml               ← Phase 1     Dark theme
│
├── data/                         ← Phase 0     10 CSV files (sample or real)
│   ├── Ordenes_Produccion_OEE.csv
│   ├── Paros_Eventos_OEE.csv
│   ├── Hourly_Snapshots_OEE.csv
│   ├── Maquinas_OEE.csv
│   ├── Productos_OEE.csv
│   ├── Catalogo_Paros_OEE.csv
│   ├── SCADA_Signals_OEE.csv
│   ├── Operator_Notes_OEE.csv
│   ├── Work_Orders_OEE.csv
│   └── Calendario_OEE.csv
│
├── api/                          ← Phase 7     FastAPI REST backend
│   ├── main.py
│   ├── data_loader.py
│   ├── schemas.py
│   └── routes/
│       ├── availability.py
│       ├── performance.py
│       └── quality.py
│
└── bot/                          ← Phases 8–9  Telegram bot
    ├── bot.py
    ├── handlers.py
    ├── keyboards.py
    ├── queries.py
    ├── formatters.py
    ├── display.py
    └── nl_handler.py

Prerequisites — All Phases

git clone https://github.com/your-username/OEE-Dashboard.git
cd OEE-Dashboard
pip install -r requirements.txt

Python 3.10 or higher required.


Phase 0 — Data Setup

Before running anything, you need the 10 CSV files in data/.

Option A — Use the sample data (already included)

The data/ folder already contains 3 years of synthetic production data for a 35-machine plant. You can run every phase immediately without any configuration.

Option B — Use your own plant data

Replace the CSV files in data/ with exports from your MES / CMMS / SCADA.
Your files must match the column names and data types described in oee_data_dictionary.md.

Minimum viable dataset — if your plant only has a shift report:

Column Source
Date, Shift, MachineID Shift log
PlannedTime_min, OperatingTime_min Downtime tracking
PlannedQty, ActualQty, GoodQty Production count

The dashboard computes Availability, Performance, Quality, and OEE from these columns at load time.

Typical source systems:

CSV file System
Ordenes_Produccion_OEE MES (Ignition, Wonderware, Opcenter)
Paros_Eventos_OEE MES downtime module or CMMS
Maquinas_OEE ERP equipment master (SAP PM, Maximo)
SCADA_Signals_OEE SCADA historian (FactoryTalk, WinCC)
Work_Orders_OEE CMMS corrective/preventive WOs

Phase 1–6 — Streamlit Dashboard

What you get: A 6-tab analytics dashboard with 7 ML models, running in your browser.

Run

streamlit run app.py

Opens at http://localhost:8501

Tabs

Tab Phase What it shows
📊 Overview 1 Fleet OEE cards, VSM breakdown, live shift status
📡 Operations 2 Gantt timeline, SCADA readings, operator notes, work orders
📅 History 3 3-year trend, downtime Pareto, shift/year comparison
🤖 AI · Availability 4 Model A: failure prediction · Model B: MTBF/MTTR reliability
AI · Performance 5 Model C: temporal evolution · Model D: weekly forecast
AI · Quality 6 Model E: FPY · Model F: Speed vs Quality · Model G: defect risk

ML Models

Model Type Predicts
A RandomForest Classifier Next failure event type
B Recency-weighted MTBF (half-life 180d) Time between failures + repair time
C Historical decomposition Performance evolution by VSM × Year × Area
D RandomForest Regressor Expected units per day (next 2 weeks)
E Aggregation First Pass Yield by VSM
F Scatter quadrant analysis Speed vs Quality tradeoff per machine
G RandomForest Regressor Defect rate by Shift × Product × Machine

Models train automatically on startup from the CSVs — no separate training step needed.


Phase 7 — FastAPI Backend

What you get: All 7 ML models exposed as REST endpoints. Integrate with any external system — ERP, Power BI, custom apps.

This phase is optional. The dashboard (Phases 1–6) and the bot (Phases 8–9) work without it.

Run

uvicorn api.main:app --reload --port 8000

API trains all models on startup (~30 sec), then serves requests at http://localhost:8000.

Interactive docs

URL Description
http://localhost:8000/docs Swagger UI — try every endpoint live
http://localhost:8000/redoc ReDoc reference
http://localhost:8000/health Liveness check

Example calls

# Predict next failure type for a machine
curl -X POST http://localhost:8000/api/v1/availability/predict \
  -H "Content-Type: application/json" \
  -d '{"machine_id":"LINE_03","failures_30d":3,"days_since_pm":45}'

# MTBF fleet table filtered by VSM
curl http://localhost:8000/api/v1/availability/mtbf?vsm=Beta

# Defect risk for a shift × product combination
curl -X POST http://localhost:8000/api/v1/quality/risk \
  -H "Content-Type: application/json" \
  -d '{"machine_id":"LINE_20","shift":"Night","product_id":"PROD_005","day_of_week":0}'

Phase 8 — Telegram Bot

What you get: All 13 analytics queries accessible from your phone via button menus. Plus an automatic daily KPI report at 08:00.

Setup

1. Create a Telegram bot

Open Telegram → search @BotFather/newbot → follow prompts → copy the token.

2. Get your Telegram user ID

Search @userinfobot on Telegram → it replies with your numeric ID.

3. Configure environment variables

cp .env.example .env

Edit .env:

TELEGRAM_TOKEN=your_token_from_botfather
TELEGRAM_CHAT_ID=your_numeric_user_id

.env is in .gitignore — it will never be committed to GitHub.

4. Run the bot

python -m bot.bot

The bot trains all 7 models on startup (~30 sec), then begins listening. Keep the terminal open.

What the bot can do

Category Queries available
📊 Overview Plant Snapshot · Critical Machines
🔴 Availability Predict Next Failure · MTBF Fleet · MTBF Single Machine
⚡ Performance VSM × Year Evolution · Area Monthly Deviation · Weekly Forecast
⭐ Quality FPY by VSM · Speed vs Quality · Defect Risk Predictor
📋 Automatic Daily KPI Report at 08:00

Period options for every descriptive query:
Last 7 days · Last 30 days · Last 3 months · Last 6 months · Year to date · 2024 · 2023 · 2022

Commands:

Command Description
/start Open main menu
/menu Return to main menu from anywhere
/status Quick plant snapshot (last 30 days)
/alerts Toggle the daily 08:00 KPI report on/off

See BOT_GUIDE.md for the complete query reference with sample outputs.


Phase 9 — Natural Language Queries

What you get: Type questions in plain Spanish or English — no menus, no buttons. The bot understands the question, calls the right query, and replies with real data.

Powered by Ollama + llama3.2 — runs 100% locally. No API keys, no internet connection required, no cost, data never leaves your machine.

Setup

1. Install Ollama

Download from ollama.com/download and install. Ollama starts automatically in the background.

2. Pull the model (one-time, ~2 GB)

ollama pull llama3.2

3. Run the bot — same command as Phase 8

python -m bot.bot

Natural language is active immediately alongside the button menus.

Example questions

"¿Cómo estuvo el OEE el mes pasado?"
"Cuáles son las peores máquinas en 2024?"
"Show me the speed vs quality tradeoff for Beta"
"Dame el FPY del último trimestre"
"What's the MTBF risk fleet-wide?"
"Dame el reporte de hoy"

How it works

User types question
       ↓
Ollama (llama3.2) identifies which query to call + what parameters
       ↓
Bot executes the real query against the CSV data
       ↓
Ollama generates a one-sentence natural language summary
       ↓
User receives: summary + full formatted data + menu buttons

Running Everything Together

The three processes are completely independent — each runs in its own terminal:

Terminal Command Port Required for
1 streamlit run app.py 8501 Dashboard (F1–F6)
2 uvicorn api.main:app --reload 8000 REST API (F7)
3 python -m bot.bot none Telegram bot (F8–F9)

You can run any combination. They do not interfere with each other.


Key Design Decisions

Why a single app.py?
All ML models train via @st.cache_data on first load — no separate pipeline. For production scale (>1M rows), split into a src/ ETL layer with pre-trained model storage.

Why RandomForest over LSTM or XGBoost?
With 3 years × 35 machines × 3 shifts ≈ 100K rows, RandomForest gives interpretable, stable results without overfitting risk. Feature importance is human-readable. LSTM becomes relevant at sensor-level (1-second) granularity with continuous streaming data.

Why recency-weighted MTBF?
A machine with 10 failures three years ago but zero in the last 6 months should not carry the same risk as one that just failed last week. Exponential decay (half-life = 180 days) assigns higher weight to recent events and exposes ESS (Effective Sample Size) as a confidence indicator.

Why Ollama over a cloud LLM?
Manufacturing KPIs are confidential. Running the language model locally means production data never leaves the plant network. Ollama with llama3.2 also has zero ongoing cost, no rate limits, and no dependency on external availability.

Why no database?
CSV files are the universal export format of every MES, CMMS, and SCADA system. No ETL, no schema migration, no DBA required. When row count exceeds ~5M, swap pd.read_csv for a DuckDB or PostgreSQL connector — the rest of the stack stays identical.


Roadmap

  • Phase 1: OEE decomposition dashboard (Availability, Performance, Quality)
  • Phase 2: Live operations view (Gantt, SCADA, shift notes, work orders)
  • Phase 3: Historical analytics (3-year trends, Pareto, shift/year comparison)
  • Phase 4: Predictive Availability (RandomForest classifier + recency-weighted MTBF)
  • Phase 5: Predictive Performance (temporal evolution, pre-PM degradation, weekly forecast)
  • Phase 6: Quality Intelligence (FPY by VSM, Speed vs Quality tradeoff, risk predictor)
  • Phase 7: FastAPI backend — all 7 models as REST endpoints
  • Phase 8: Telegram bot — 13 button-driven queries + daily KPI report
  • Phase 9: Natural language interface — free-text queries via Ollama (local, zero cost)
  • Phase 10: Real-time SCADA integration (MQTT / OPC-UA → live dashboard)

LozanoLsa · Turning Operations into Predictive Systems

About

Full-stack manufacturing analytics — OEE dashboard, 7 ML models, REST API, Telegram bot, and natural language queries. Built phase by phase from CSV to AI.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Contributors

Languages