OEE Intelligence — Implementation Guide

Full-stack manufacturing analytics built in phases.
Start with a dashboard. Add a REST API. Deploy a Telegram bot. Enable natural language queries.
Each phase is independent — stop at any phase and still have a fully working system.

What This Project Builds

A manufacturing OEE analytics system that grows phase by phase:

Phase 1–6  →  Streamlit dashboard  (browser, localhost)
Phase 7    →  FastAPI backend      (REST API, localhost:8000)
Phase 8    →  Telegram bot         (button menus, mobile)
Phase 9    →  Natural language     (free-text queries, fully local AI)
Phase 10   →  Live SCADA           (real-time, coming soon)

Every phase reads from the same 10 CSV files. You can use the sample data included in this repository, or replace it with your own plant data — same column names, real numbers.

OEE Formula — Enforced at Every Row

OEE = Availability × Performance × Quality

Availability  =  Operating Time / Planned Time
Performance   =  (Actual Qty × Ideal Cycle Time) / Operating Time
Quality       =  Good Qty / Actual Qty

Plant Structure (sample data)

┌──────────────────────────────────────────────────────────┐
│               FINISHED GOODS WAREHOUSE                   │
├───────────────┬───────────────┬──────────────────────────┤
│  Assembly     │  Assembly     │  Assembly                │
│  Alpha        │  Beta         │  Gamma                   │
├───────────────┼───────────────┼──────────────────────────┤
│  Painting     │  Painting     │  Painting                │
├───────────────┼───────────────┼──────────────────────────┤
│  Machining    │  Machining    │  Machining               │
├───────────────┴───────────────┴──────────────────────────┤
│                RAW MATERIALS WAREHOUSE                   │
└──────────────────────────────────────────────────────────┘
    VSM Alpha        VSM Beta         VSM Gamma

35 machines · 3 VSMs · 3 areas · 3 shifts · 3 years of history (2022–2024)

Project Structure

OEE-Dashboard/
│
├── app.py                        ← Phases 1–6  Streamlit dashboard
├── oee_data_dictionary.md        ← Phase 0     Full schema for all 10 tables
├── BOT_GUIDE.md                  ← Phases 8–9  Telegram bot user guide
├── requirements.txt              ← All phases  Python dependencies
├── .env.example                  ← Phase 8     Token template (copy → .env)
│
├── .streamlit/
│   └── config.toml               ← Phase 1     Dark theme
│
├── data/                         ← Phase 0     10 CSV files (sample or real)
│   ├── Ordenes_Produccion_OEE.csv
│   ├── Paros_Eventos_OEE.csv
│   ├── Hourly_Snapshots_OEE.csv
│   ├── Maquinas_OEE.csv
│   ├── Productos_OEE.csv
│   ├── Catalogo_Paros_OEE.csv
│   ├── SCADA_Signals_OEE.csv
│   ├── Operator_Notes_OEE.csv
│   ├── Work_Orders_OEE.csv
│   └── Calendario_OEE.csv
│
├── api/                          ← Phase 7     FastAPI REST backend
│   ├── main.py
│   ├── data_loader.py
│   ├── schemas.py
│   └── routes/
│       ├── availability.py
│       ├── performance.py
│       └── quality.py
│
└── bot/                          ← Phases 8–9  Telegram bot
    ├── bot.py
    ├── handlers.py
    ├── keyboards.py
    ├── queries.py
    ├── formatters.py
    ├── display.py
    └── nl_handler.py

Prerequisites — All Phases

git clone https://github.com/your-username/OEE-Dashboard.git
cd OEE-Dashboard
pip install -r requirements.txt

Python 3.10 or higher required.

Phase 0 — Data Setup

Before running anything, you need the 10 CSV files in data/.

Option A — Use the sample data (already included)

The data/ folder already contains 3 years of synthetic production data for a 35-machine plant. You can run every phase immediately without any configuration.

Option B — Use your own plant data

Replace the CSV files in data/ with exports from your MES / CMMS / SCADA.
Your files must match the column names and data types described in oee_data_dictionary.md.

Minimum viable dataset — if your plant only has a shift report:

Column	Source
`Date`, `Shift`, `MachineID`	Shift log
`PlannedTime_min`, `OperatingTime_min`	Downtime tracking
`PlannedQty`, `ActualQty`, `GoodQty`	Production count

The dashboard computes Availability, Performance, Quality, and OEE from these columns at load time.

Typical source systems:

CSV file	System
`Ordenes_Produccion_OEE`	MES (Ignition, Wonderware, Opcenter)
`Paros_Eventos_OEE`	MES downtime module or CMMS
`Maquinas_OEE`	ERP equipment master (SAP PM, Maximo)
`SCADA_Signals_OEE`	SCADA historian (FactoryTalk, WinCC)
`Work_Orders_OEE`	CMMS corrective/preventive WOs

Phase 1–6 — Streamlit Dashboard

What you get: A 6-tab analytics dashboard with 7 ML models, running in your browser.

Run

streamlit run app.py

Opens at http://localhost:8501

Tabs

Tab	Phase	What it shows
📊 Overview	1	Fleet OEE cards, VSM breakdown, live shift status
📡 Operations	2	Gantt timeline, SCADA readings, operator notes, work orders
📅 History	3	3-year trend, downtime Pareto, shift/year comparison
🤖 AI · Availability	4	Model A: failure prediction · Model B: MTBF/MTTR reliability
⚡ AI · Performance	5	Model C: temporal evolution · Model D: weekly forecast
⭐ AI · Quality	6	Model E: FPY · Model F: Speed vs Quality · Model G: defect risk

ML Models

Model	Type	Predicts
A	RandomForest Classifier	Next failure event type
B	Recency-weighted MTBF (half-life 180d)	Time between failures + repair time
C	Historical decomposition	Performance evolution by VSM × Year × Area
D	RandomForest Regressor	Expected units per day (next 2 weeks)
E	Aggregation	First Pass Yield by VSM
F	Scatter quadrant analysis	Speed vs Quality tradeoff per machine
G	RandomForest Regressor	Defect rate by Shift × Product × Machine

Models train automatically on startup from the CSVs — no separate training step needed.

Phase 7 — FastAPI Backend

What you get: All 7 ML models exposed as REST endpoints. Integrate with any external system — ERP, Power BI, custom apps.

This phase is optional. The dashboard (Phases 1–6) and the bot (Phases 8–9) work without it.

Run

uvicorn api.main:app --reload --port 8000

API trains all models on startup (~30 sec), then serves requests at http://localhost:8000.

Interactive docs

URL	Description
`http://localhost:8000/docs`	Swagger UI — try every endpoint live
`http://localhost:8000/redoc`	ReDoc reference
`http://localhost:8000/health`	Liveness check

Example calls

# Predict next failure type for a machine
curl -X POST http://localhost:8000/api/v1/availability/predict \
  -H "Content-Type: application/json" \
  -d '{"machine_id":"LINE_03","failures_30d":3,"days_since_pm":45}'

# MTBF fleet table filtered by VSM
curl http://localhost:8000/api/v1/availability/mtbf?vsm=Beta

# Defect risk for a shift × product combination
curl -X POST http://localhost:8000/api/v1/quality/risk \
  -H "Content-Type: application/json" \
  -d '{"machine_id":"LINE_20","shift":"Night","product_id":"PROD_005","day_of_week":0}'

Phase 8 — Telegram Bot

What you get: All 13 analytics queries accessible from your phone via button menus. Plus an automatic daily KPI report at 08:00.

Setup

1. Create a Telegram bot

Open Telegram → search @BotFather → /newbot → follow prompts → copy the token.

2. Get your Telegram user ID

Search @userinfobot on Telegram → it replies with your numeric ID.

3. Configure environment variables

cp .env.example .env

Edit .env:

TELEGRAM_TOKEN=your_token_from_botfather
TELEGRAM_CHAT_ID=your_numeric_user_id

.env is in .gitignore — it will never be committed to GitHub.

4. Run the bot

python -m bot.bot

The bot trains all 7 models on startup (~30 sec), then begins listening. Keep the terminal open.

What the bot can do

Category	Queries available
📊 Overview	Plant Snapshot · Critical Machines
🔴 Availability	Predict Next Failure · MTBF Fleet · MTBF Single Machine
⚡ Performance	VSM × Year Evolution · Area Monthly Deviation · Weekly Forecast
⭐ Quality	FPY by VSM · Speed vs Quality · Defect Risk Predictor
📋 Automatic	Daily KPI Report at 08:00

Period options for every descriptive query:
Last 7 days · Last 30 days · Last 3 months · Last 6 months · Year to date · 2024 · 2023 · 2022

Commands:

Command	Description
`/start`	Open main menu
`/menu`	Return to main menu from anywhere
`/status`	Quick plant snapshot (last 30 days)
`/alerts`	Toggle the daily 08:00 KPI report on/off

See BOT_GUIDE.md for the complete query reference with sample outputs.

Phase 9 — Natural Language Queries

What you get: Type questions in plain Spanish or English — no menus, no buttons. The bot understands the question, calls the right query, and replies with real data.

Powered by Ollama + llama3.2 — runs 100% locally. No API keys, no internet connection required, no cost, data never leaves your machine.

Setup

1. Install Ollama

Download from ollama.com/download and install. Ollama starts automatically in the background.

2. Pull the model (one-time, ~2 GB)

ollama pull llama3.2

3. Run the bot — same command as Phase 8

python -m bot.bot

Natural language is active immediately alongside the button menus.

Example questions

"¿Cómo estuvo el OEE el mes pasado?"
"Cuáles son las peores máquinas en 2024?"
"Show me the speed vs quality tradeoff for Beta"
"Dame el FPY del último trimestre"
"What's the MTBF risk fleet-wide?"
"Dame el reporte de hoy"

How it works

User types question
       ↓
Ollama (llama3.2) identifies which query to call + what parameters
       ↓
Bot executes the real query against the CSV data
       ↓
Ollama generates a one-sentence natural language summary
       ↓
User receives: summary + full formatted data + menu buttons

Running Everything Together

The three processes are completely independent — each runs in its own terminal:

Terminal	Command	Port	Required for
1	`streamlit run app.py`	8501	Dashboard (F1–F6)
2	`uvicorn api.main:app --reload`	8000	REST API (F7)
3	`python -m bot.bot`	none	Telegram bot (F8–F9)

You can run any combination. They do not interfere with each other.

Key Design Decisions

Why a single app.py?
All ML models train via @st.cache_data on first load — no separate pipeline. For production scale (>1M rows), split into a src/ ETL layer with pre-trained model storage.

Why RandomForest over LSTM or XGBoost?
With 3 years × 35 machines × 3 shifts ≈ 100K rows, RandomForest gives interpretable, stable results without overfitting risk. Feature importance is human-readable. LSTM becomes relevant at sensor-level (1-second) granularity with continuous streaming data.

Why recency-weighted MTBF?
A machine with 10 failures three years ago but zero in the last 6 months should not carry the same risk as one that just failed last week. Exponential decay (half-life = 180 days) assigns higher weight to recent events and exposes ESS (Effective Sample Size) as a confidence indicator.

Why Ollama over a cloud LLM?
Manufacturing KPIs are confidential. Running the language model locally means production data never leaves the plant network. Ollama with llama3.2 also has zero ongoing cost, no rate limits, and no dependency on external availability.

Why no database?
CSV files are the universal export format of every MES, CMMS, and SCADA system. No ETL, no schema migration, no DBA required. When row count exceeds ~5M, swap pd.read_csv for a DuckDB or PostgreSQL connector — the rest of the stack stays identical.

Roadmap

LozanoLsa · Turning Operations into Predictive Systems

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

OEE Intelligence — Implementation Guide

What This Project Builds

OEE Formula — Enforced at Every Row

Plant Structure (sample data)

Project Structure

Prerequisites — All Phases

Phase 0 — Data Setup

Option A — Use the sample data (already included)

Option B — Use your own plant data

Phase 1–6 — Streamlit Dashboard

Run

Tabs

ML Models

Phase 7 — FastAPI Backend

Run

Interactive docs

Example calls

Phase 8 — Telegram Bot

Setup

What the bot can do

Phase 9 — Natural Language Queries

Setup

Example questions

How it works

Running Everything Together

Key Design Decisions

Roadmap

About

Uh oh!

Releases

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.streamlit		.streamlit
api		api
bot		bot
data		data
.env.example		.env.example
.gitignore		.gitignore
BOT_GUIDE.md		BOT_GUIDE.md
LICENSE		LICENSE
README.md		README.md
app.py		app.py
oee_data_dictionary.md		oee_data_dictionary.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

OEE Intelligence — Implementation Guide

What This Project Builds

OEE Formula — Enforced at Every Row

Plant Structure (sample data)

Project Structure

Prerequisites — All Phases

Phase 0 — Data Setup

Option A — Use the sample data (already included)

Option B — Use your own plant data

Phase 1–6 — Streamlit Dashboard

Run

Tabs

ML Models

Phase 7 — FastAPI Backend

Run

Interactive docs

Example calls

Phase 8 — Telegram Bot

Setup

What the bot can do

Phase 9 — Natural Language Queries

Setup

Example questions

How it works

Running Everything Together

Key Design Decisions

Roadmap

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Contributors

Uh oh!

Languages