End-to-end data platform for processing, analyzing, and visualizing fleet driver violation reports.
This project demonstrates a complete data engineering pipeline:
- file upload API
- ETL pipeline in Python
- PostgreSQL data warehouse
- BI dashboards in Metabase
- Dockerized infrastructure
Excel files
↓
FastAPI (/upload)
↓
ETL Pipeline (Python)
↓
PostgreSQL
↓
Metabase Dashboards
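
The hand-off from upload to ETL in the flow above can be sketched without any framework. This is a minimal illustration and not the project's actual code: `handle_upload`, its parameters, and the `run_etl` callback are hypothetical names, and the only detail taken from the repository is that uploaded files land in an `incoming/` directory before processing.

```python
from pathlib import Path
from typing import Callable


def handle_upload(
    filename: str,
    data: bytes,
    incoming_dir: Path,
    run_etl: Callable[[Path], int],
) -> dict:
    """Store an uploaded report in the incoming directory, then run ETL on it.

    Hypothetical helper: `run_etl` stands in for the real pipeline entry point
    and is assumed to return the number of rows it loaded.
    """
    if not filename.lower().endswith(".xls"):
        raise ValueError("only legacy .xls reports are accepted")

    incoming_dir.mkdir(parents=True, exist_ok=True)

    # Keep only the final path component so directory parts in the
    # client-supplied name are dropped before writing.
    target = incoming_dir / Path(filename).name
    target.write_bytes(data)

    # Automatic ETL processing right after the file is stored.
    rows_loaded = run_etl(target)
    return {"file": target.name, "rows_loaded": rows_loaded}
```

In the real service, a function like this would presumably be called from the FastAPI `/upload` route, with `run_etl` pointing at the pipeline in `etl_pipeline.py`.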
- Upload legacy `.xls` reports via API
- Automatic ETL processing after upload
- Idempotent processing (the same file is not imported twice)
- PostgreSQL as the data warehouse
- Metabase dashboards:
  - drivers overview
  - violations per driver
  - driver profile (per company / file)
- REST API:
  - `/drivers`
  - `/drivers/{id}/profile`
  - `/upload`
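
One common way to get the idempotent behavior listed above is to record a content hash for every imported file and skip any file whose hash is already known. The sketch below assumes that approach; the README does not say how the project detects duplicates. The table `imported_files` and the functions `init_schema` and `register_file` are hypothetical, and SQLite from the standard library stands in for PostgreSQL so the example runs on its own.

```python
import hashlib
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) bookkeeping table for imported files."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS imported_files ("
        " sha256 TEXT PRIMARY KEY,"
        " filename TEXT NOT NULL)"
    )
    conn.commit()


def register_file(conn: sqlite3.Connection, filename: str, data: bytes) -> bool:
    """Return True if the file content is new, False if it was imported before."""
    digest = hashlib.sha256(data).hexdigest()
    # The primary key on sha256 makes the insert a no-op for known content.
    cur = conn.execute(
        "INSERT OR IGNORE INTO imported_files (sha256, filename) VALUES (?, ?)",
        (digest, filename),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
init_schema(conn)
first = register_file(conn, "report.xls", b"same bytes")
second = register_file(conn, "report_copy.xls", b"same bytes")
```

Here `first` is `True` and `second` is `False`, because the second upload has identical content under a different name. On PostgreSQL the equivalent statement is `INSERT ... ON CONFLICT DO NOTHING`.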
- Python 3.12
- FastAPI
- Pandas
- SQLAlchemy
- PostgreSQL
- Metabase
- Docker / Docker Compose
fleet-data-platform/
├── services/
│   ├── api/
│   │   └── app/
│   │       ├── main.py
│   │       ├── db.py
│   │       └── models.py
│   ├── etl/
│   │   ├── incoming/
│   │   ├── parsers/
│   │   │   └── legacy_xls_parser.py
│   │   ├── etl_pipeline.py
│   │   └── models.py
│   └── venv/
├── storage/
│   ├── postgres/
│   └── metabase/
├── docker-compose.yml
└── README.md
