Some data engineering experiments with Apache Iceberg as the lakehouse table format
This project uses the following services, as defined in the docker-compose.yaml file:
- Trino as query engine
- Nessie as an Iceberg REST catalog with:
  - PostgreSQL as JDBC store
- MinIO as S3-compatible storage layer
Seriously, use uv to run the scripts, or at least use it as the pip interface; otherwise the Rust-based dependencies will break just about everything in them.
- pyiceberg-rest-catalog.py: Example of schema/catalog setup with pyiceberg
- pyiceberg-duckdb.py: Example using DuckDB to insert and query data from an Iceberg table. Depends on:
  - pyiceberg-rest-catalog.py
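
For orientation, here is a minimal sketch of the flow the two scripts cover: connect to the REST catalog, create a namespace and table, append a few rows, and query them through DuckDB. It is not a copy of the scripts. The endpoint URLs, the credentials, and the `demo.events` namespace/table name are all assumptions for illustration; take the real values from docker-compose.yaml.

```python
"""Sketch: create an Iceberg table via the REST catalog and query it with DuckDB."""


def catalog_properties(
    catalog_uri="http://localhost:19120/iceberg",  # assumed Nessie REST endpoint
    s3_endpoint="http://localhost:9000",  # assumed MinIO endpoint
    access_key="minioadmin",  # assumed credentials
    secret_key="minioadmin",  # assumed credentials
):
    """Build the property dict passed to pyiceberg's load_catalog()."""
    return {
        "type": "rest",
        "uri": catalog_uri,
        "s3.endpoint": s3_endpoint,
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
    }


def run_demo():
    """Needs the docker-compose services running and pyiceberg, pyarrow, duckdb installed."""
    # Imports are kept local so the config helper above works without these packages.
    import pyarrow as pa
    from pyiceberg.catalog import load_catalog
    from pyiceberg.exceptions import (
        NamespaceAlreadyExistsError,
        TableAlreadyExistsError,
    )
    from pyiceberg.schema import Schema
    from pyiceberg.types import LongType, NestedField, StringType

    catalog = load_catalog("nessie", **catalog_properties())

    # "demo" and "demo.events" are hypothetical names.
    try:
        catalog.create_namespace("demo")
    except NamespaceAlreadyExistsError:
        pass

    schema = Schema(
        NestedField(field_id=1, name="id", field_type=LongType(), required=False),
        NestedField(field_id=2, name="name", field_type=StringType(), required=False),
    )
    try:
        table = catalog.create_table("demo.events", schema=schema)
    except TableAlreadyExistsError:
        table = catalog.load_table("demo.events")

    # Append a small Arrow table, then expose the scan result to DuckDB.
    table.append(pa.table({"id": [1, 2], "name": ["a", "b"]}))
    con = table.scan().to_duckdb(table_name="events")
    print(con.execute("SELECT count(*) FROM events").fetchall())
```

Call `run_demo()` once the services are up. The catalog settings live in a separate helper so they can be adjusted (for example, pointing `s3_endpoint` at a different host) without touching the table logic.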