✨ Enhancement Request
Summary:
Add an S3 source connector that supports syncing structured data formats from S3.
Problem / Use Case:
I have an S3 data lake and I want to use those tables as multi-tenant or single-tenant data sources for models in Pontoon.
Proposed Solution:
- Start with support for traditional Hive partitioning and compressed Parquet files, e.g.
  s3://my-bucket/<namespace>/<schema>/<table>/tenant-id=abc/date=2025-01-01/
- Add support for additional formats: JSON/NDJSON, ORC, Avro
- Add support for reading transactional table formats: Iceberg, Delta, Hudi, S3 Tables
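
To illustrate the Hive-style layout in the first bullet, here is a minimal sketch of how a connector might split an object path into its table prefix and partition values. The function name `parse_hive_path` and the returned dictionary shape are hypothetical, not part of Pontoon; it only assumes that partition directories follow the `key=value` convention and that anything before the first partition is the namespace/schema/table prefix.

```python
from urllib.parse import urlparse


def parse_hive_path(uri: str) -> dict:
    """Split an s3:// URI into bucket, table prefix, and Hive partition values.

    Hypothetical helper for illustration only; not an existing Pontoon API.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got {uri!r}")

    prefix = []       # segments before the first partition, e.g. namespace/schema/table
    partitions = {}   # key=value directories, in path order

    for segment in (s for s in parsed.path.split("/") if s):
        key, sep, value = segment.partition("=")
        if sep and key:
            # A Hive partition directory such as date=2025-01-01
            partitions[key] = value
        elif not partitions:
            # Still in the leading namespace/schema/table part of the path
            prefix.append(segment)
        # Anything without "=" after the first partition is treated as a
        # file name (e.g. a Parquet part file) and ignored here.

    return {"bucket": parsed.netloc, "prefix": prefix, "partitions": partitions}


if __name__ == "__main__":
    info = parse_hive_path(
        "s3://my-bucket/analytics/public/orders/"
        "tenant-id=abc/date=2025-01-01/part-0000.snappy.parquet"
    )
    print(info)
```

In a multi-tenant setup, the `tenant-id` partition value could serve as the tenant identifier for a model, while a single-tenant source would simply have no such partition. That mapping is an assumption about how the connector might be configured.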
Alternatives Considered:
- No viable workarounds/alternatives right now
Impact / Importance:
Additional Context (optional):
- Part of a series of enhancements on supporting object connectors as sources and destinations