feat(io): route FileIO through configured runtime #2602

Open
geoffreyclaude wants to merge 3 commits into apache:main from geoffreyclaude:geoffrey.claude/runtime-file-io

geoffreyclaude commented Jun 8, 2026

Which issue does this PR close?

What changes are included in this PR?

This PR routes file storage work through the IO handle of an explicitly configured Iceberg Runtime.

The original runtime-segregation discussion in #1945 proposed configuring an OpenDAL executor directly. Since then, #2308 introduced Iceberg's Runtime abstraction and catalog-level runtime plumbing. This PR builds on that newer API and applies runtime dispatch at the FileIO/Storage boundary instead, so concrete storage implementations remain runtime-agnostic and the Storage trait is unchanged.

When a runtime is configured on FileIO, the raw storage backend is kept unchanged and wrapped in a private runtime-aware adapter. The adapter dispatches storage operations, reader creation, byte-range reads, writer creation, writes, and close operations onto runtime.io().
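
The wrapping pattern can be sketched with the standard library alone. This is an illustrative model, not the code in this PR: the real Storage trait is async and the real dispatch target is runtime.io(), whereas here `Storage`, `IoHandle`, `RuntimeStorage`, and `ThreadReportingStorage` are hypothetical, simplified stand-ins, and a single named worker thread plays the role of the IO runtime.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread;

/// Hypothetical, synchronous stand-in for the Storage trait (the real one is async).
pub trait Storage {
    fn read(&self, path: &str) -> Vec<u8>;
}

/// Raw backend that knows nothing about runtimes. For demonstration it returns
/// the name of the thread the read ran on, which makes dispatch observable.
pub struct ThreadReportingStorage;

impl Storage for ThreadReportingStorage {
    fn read(&self, _path: &str) -> Vec<u8> {
        thread::current().name().unwrap_or("unnamed").as_bytes().to_vec()
    }
}

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Assumed stand-in for `runtime.io()`: a handle to one dedicated IO worker thread.
#[derive(Clone)]
pub struct IoHandle {
    jobs: Sender<Job>,
}

impl IoHandle {
    pub fn new(thread_name: &str) -> Self {
        let (jobs, rx) = channel::<Job>();
        thread::Builder::new()
            .name(thread_name.to_string())
            .spawn(move || {
                // Run queued jobs until every sender has been dropped.
                for job in rx {
                    job();
                }
            })
            .expect("failed to spawn IO thread");
        IoHandle { jobs }
    }

    /// Run `f` on the IO thread and block until its result comes back.
    pub fn run<T, F>(&self, f: F) -> T
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = channel();
        self.jobs
            .send(Box::new(move || {
                let _ = tx.send(f());
            }))
            .expect("IO thread stopped");
        rx.recv().expect("IO thread dropped the result")
    }
}

/// Private-adapter idea: the raw backend is kept unchanged, and every
/// operation is forwarded to it from the IO handle's thread.
pub struct RuntimeStorage<S> {
    inner: Arc<S>,
    io: IoHandle,
}

impl<S> RuntimeStorage<S> {
    pub fn new(inner: Arc<S>, io: IoHandle) -> Self {
        RuntimeStorage { inner, io }
    }
}

impl<S: Storage + Send + Sync + 'static> Storage for RuntimeStorage<S> {
    fn read(&self, path: &str) -> Vec<u8> {
        let inner = Arc::clone(&self.inner);
        let path = path.to_string();
        self.io.run(move || inner.read(&path))
    }
}
```

In this model, wrapping a `ThreadReportingStorage` with an `IoHandle::new("iceberg-io")` makes `read` report `iceberg-io`, while calling the raw backend directly reports the caller's own thread. The backend type itself never changes, which is the property the adapter is meant to preserve.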

Catalogs now propagate their configured runtime into the FileIO instances they build, and loaded Table values can be rebound with Table::with_runtime so object-cache reads use the same runtime-aware FileIO. The DataFusion catalog-backed provider also has runtime-aware constructors, so table loads can use the runtime-aware path while scan streams continue to be polled by DataFusion as before.
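
The propagation described above can be modeled roughly as follows. Only the name Table::with_runtime comes from this PR; `ModelRuntime`, `ModelFileIo`, `ModelCatalog`, and `ModelTable` are hypothetical types invented for illustration, and the real signatures may differ.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for Iceberg's Runtime; only its identity matters here.
pub struct ModelRuntime {
    pub label: String,
}

/// Hypothetical FileIO model that optionally remembers a runtime.
pub struct ModelFileIo {
    runtime: Option<Arc<ModelRuntime>>,
}

impl ModelFileIo {
    pub fn new() -> Self {
        ModelFileIo { runtime: None }
    }

    pub fn with_runtime(mut self, runtime: Arc<ModelRuntime>) -> Self {
        self.runtime = Some(runtime);
        self
    }

    pub fn runtime_label(&self) -> Option<&str> {
        self.runtime.as_ref().map(|r| r.label.as_str())
    }
}

pub struct ModelTable {
    name: String,
    file_io: ModelFileIo,
}

impl ModelTable {
    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn file_io(&self) -> &ModelFileIo {
        &self.file_io
    }

    /// Models the idea behind Table::with_runtime: rebind an already loaded
    /// table so later reads go through a runtime-aware FileIO.
    pub fn with_runtime(self, runtime: Arc<ModelRuntime>) -> Self {
        ModelTable {
            name: self.name,
            file_io: self.file_io.with_runtime(runtime),
        }
    }
}

pub struct ModelCatalog {
    pub runtime: Option<Arc<ModelRuntime>>,
}

impl ModelCatalog {
    /// The catalog passes its configured runtime (if any) into the FileIO it builds.
    pub fn load_table(&self, name: &str) -> ModelTable {
        let mut file_io = ModelFileIo::new();
        if let Some(rt) = &self.runtime {
            file_io = file_io.with_runtime(Arc::clone(rt));
        }
        ModelTable { name: name.to_string(), file_io }
    }
}
```

A table loaded from a catalog without a runtime keeps a plain FileIO until `with_runtime` is called on it, while a catalog configured with a runtime hands out runtime-aware tables from the start.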

This PR deliberately does not change scan partitioning, eager file planning, or the DataFusion physical plan shape.

Thanks to @toutane for the preliminary work and discussion around runtime-aware Iceberg/DataFusion execution that helped shape this direction.

Disclosure: this PR was implemented with assistance from Codex and reviewed before submission.

Are these changes tested?

  • cargo fmt --check
  • cargo test -p iceberg test_file_io_with_runtime_routes_storage_operations --locked
  • cargo test -p iceberg file_io --locked
  • cargo test -p iceberg-datafusion test_catalog_backed_provider --locked
  • cargo check -p iceberg -p iceberg-catalog-rest -p iceberg-catalog-glue -p iceberg-catalog-hms -p iceberg-catalog-s3tables -p iceberg-catalog-sql -p iceberg-storage-opendal -p iceberg-datafusion --locked
  • cargo public-api -p iceberg --all-features -ss | diff - crates/iceberg/public-api.txt
  • cargo public-api -p iceberg-datafusion --all-features -ss | diff - crates/integrations/datafusion/public-api.txt

geoffreyclaude (Author) commented

I missed #1945 when opening #2601. I have updated this PR to use #1945 as the primary tracking issue, and closed #2601 as a narrower duplicate/restatement so the runtime-segregation discussion stays consolidated.

geoffreyclaude force-pushed the geoffrey.claude/runtime-file-io branch from 467cec5 to 481b283 on June 11, 2026 06:25

Development

Successfully merging this pull request may close these issues.

Add Tokio Runtime Handle Configuration for OpenDAL Executor