Route FileIO through Iceberg runtime by geoffreyclaude · Pull Request #22 · DataDog/iceberg-rust

geoffreyclaude · 2026-06-05T12:43:56Z

Summary

This adds runtime-aware FileIO construction for callers that keep storage IO on a dedicated Tokio runtime.

When an IO runtime is configured, FileIO still caches the raw backend storage, but exposes operations through a private RuntimeStorage adapter. Direct storage calls, reader creation, byte-range reads through FileRead::read(range), writer creation, writer chunks, and writer close are dispatched through runtime.io().
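The adapter idea can be modelled without Tokio. The sketch below is only an illustration under stated assumptions: `Storage`, `IoHandle`, `MemoryStorage` and this `RuntimeStorage` are hypothetical std-only stand-ins rather than the types in this PR, and the real adapter is async and dispatches onto a Tokio handle instead of blocking on a worker thread.

```rust
use std::collections::HashMap;
use std::ops::Range;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for a runtime-agnostic backend (not the real Iceberg trait).
pub trait Storage: Send + Sync + 'static {
    fn read(&self, path: &str, range: Range<usize>) -> Option<Vec<u8>>;
}

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Toy "IO runtime": one named worker thread draining a job queue.
pub struct IoHandle {
    jobs: mpsc::Sender<Job>,
}

impl IoHandle {
    pub fn spawn(name: &str) -> Self {
        let (jobs, queue) = mpsc::channel::<Job>();
        thread::Builder::new()
            .name(name.to_string())
            .spawn(move || {
                for job in queue {
                    job();
                }
            })
            .expect("failed to spawn IO thread");
        IoHandle { jobs }
    }

    /// Runs `f` on the IO thread and blocks until its result comes back.
    pub fn run<T, F>(&self, f: F) -> T
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            let _ = tx.send(f());
        });
        self.jobs.send(job).expect("IO thread stopped");
        rx.recv().expect("IO job dropped its result")
    }
}

/// In-memory backend that records which thread served the last read.
pub struct MemoryStorage {
    files: HashMap<String, Vec<u8>>,
    last_thread: Mutex<Option<String>>,
}

impl MemoryStorage {
    pub fn new(files: HashMap<String, Vec<u8>>) -> Self {
        MemoryStorage { files, last_thread: Mutex::new(None) }
    }

    pub fn last_thread(&self) -> Option<String> {
        self.last_thread.lock().unwrap().clone()
    }
}

impl Storage for MemoryStorage {
    fn read(&self, path: &str, range: Range<usize>) -> Option<Vec<u8>> {
        let name = thread::current().name().unwrap_or("unnamed").to_string();
        *self.last_thread.lock().unwrap() = Some(name);
        self.files.get(path)?.get(range).map(|bytes| bytes.to_vec())
    }
}

/// Adapter: keeps the raw backend, but every call hops to the IO thread.
pub struct RuntimeStorage<S: Storage> {
    inner: Arc<S>,
    io: IoHandle,
}

impl<S: Storage> RuntimeStorage<S> {
    pub fn new(inner: Arc<S>, io: IoHandle) -> Self {
        RuntimeStorage { inner, io }
    }

    pub fn read(&self, path: &str, range: Range<usize>) -> Option<Vec<u8>> {
        let inner = Arc::clone(&self.inner);
        let path = path.to_string();
        self.io.run(move || inner.read(&path, range))
    }
}
```

The backend itself never learns about the runtime; only the wrapper decides where the call executes, which is the property the description relies on.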

The public API has two entry points:

FileIO::with_runtime(runtime) / FileIOBuilder::with_runtime(runtime) for callers that already have a full Iceberg Runtime.
FileIO::with_io_runtime(handle) / FileIOBuilder::with_io_runtime(handle) for callers that only need to route storage IO.

RestCatalogBuilder::with_file_io_runtime(handle) uses the IO-only path for long-lived REST catalogs without assigning a full table runtime. Existing CatalogBuilder::with_runtime(runtime) behavior remains the full-runtime path for loaded tables.

Concrete storage backends remain runtime-agnostic.

Runtime Impact

This PR intentionally routes storage operations, not every piece of Iceberg metadata processing.

Callers with separate runtimes can pass explicit handles with Runtime::new_with_handles(io, cpu). The storage adapter uses the IO half for storage scheduling.
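As a rough illustration of the two-handle shape, here is a std-only model. `Handle` and this `Runtime` are hypothetical: they borrow only the names `new_with_handles`, `io()` and `cpu()` from the description, and the real constructor presumably takes Tokio handles rather than thread names.

```rust
use std::thread;

/// Hypothetical handle: names the thread that submitted tasks run on.
#[derive(Clone, Debug)]
pub struct Handle {
    name: String,
}

impl Handle {
    pub fn new(name: &str) -> Self {
        Handle { name: name.to_string() }
    }

    /// Runs `task` on a fresh thread carrying this handle's name and waits for it.
    pub fn run<T, F>(&self, task: F) -> T
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        thread::Builder::new()
            .name(self.name.clone())
            .spawn(task)
            .expect("failed to spawn thread")
            .join()
            .expect("task panicked")
    }
}

/// Toy runtime holding separate IO and CPU halves.
pub struct Runtime {
    io: Handle,
    cpu: Handle,
}

impl Runtime {
    pub fn new_with_handles(io: Handle, cpu: Handle) -> Self {
        Runtime { io, cpu }
    }

    /// The half a storage adapter would schedule on.
    pub fn io(&self) -> &Handle {
        &self.io
    }

    /// The half existing CPU tasks would keep using.
    pub fn cpu(&self) -> &Handle {
        &self.cpu
    }
}

/// Helper used to observe where a task actually ran.
pub fn current_thread_name() -> String {
    thread::current().name().unwrap_or("unnamed").to_string()
}
```

Calling `runtime.io().run(current_thread_name)` on a runtime built from `Handle::new("io")` and `Handle::new("cpu")` reports the IO half, while `runtime.cpu()` reports the CPU half.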

Data-file Parquet scan CPU stays where the returned RecordBatchStream is polled. For DataFusion callers, decode, decompression, row filtering, projection, and Iceberg batch transformation remain on the query runtime polling the stream, while byte-range reads run through the IO runtime.
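The split between where bytes are fetched and where they are decoded can be sketched with a plain iterator standing in for the record batch stream. Everything here is an assumption made for illustration: `ScanStream`, `Batch` and `on_io` are hypothetical names, the "decode" step is just little-endian `u32` parsing, and a named thread per read stands in for the IO runtime.

```rust
use std::ops::Range;
use std::sync::Arc;
use std::thread;
use std::vec::IntoIter;

fn thread_name() -> String {
    thread::current().name().unwrap_or("unnamed").to_string()
}

/// Hypothetical stand-in for the IO runtime: runs `f` on a thread named "io".
fn on_io<T, F>(f: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::Builder::new()
        .name("io".to_string())
        .spawn(f)
        .expect("failed to spawn IO thread")
        .join()
        .expect("IO task panicked")
}

/// One decoded "batch" plus where each half of the work ran.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub values: Vec<u32>,
    pub read_on: String,
    pub decoded_on: String,
}

/// Toy scan stream: each `next()` fetches one byte range via the IO thread,
/// then decodes it on whichever thread is driving the iterator.
pub struct ScanStream {
    file: Arc<Vec<u8>>,
    ranges: IntoIter<Range<usize>>,
}

impl ScanStream {
    pub fn new(file: Vec<u8>, ranges: Vec<Range<usize>>) -> Self {
        ScanStream { file: Arc::new(file), ranges: ranges.into_iter() }
    }
}

impl Iterator for ScanStream {
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        let range = self.ranges.next()?;
        let file = Arc::clone(&self.file);
        // Storage half: the byte-range read crosses into the IO thread.
        // (Panics if the range is out of bounds; fine for a sketch.)
        let (bytes, read_on) = on_io(move || (file[range].to_vec(), thread_name()));
        // CPU half: decoding happens on the thread that polls the stream.
        let values: Vec<u32> = bytes
            .chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        Some(Batch { values, read_on, decoded_on: thread_name() })
    }
}
```

Driving this iterator from a thread named `query` yields batches whose `read_on` is `io` and whose `decoded_on` is `query`, mirroring the claim that decode work follows the poller while reads follow the IO runtime.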

Catalog-backed DataFusion provider paths now use the IO runtime only for catalog reloads. Scan building and plan_files() collection stay on the caller runtime; loaded tables still carry the Iceberg runtime, so manifest and FileIO operations dispatch through their own runtime-aware paths.

Manifest planning and delete metadata processing keep their existing scheduling behavior. Existing Iceberg tasks that already use runtime.cpu() continue to use the supplied CPU handle, but this PR does not add a custom CPU spawn/accounting hook and does not broaden metadata offloading.

Shape

Before, callers that wanted storage IO off the query runtime had to move the whole scan stream onto IO:

DataFusion / query runtime
  |
  +-- IOExec / wrapper
        |
        v
      IO runtime
        |
        +-- scan stream polling
        |     --> Parquet decode/decompression
        |     --> row filtering / projection
        |     --> Iceberg batch transformation
        |
        +-- FileIO / raw Storage backend
              --> byte-range reads

After, only storage work crosses into the IO runtime; scan CPU remains on the caller runtime:

DataFusion / caller runtime
  |
  +-- scan planning and stream polling
  |     --> Parquet decode/decompression
  |     --> row filtering / projection
  |     --> Iceberg batch transformation
  |
  +-- FileIO / InputFile / OutputFile
        |
        v
      RuntimeStorage (private adapter)
        |
        +-- runtime.io() --> raw Storage backend
                         --> FileRead::read(range)
                         --> FileWrite::{write, close}

Why This Layer

Parquet readers perform byte-range reads through Iceberg FileRead objects after scan planning. Routing at the storage adapter layer covers those IO operations without moving the whole DataFusion scan stream onto the IO runtime and without making concrete storage backends runtime-aware.

The REST catalog IO-only hook is for catalog construction paths that should route FileIO storage work to IO without assigning a long-lived CPU runtime to every loaded table. Query-created tables can still be rebound later with a full runtime.

Validation

cargo fmt --check
cargo check -p iceberg -p iceberg-catalog-rest -p iceberg-storage-opendal -p iceberg-datafusion --locked
cargo test -p iceberg file_io --locked
cargo test -p iceberg test_runtime_with_handles_uses_explicit_cpu_handle --locked
cargo test -p iceberg-catalog-rest test_load_table_with_file_io_runtime_routes_storage_to_io --locked
cargo test -p iceberg-datafusion test_catalog_backed_provider --locked
cargo test -p iceberg test_plan_files --locked

datadog-datadog-prod-us1-2 · 2026-06-05T12:44:24Z

⚠️ Warnings

🚦 1 Pipeline job failed

GitHub Actions Security Analysis with zizmor 🌈 | Run zizmor 🌈

🔗 Commit SHA: 7ae0d92

geoffreyclaude · 2026-06-06T18:17:26Z

@codex review

chatgpt-codex-connector · 2026-06-06T18:21:54Z

Codex Review: Didn't find any major issues. Chef's kiss.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

gabotechs · 2026-06-08T07:34:13Z

Any chance to contribute something like this upstream? I can imagine how this is a problem that anyone with CPU/IO runtime separation will have in the community

geoffreyclaude · 2026-06-08T07:37:34Z

> Any chance to contribute something like this upstream? I can imagine how this is a problem that anyone with CPU/IO runtime separation will have in the community

@gabotechs That's the goal for sure. But as discussed with @toutane about contributing #20 upstream, the runtime changes in our fork currently depend on apache#2298, which isn't merged upstream yet.

gabotechs

👍 Sounds good. Before merging, I'd probably give @toutane a chance to review this, as he is more familiar with this code.

Just skimmed through it, but saw nothing obviously wrong, so +1 for me

geoffreyclaude · 2026-06-08T08:40:17Z

@codex review

chatgpt-codex-connector · 2026-06-08T08:43:56Z

Codex Review: Didn't find any major issues. Keep them coming!

toutane

Looks very good indeed!

I'm not very familiar with the Storage trait but I think your wrapper looks very neat.

I think a refactor is now possible in DataFusion's table/mod.rs to more precisely separate what needs to run on the IO vs. the CPU runtime. Wdyt?

toutane · 2026-06-08T08:57:54Z

            .run_on_io(async move {
-                catalog
+                let table = catalog
                    .load_table(&table_ident)


Same thing here, we could use load_table_on_io now and get rid of the run_on_io call.

geoffreyclaude

Codex here: agreed. I updated insert_into to use load_table_on_io directly and removed run_on_io entirely.

The new shape is:

let table = Self::load_table_on_io(
    self.catalog.clone(),
    self.table_ident.clone(),
    self.runtime.as_ref(),
)
.await
.map_err(to_datafusion_error)?;

toutane · 2026-06-08T09:25:41Z

I think the run_on_io wrapping here is now redundant, and maybe even counter-productive given that load_table_on_io is available.

The only operations in this closure that genuinely benefit from being routed to the IO runtime are the load_table call and get_manifest_list inside plan_files.

Either we modify plan_files to spawn get_manifest_list on the IO runtime itself, or we narrow run_on_io down to just wrapping the plan_files call.

Happy to address it in a follow-up PR if it sounds good to you!

geoffreyclaude

That's a great point! Looks like we can indeed get rid of run_on_io completely, which will simplify things quite a bit.

Codex here: agreed, and I took the narrower version now instead of leaving it for a follow-up.

scan() now routes only the catalog refresh through load_table_on_io; scan building and plan_files() collection stay on the caller runtime. The loaded Table still carries the Iceberg runtime, so manifest/FileIO operations dispatch through their own runtime-aware paths.

let table = Self::load_table_on_io(...).await.map_err(to_datafusion_error)?;
let tasks: Vec<FileScanTask> = builder
    .build()
    .map_err(to_datafusion_error)?
    .plan_files()
    .await
    .map_err(to_datafusion_error)?
    .try_collect::<Vec<_>>()
    .await
    .map_err(to_datafusion_error)?;

toutane · 2026-06-08T09:39:13Z

> > Any chance to contribute something like this upstream? I can imagine how this is a problem that anyone with CPU/IO runtime separation will have in the community
>
> @gabotechs That's the goal for sure. But as discussed with @toutane about contributing #20 upstream, the runtime changes in our fork are currently depending on apache#2298, which isn't merged upstream yet.

@geoffreyclaude I think what we're lacking upstream is the try_new_with_runtime constructor (table/mod.rs).
It was added in https://github.com/DataDog/iceberg-rust/pull/20/changes, which itself depends on apache#2298, as it also brings the run_on_io wrapper for the scan method.

One possible solution would be to include try_new_with_runtime directly in the upstream PR?

geoffreyclaude · 2026-06-08T10:50:25Z

Codex update, replying to #22 (comment):

@toutane @gabotechs we now have a dedicated upstream path for this, independent of apache#2298:

Upstream tracking issue: Add Tokio Runtime Handle Configuration for OpenDAL Executor apache/iceberg-rust#1945
Upstream draft PR: feat(io): route FileIO through configured runtime apache/iceberg-rust#2602

The draft PR includes the DataFusion try_new_with_runtime constructors discussed here, plus the FileIO/Storage runtime routing needed to keep storage work on runtime.io(). It deliberately does not change scan partitioning, eager file planning, or DataFusion physical-plan shape.

Could you both take an initial look when you have a chance, especially at whether the API shape matches the direction you had in mind?

geoffreyclaude force-pushed the geoffrey.claude/runtime-storage-io branch 4 times, most recently from 8e8f82a to 878a9a9 Compare June 5, 2026 14:13

geoffreyclaude marked this pull request as ready for review June 5, 2026 14:16

geoffreyclaude force-pushed the geoffrey.claude/runtime-storage-io branch 5 times, most recently from c96a871 to 4c1d3b1 Compare June 6, 2026 11:40

geoffreyclaude force-pushed the geoffrey.claude/runtime-storage-io branch 3 times, most recently from 9a14913 to 6e0b315 Compare June 7, 2026 15:04

gabotechs approved these changes Jun 8, 2026

geoffreyclaude force-pushed the geoffrey.claude/runtime-storage-io branch from 6e0b315 to a434203 Compare June 8, 2026 08:31

toutane approved these changes Jun 8, 2026

Route FileIO through Iceberg runtime

7ae0d92

geoffreyclaude force-pushed the geoffrey.claude/runtime-storage-io branch from a434203 to 7ae0d92 Compare June 8, 2026 09:48

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: queued and removed mergequeue-status: queued labels Jun 8, 2026

geoffreyclaude merged commit 79b97fd into DataDog:branch-0.9 Jun 8, 2026
20 of 22 checks passed

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: in_progress mergequeue-status: done and removed mergequeue-status: in_progress labels Jun 8, 2026
