
feat(python): Python binding for iceberg-rust FileIO#2

Draft
abnobdoss wants to merge 7 commits into main from fileio-binding-poc

@abnobdoss abnobdoss commented May 24, 2026

Owner

Status:
Deferred fork-only draft.

Summary:
Exposes iceberg-rust FileIO concepts to Python as runtime classes for later scan and table bindings.

Scope:
Runtime binding only. Typing stubs are handled separately.

No external issue references.

Abanoub Doss added 6 commits May 24, 2026 16:24
…ore_rust

Add `bytes = "1"` to the Python binding's Cargo.toml (needed for
explicit byte-slice conversion in file_io.rs) and register
file_io::register_module in lib.rs, placing it alongside the existing
transform/manifest registrations.
…-rust FileIO

Exposes iceberg-rust's `FileIO` to Python via three pyclasses:

- `FileIO.from_props(dict)`: primary constructor. It uses the same
  OpenDalResolvingStorageFactory plumbing already used by
  IcebergDataFusionTable, but returns a reusable handle instead of
  discarding it after construction, so callers can amortize setup across
  thousands of file opens in a single query.
- `FileIO.exists(path)` / `FileIO.delete(path)`: async ops via the
  shared Tokio runtime handle.
- `FileIO.new_input(path)` / `FileIO.new_output(path)`: sync
  (InputFile/OutputFile hold the storage Arc internally).
- `InputFile.read()` → `bytes`, `InputFile.exists()`, `InputFile.metadata()` → dict.
- `OutputFile.write(bytes)`: one-shot write.
- `__repr__` on FileIO redacts any key containing secret/key/token/password/credential/passphrase.
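The redaction rule can be illustrated in plain Python. This is only a sketch of substring-based key matching, not the binding's code: the real logic lives in the Rust `file_io.rs`. The helper names `redact_props` and `repr_file_io`, the `<redacted>` mask, the case-insensitive match, and the output format are all assumptions made for illustration.

```python
# Illustrative sketch only; the actual redaction is implemented in Rust.
# The marker list is taken from the commit message above. Everything else
# (mask text, case handling, repr layout) is assumed.
SENSITIVE_MARKERS = ("secret", "key", "token", "password", "credential", "passphrase")


def redact_props(props: dict) -> dict:
    """Return a copy of props with values masked for sensitive-looking keys."""
    return {
        name: "<redacted>"
        if any(marker in name.lower() for marker in SENSITIVE_MARKERS)
        else value
        for name, value in props.items()
    }


def repr_file_io(props: dict) -> str:
    """Build a repr-style string that never includes a sensitive value."""
    shown = ", ".join(
        f"{name}={value!r}" for name, value in sorted(redact_props(props).items())
    )
    return f"FileIO({shown})"
```

Because the match is a substring test, a broad marker such as `key` also masks keys that are not credentials. That errs on the side of hiding too much rather than too little.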

Bare signatures for FileIO, InputFile, and OutputFile with a module-level
docstring explaining from_props(dict) as the primary constructor and the
credential-redaction behaviour of __repr__.

30 tests covering:
- from_props construction (empty dict, partial props, handle independence)
- __repr__ credential redaction for 7 sensitive key patterns
- exists/delete via FileIO
- OutputFile.write (create, overwrite, empty bytes)
- InputFile.exists, read, metadata
- round-trip write→read
- repr format for InputFile and OutputFile

All tests use tmp_path for filesystem isolation; no network deps.
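
The write→read round trip exercised by these tests has roughly the following call shape. Since the binding's import path is not given here, the sketch runs against a hypothetical local-filesystem stand-in (`LocalFileIO`, `LocalInputFile`, `LocalOutputFile`) that mirrors only the method names listed above. The stand-in classes, the `round_trip` helper, and their error behaviour are assumptions, not the binding's implementation.

```python
import os
import tempfile
from pathlib import Path


class LocalInputFile:
    """Hypothetical stand-in for InputFile, backed by the local filesystem."""

    def __init__(self, path):
        self._path = Path(path)

    def exists(self) -> bool:
        return self._path.exists()

    def read(self) -> bytes:
        return self._path.read_bytes()


class LocalOutputFile:
    """Hypothetical stand-in for OutputFile: one-shot write, overwrites."""

    def __init__(self, path):
        self._path = Path(path)

    def write(self, data: bytes) -> None:
        self._path.write_bytes(data)


class LocalFileIO:
    """Hypothetical stand-in exposing the FileIO method names from the PR."""

    def exists(self, path) -> bool:
        return Path(path).exists()

    def delete(self, path) -> None:
        Path(path).unlink()

    def new_input(self, path) -> LocalInputFile:
        return LocalInputFile(path)

    def new_output(self, path) -> LocalOutputFile:
        return LocalOutputFile(path)


def round_trip(file_io, path, payload: bytes) -> bytes:
    """Write payload through file_io, then read it back through the same handle."""
    file_io.new_output(path).write(payload)
    return file_io.new_input(path).read()


# One handle is reused for every operation, as the PR intends for FileIO.
with tempfile.TemporaryDirectory() as tmp_dir:
    io = LocalFileIO()
    target = os.path.join(tmp_dir, "data.bin")
    assert round_trip(io, target, b"hello") == b"hello"
    io.delete(target)
    assert not io.exists(target)
```

`round_trip` only relies on duck typing, so the same helper should work with a real `FileIO` handle if its methods behave as described above.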
@abnobdoss abnobdoss force-pushed the fileio-binding-poc branch from f4c0bcc to 9cb6fd8 Compare June 3, 2026 00:26
@abnobdoss abnobdoss force-pushed the fileio-binding-poc branch from 9cb6fd8 to 7fadc05 Compare June 3, 2026 00:37