@CTTY
Collaborator

@CTTY CTTY commented Oct 17, 2025

Which issue does this PR close?

What changes are included in this PR?

Are these changes tested?

Added some tests for the new storage builder and registry, but this PR mostly relies on the existing tests.

}

#[async_trait]
impl Storage for OpenDALGcsStorage {
Member

It can be pretty annoying to implement nearly the same thing for every storage service. Can we avoid that?

Collaborator Author

I think this thread is relevant: https://docs.google.com/document/d/1-CEvRvb52vPTDLnzwJRBx5KLpej7oSlTu_rg0qKEGZ8/edit?disco=AAABrRO9Prk

The benefit of doing this is that users can implement Storage for only the schemes they need. The annoyance of duplicated code across schemes mostly applies to a versatile storage implementation like OpenDAL, which already has a convenient operator layer. For custom storage, I don't expect users to implement all schemes anyway (I may be wrong about this assumption).

As for code duplication, I consider OpenDAL Storage to be the "managed" default storage that lives in this repo, and we will have more control over its implementation. Once we have a new crate for each storage implementation (for example, iceberg-storage-opendal), we can add some helpers to reduce the code duplication.
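As a rough illustration of the trade-off described above, here is a minimal sketch of per-scheme implementations that share one helper layer. Everything in it is hypothetical: the trait is a simplified, synchronous stand-in for the `Storage` trait in this PR, `InMemoryOperator` only plays the role of an operator layer (it is not the OpenDAL API), and `GcsStorage`, `S3Storage`, `strip_scheme`, and `impl_storage!` are names invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the `Storage` trait in this PR.
/// The real trait is async; this one is synchronous to stay self-contained.
trait Storage {
    /// URI scheme handled by this implementation, e.g. "gs" or "s3".
    fn scheme(&self) -> &'static str;
    fn read(&self, path: &str) -> Result<Vec<u8>, String>;
    fn write(&mut self, path: &str, data: Vec<u8>) -> Result<(), String>;
}

/// Stand-in for a shared operator layer (an assumption, not the OpenDAL API).
#[derive(Default)]
struct InMemoryOperator {
    objects: HashMap<String, Vec<u8>>,
}

impl InMemoryOperator {
    fn read(&self, key: &str) -> Result<Vec<u8>, String> {
        self.objects
            .get(key)
            .cloned()
            .ok_or_else(|| format!("not found: {key}"))
    }

    fn write(&mut self, key: &str, data: Vec<u8>) {
        self.objects.insert(key.to_string(), data);
    }
}

/// Strip "<scheme>://" from a path, rejecting paths that use another scheme.
fn strip_scheme<'a>(scheme: &str, path: &'a str) -> Result<&'a str, String> {
    path.strip_prefix(scheme)
        .and_then(|rest| rest.strip_prefix("://"))
        .ok_or_else(|| format!("path {path} does not use scheme {scheme}"))
}

/// Helper that generates the near-identical per-scheme implementations.
macro_rules! impl_storage {
    ($name:ident, $scheme:literal) => {
        #[derive(Default)]
        struct $name {
            op: InMemoryOperator,
        }

        impl Storage for $name {
            fn scheme(&self) -> &'static str {
                $scheme
            }

            fn read(&self, path: &str) -> Result<Vec<u8>, String> {
                self.op.read(strip_scheme($scheme, path)?)
            }

            fn write(&mut self, path: &str, data: Vec<u8>) -> Result<(), String> {
                self.op.write(strip_scheme($scheme, path)?, data);
                Ok(())
            }
        }
    };
}

impl_storage!(GcsStorage, "gs");
impl_storage!(S3Storage, "s3");
```

Each generated type accepts only its own scheme, so a custom storage could implement a single scheme by hand while the "managed" one stamps out several from the same helper.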

Collaborator Author

I'm keeping the current implementation as a unified storage (one OpenDalStorage for all backends). We can split it up later if needed.
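For comparison, a unified storage could look roughly like the sketch below: a single type that picks the backend from the path's scheme internally. This is an illustration under assumptions, not the `OpenDalStorage` code in this PR; `Backend`, `UnifiedStorage`, the scheme aliases, and the in-memory map used in place of real backends are all made up for the example.

```rust
use std::collections::HashMap;

/// Hypothetical set of backends handled by one unified storage type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Backend {
    Memory,
    S3,
    Gcs,
}

impl Backend {
    /// Split "<scheme>://<rest>" and map the scheme to a backend.
    /// The accepted aliases here are illustrative assumptions.
    fn from_path(path: &str) -> Result<(Backend, &str), String> {
        let (scheme, rest) = path
            .split_once("://")
            .ok_or_else(|| format!("missing scheme in {path}"))?;
        let backend = match scheme {
            "memory" => Backend::Memory,
            "s3" | "s3a" => Backend::S3,
            "gs" | "gcs" => Backend::Gcs,
            other => return Err(format!("unsupported scheme: {other}")),
        };
        Ok((backend, rest))
    }
}

/// One storage type for all backends; an in-memory map stands in for real I/O.
#[derive(Default)]
struct UnifiedStorage {
    objects: HashMap<(Backend, String), Vec<u8>>,
}

impl UnifiedStorage {
    fn write(&mut self, path: &str, data: Vec<u8>) -> Result<(), String> {
        let (backend, key) = Backend::from_path(path)?;
        self.objects.insert((backend, key.to_string()), data);
        Ok(())
    }

    fn read(&self, path: &str) -> Result<Vec<u8>, String> {
        let (backend, key) = Backend::from_path(path)?;
        self.objects
            .get(&(backend, key.to_string()))
            .cloned()
            .ok_or_else(|| format!("not found: {path}"))
    }
}
```

The dispatch lives in one place (`Backend::from_path`), which avoids the per-service duplication at the cost of making every caller depend on all backends the type knows about.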

@Xuanwo
Member

Xuanwo commented Nov 3, 2025

Hi, @liurenjie1024 and @CTTY, I have thought about this again. I think we can split the concepts of fileio and fileio-provider (or, using the terms in this PR, Storage and FileIO).

As discussed in #1797, sometimes people don't care about FileIO (the thing in Java) at all. They will initialize, build, and manage their own IO abstraction and only want it to be used as Storage in iceberg-rust.

So, I think we should make FileIO optional in the future and depend only on Storage. However, this can be tricky since we do need the ability to build a storage from s3://bucket/name. We can discuss this face to face and come up with a proposal.
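To make the tricky part concrete, here is one possible shape for building a storage from a location such as s3://bucket/name: a registry of builders keyed by scheme. This is only a sketch under assumptions, not the proposal itself; `StorageRegistry`, `StorageBuilder`, `S3Like`, and `build_s3_like` are hypothetical names, and the trait is reduced to a single method.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, minimal stand-in for the `Storage` trait.
trait Storage: Send + Sync {
    fn name(&self) -> String;
}

/// Illustrative storage that only remembers its bucket.
struct S3Like {
    bucket: String,
}

impl Storage for S3Like {
    fn name(&self) -> String {
        format!("s3-like:{}", self.bucket)
    }
}

/// A builder receives the bucket part of the location (an assumed convention).
type StorageBuilder = fn(&str) -> Arc<dyn Storage>;

/// Maps URI schemes to storage builders.
#[derive(Default)]
struct StorageRegistry {
    builders: HashMap<String, StorageBuilder>,
}

impl StorageRegistry {
    fn register(&mut self, scheme: &str, builder: StorageBuilder) {
        self.builders.insert(scheme.to_string(), builder);
    }

    /// Resolve "<scheme>://<bucket>/<key>" to a storage via the registered builder.
    fn resolve(&self, location: &str) -> Result<Arc<dyn Storage>, String> {
        let (scheme, rest) = location
            .split_once("://")
            .ok_or_else(|| format!("missing scheme in {location}"))?;
        let bucket = rest.split('/').next().unwrap_or("");
        let builder = self
            .builders
            .get(scheme)
            .copied()
            .ok_or_else(|| format!("no storage registered for scheme {scheme}"))?;
        Ok(builder(bucket))
    }
}

fn build_s3_like(bucket: &str) -> Arc<dyn Storage> {
    Arc::new(S3Like {
        bucket: bucket.to_string(),
    })
}
```

A caller who manages their own IO abstraction could skip the registry and hand over an `Arc<dyn Storage>` directly; the registry would only be needed when a storage has to be derived from a location string.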

@liurenjie1024
Contributor

Hi, @liurenjie1024 and @CTTY, I have thought about this again. I think we can split the concepts of fileio and fileio-provider (or, using the terms in this PR, Storage and FileIO).

As discussed in #1797, sometimes people don't care about FileIO (the thing in Java) at all. They will initialize, build, and manage their own IO abstraction and only want it to be used as Storage in iceberg-rust.

So, I think we should make FileIO optional in the future and depend only on Storage. However, this can be tricky since we do need the ability to build a storage from s3://bucket/name. We can discuss this face to face and come up with a proposal.

I'm a little confused about your point. Would you mind writing a more detailed proposal?

@Xuanwo
Member

Xuanwo commented Nov 3, 2025

I'm a little confused about your point. Would you mind writing a more detailed proposal?

Will do

@CTTY CTTY force-pushed the ctty/storage-trait branch from 322ae2a to 459dc73 Compare December 11, 2025 20:51
@CTTY CTTY marked this pull request as ready for review December 11, 2025 20:59
@liurenjie1024
Contributor

I have concerns about this PR: it leaves the whole crate in a state inconsistent with our design. I think that to follow our design, you don't need to change anything existing. We could name our new trait Storage2 and add new implementations gradually. Once the new storage implementation reaches feature parity, we could use one PR to switch to it.

Development

Successfully merging this pull request may close these issues.

Make FileIO a Trait