feat(rest): vend storage credentials into per-table FileIO#1
Closed
huan233usc wants to merge 1 commit into
Closed
Conversation
The REST catalog declared the `.../credentials` endpoint path but `LoadTableResult` discarded the `storage-credentials` the server returns, so credential-vending catalogs could not access table data end to end. - Add a `StorageCredential` type and parse the `storage-credentials` list on `LoadTableResult` (load/create/register/stage responses), with JSON round-trip support. - Add `MakeTableFileIO`, which builds a table-scoped FileIO by layering the most-specific (longest-prefix) vended credential and the table's `config` on top of the catalog configuration, reusing the shared catalog FileIO when there are no overrides. This also resolves the per-table FileIO FIXME in `RestCatalog::LoadTable`. - Wire it through LoadTable / CreateTable / RegisterTable / StageCreateTable. - Unit tests for serde and for credential selection / FileIO layering. Co-authored-by: Isaac
39348b8 to
c016eb5
Compare
Owner
Author
|
Superseded by the upstream PR: apache#729 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What & why
The REST catalog already declared the
.../namespaces/{ns}/tables/{tbl}/credentialsendpoint path, but
LoadTableResultdiscarded thestorage-credentialsthe serverreturns. As a result, catalogs that vend per-table storage credentials could not be
used end to end — the client had no credentials to read/write the table's data files.
This wires up vended storage credentials and, in doing so, resolves the existing
FIXME: support per-table FileIO creationinRestCatalog::LoadTable.What changed
StorageCredentialtype (prefix+config) and a parsedLoadTableResult.storage_credentialslist, with JSON round-trip support incatalog/rest/json_serde.cc.MakeTableFileIO(catalog/rest/rest_file_io.cc): builds a table-scoped FileIO bylayering, in increasing precedence, the catalog configuration → the table's
config→the most-specific (longest-prefix) vended credential, following the REST spec guidance
to "choose the most specific prefix". When the table supplies no overrides, the shared
catalog FileIO instance is reused. The existing
MakeCatalogFileIOis untouched, sothere is no behavior change for non-vending servers.
MatchStorageCredential: longest-prefix selection helper.LoadTable,CreateTable,RegisterTable, andStageCreateTable.(
UpdateTableis intentionally left alone — itsCommitTableResponsedoes not carrystorage credentials in the spec.)
Known limitations / follow-ups
Surfaced during self-review; all are pre-existing structural gaps that this PR narrows
but does not fully close:
UpdateTablestill returns aTablebound to the shared catalogFileIO (the spec's
CommitTableResponsecarries no credentials), so a table obtainedfrom a commit loses vended credentials that the equivalent
LoadTablewould have.and baked into a single FileIO. Tables whose data/metadata span multiple credential
prefixes (e.g. split
write.data.path/write.metadata.path) would need per-pathcredential routing (Java routes per-prefix inside
S3FileIO).tokens; there is no refresh hook yet. The
.../credentialsendpoint(
Endpoint::TableCredentials()) exists and is the natural follow-up for refresh.table config or credentials; a cache keyed by merged properties would avoid repeated
S3 client construction.
X-Iceberg-Access-Delegation: vended-credentialsbydefault (Java does); servers that gate vending on that header require it to be set
via the
header.catalog property for now.Testing
Built with CMake + Ninja (RelWithDebInfo, Clang) against Arrow 24.0.0.
storage-credentials(incl. multiplecredentials);
MatchStorageCredentiallongest-prefix / no-match;MakeTableFileIOreuses the catalog FileIO when there are no overrides, and appliesthe most-specific vended credential over both the catalog and table config.
LoadTableparses thevended credentials and
MakeTableFileIOselects and applies them when constructingthe per-table FileIO.
This pull request and its description were written by Isaac.