@okumin
Contributor

@okumin okumin commented Sep 18, 2025

What changes were proposed in this pull request?

Add RFC 7662 or RFC 9068-based OAuth 2 support.

https://issues.apache.org/jira/browse/HIVE-29020
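
For context, RFC 7662 defines token introspection: the Resource Server POSTs the received token to the Authorization Server's introspection endpoint and trusts the `active` flag in the JSON reply. A minimal sketch of building such a request with the JDK's `java.net.http` follows. The class name, endpoint URL, and client credentials are placeholders for illustration, not names from this patch, and HTTP Basic is only one common way to authenticate to the endpoint.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of the patch.
final class IntrospectionRequestSketch {

  // RFC 7662 section 2.1: the token travels as a form-encoded "token" parameter.
  static String formBody(String accessToken) {
    return "token=" + URLEncoder.encode(accessToken, StandardCharsets.UTF_8)
        + "&token_type_hint=access_token";
  }

  // Builds (but does not send) the introspection request.
  // Assumption: the Authorization Server accepts HTTP Basic client authentication
  // and the client id and secret contain no characters that need extra encoding.
  static HttpRequest build(URI endpoint, String clientId, String clientSecret, String accessToken) {
    String basic = Base64.getEncoder()
        .encodeToString((clientId + ":" + clientSecret).getBytes(StandardCharsets.UTF_8));
    return HttpRequest.newBuilder(endpoint)
        .header("Authorization", "Basic " + basic)
        .header("Content-Type", "application/x-www-form-urlencoded")
        .header("Accept", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(formBody(accessToken)))
        .build();
  }

  public static void main(String[] args) {
    // Placeholder endpoint and credentials.
    HttpRequest request = build(
        URI.create("https://idp.example/oauth2/introspect"), "hms", "secret", "opaque-token");
    System.out.println(request.method() + " " + request.uri());
  }
}
```

The request is only built here, not sent. A real caller would send it with `HttpClient` and reject the access token unless the response contains `"active": true`.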

Why are the changes needed?

OAuth 2 is the most common authorization method for protecting the Iceberg REST Catalog. As the Iceberg library officially supports it, Iceberg users would be able to use it without custom patches if we were to support it. Additionally, OAuth 2 is an industry-standard protocol as of 2025.

The security principles and considerations are described in the following document.

https://docs.google.com/document/d/1wOlmpKP4jZb4Je67wusdMoQ4VWgGfi5JLe8w3MP5mpE/edit?usp=sharing

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added integration tests.

@github-actions

github-actions bot commented Sep 18, 2025

@check-spelling-bot Report

🔴 Please review

See the files view or the action log for details.

Unrecognized words (6)

calcualtion
Chrono
getenv
ntz
OOM
unsign

Previously acknowledged words that are now absent: www
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:okumin/hive.git repository
on the HIVE-29020-oauth2 branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
-H "Content-Type: application/json" \
"https://api.github.com/repos/apache/hive/issues/comments/3305215479" > "$comment_json"
comment_body=$(mktemp)
jq -r ".body // empty" "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")

patch_add=$(perl -e '$/=undef; $_=<>; if (m{Unrecognized words[^<]*</summary>\n*```\n*([^<]*)```\n*</details>$}m) { print "$1" } elsif (m{Unrecognized words[^<]*\n\n((?:\w.*\n)+)\n}m) { print "$1" };' < "$comment_body")

update_files
rm $comment_body
git add -u
If the flagged items do not appear to be text

If items relate to a ...

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

  • binary file.

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

}

enum Route {
TOKENS(HTTPMethod.POST, "v1/oauth/tokens", null),
Contributor Author

The initial specification of Iceberg REST allows the Iceberg REST Catalog to act as an Authorization Server, which is why the list contains this endpoint. As this does not follow security best practices, it will be removed.

Contributor

It will be removed with Iceberg spec 2.0; at this point, it may be deprecated but should not be removed yet, should it?

Contributor Author

@okumin okumin Sep 30, 2025

Yes, it should be removed.
The token endpoint belongs to an Authorization Server, not a Resource Server; typically that is an Identity Provider such as Keycloak or Okta. The token endpoint is responsible for issuing a secure (e.g., cryptographically signed) Access Token.
Our endpoint is just copied and pasted from iceberg-core and is almost an echo server (no authentication happens). It is possible to embed the Authorization Server role in HMS; however, in that case we would need to implement an Authorization Server compliant with RFC 6749, RFC 9068, or another RFC. Otherwise, it would have no users (who wants OAuth 2 that allows unauthenticated access?).

private OAuthTokenResponse tokens(Object body) {
  @SuppressWarnings("unchecked")
  Map<String, String> request = (Map<String, String>) castRequest(Map.class, body);
  String grantType = request.get(GRANT_TYPE);
  switch (grantType) {
    case CLIENT_CREDENTIALS:
      return OAuthTokenResponse.builder()
          .withToken("client-credentials-token:sub=" + request.get(CLIENT_ID))
          .withIssuedTokenType(URN_OAUTH_ACCESS_TOKEN)
          .withTokenType(BEARER)
          .build();
    case URN_OAUTH_TOKEN_EXCHANGE:
      String actor = request.get(ACTOR_TOKEN);
      String token =
          String.format(
              "token-exchange-token:sub=%s%s",
              request.get(SUBJECT_TOKEN), actor != null ? ",act=" + actor : "");
      return OAuthTokenResponse.builder()
          .withToken(token)
          .withIssuedTokenType(URN_OAUTH_ACCESS_TOKEN)
          .withTokenType(BEARER)
          .build();

LOG.warn("Rejecting an expired JWT: {}", parsedJwt.getPayload());
throw new AuthenticationException("JWT (ends with " + lastSevenChars + ") has been expired");
}
public JWTValidator(Set<JOSEObjectType> acceptableTypes, List<URL> jwksURLs, String expectedIssuer,
Contributor Author

I replaced our custom implementation, following the article below.
https://connect2id.com/products/nimbus-jose-jwt/examples/validating-jwt-access-tokens
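
A minimal sketch of the claim checks that RFC 9068 layers on top of signature verification, assuming the signature has already been verified against the JWKS by a JOSE library (the linked article uses Nimbus for this). `AccessTokenClaimsCheck` and `reject` are hypothetical names for illustration, not the `JWTValidator` in this patch, and the sketch uses only the JDK.

```java
import java.time.Instant;
import java.util.Collection;
import java.util.Map;

// Hypothetical helper for illustration; assumes the JWT signature was already verified.
final class AccessTokenClaimsCheck {

  // Returns null when the claims are acceptable, otherwise a short rejection reason.
  static String reject(String typ, Map<String, Object> claims,
      String expectedIssuer, String expectedAudience, Instant now) {
    // RFC 9068: the typ header must be "at+jwt" or "application/at+jwt".
    if (typ == null
        || !(typ.equalsIgnoreCase("at+jwt") || typ.equalsIgnoreCase("application/at+jwt"))) {
      return "unexpected typ";
    }
    // The issuer must be exactly the Authorization Server we trust.
    if (!expectedIssuer.equals(claims.get("iss"))) {
      return "unexpected iss";
    }
    // "aud" may be a single string or an array; it must name this Resource Server.
    Object aud = claims.get("aud");
    boolean audienceMatches = aud instanceof Collection
        ? ((Collection<?>) aud).contains(expectedAudience)
        : expectedAudience.equals(aud);
    if (!audienceMatches) {
      return "unexpected aud";
    }
    // "exp" is seconds since the epoch; the token is valid only strictly before it.
    Object exp = claims.get("exp");
    if (!(exp instanceof Number)
        || !now.isBefore(Instant.ofEpochSecond(((Number) exp).longValue()))) {
      return "expired or missing exp";
    }
    return null;
  }

  public static void main(String[] args) {
    // Placeholder issuer and audience values.
    Instant now = Instant.now();
    Map<String, Object> claims = Map.of(
        "iss", "https://idp.example",
        "aud", "hms",
        "exp", now.plusSeconds(60).getEpochSecond());
    System.out.println(reject("at+jwt", claims, "https://idp.example", "hms", now));
  }
}
```

Checking `typ` first keeps other JWTs from the same issuer, such as ID tokens, from being accepted as access tokens.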


.withEnv("KEYCLOAK_ADMIN","admin")
.withEnv("KEYCLOAK_ADMIN_PASSWORD","admin")
.withCommand("start-dev")
.withExposedPorts(8080);
Contributor Author

I initially tried to use spring-security-oauth2-authorization-server but gave up because it has hard dependency conflicts with the spring-core pulled in by Apache Atlas and with the Nimbus library we're using.

Member

@deniskuzZ deniskuzZ left a comment

LGTM +1, thanks @okumin for adding OAuth2 support in RestCatalog
pending tests

@okumin
Contributor Author

okumin commented Sep 28, 2025

I'm testing this locally and will add setup instructions to hive-site. I will merge this after that.

Contributor

@henrib henrib left a comment

Should we remove the oauth/tokens endpoint yet?

@okumin okumin merged commit e44cf34 into apache:master Sep 30, 2025
2 checks passed
@okumin okumin deleted the HIVE-29020-oauth2 branch September 30, 2025 04:31
@okumin
Contributor Author

okumin commented Sep 30, 2025

@deniskuzZ @henrib Thanks for your reviews!
