AWS: restCredentialsProvider should return StsAssumeRoleCredentialsProvider when assume role configurations are set #16667

@NathanCYee

Feature Request / Improvement

Currently, RESTSigV4AuthSession calls the restCredentialsProvider function in AwsProperties to obtain the credentials provider used to sign SigV4 requests.

The restCredentialsProvider function follows this decision chain:

  1. If accessKeyId and secretAccessKey (and optionally sessionToken) are set -> return a StaticCredentialsProvider.
  2. If clientCredentialsProvider is set -> load the class using DynClasses and pass the property map clientCredentialsProviderProperties to the class's create function.
  3. Otherwise -> return the default credential chain using DefaultCredentialsProvider.

However, this decision chain ignores the assume-role configuration when client.factory=org.apache.iceberg.aws.AssumeRoleAwsClientFactory. The factory defines a role to assume, and that role IS used for S3FileIO, Glue Catalog, KMS, and DynamoDB operations but ISN'T used for REST catalog requests, so REST catalog operations run with different permissions from every other client.
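
The chain and the gap described above can be sketched as follows. This is a minimal, self-contained illustration and not the Iceberg source: ProviderKind, RestCredentialChainSketch, and select are hypothetical names, and the enum only models which kind of provider would be returned (the real function returns an AwsCredentialsProvider).

```java
// Hypothetical stand-in: models which kind of provider the chain would pick.
// It is not part of Iceberg or the AWS SDK.
enum ProviderKind { STATIC, CUSTOM, DEFAULT_CHAIN }

final class RestCredentialChainSketch {
  private RestCredentialChainSketch() {}

  // Mirrors steps 1-3 of the decision chain described in this issue.
  // assumeRoleArn is accepted only to show that no step ever reads it.
  static ProviderKind select(
      String accessKeyId,
      String secretAccessKey,
      String credentialsProviderClass,
      String assumeRoleArn) {
    if (isSet(accessKeyId) && isSet(secretAccessKey)) {
      return ProviderKind.STATIC; // step 1: static keys win
    }
    if (isSet(credentialsProviderClass)) {
      return ProviderKind.CUSTOM; // step 2: user-supplied provider class
    }
    return ProviderKind.DEFAULT_CHAIN; // step 3: default chain, even if a role ARN is set
  }

  private static boolean isSet(String value) {
    return value != null && !value.isEmpty();
  }
}
```

In this sketch, calling select(null, null, null, "arn:aws:iam::123456789012:role/example") still yields DEFAULT_CHAIN, which is the divergence this issue is about. The ARN is a placeholder.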

Workarounds attempted:
AwsProperties allows the definition of alternative credential providers under client.credentials-provider. When this property is set, restCredentialsProvider will return the custom provider instead of the default chain.
However, this property is not compatible with software.amazon.awssdk.services.sts.auth.StsAssumeRoleCredentialsProvider, because it requires the credentials provider class to expose a static create() or create(Map) function, whereas the assume-role provider is constructed through the builder pattern.
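
To make the mismatch concrete, here is a small sketch using plain Java reflection. All classes in it (FactoryStyleProvider, BuilderStyleProvider, CreateMethodLookupSketch) are hypothetical stand-ins, and Iceberg's actual lookup goes through its own dynamic-loading helpers, so the details may differ. The point is only that a class offering builder() but no create() or create(Map) fails this kind of lookup.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Map;

// Hypothetical stand-in for a provider that exposes a static create() factory.
final class FactoryStyleProvider {
  static FactoryStyleProvider create() {
    return new FactoryStyleProvider();
  }
}

// Hypothetical stand-in for a provider that can only be built via a builder.
final class BuilderStyleProvider {
  private BuilderStyleProvider() {}

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    BuilderStyleProvider build() {
      return new BuilderStyleProvider();
    }
  }
}

final class CreateMethodLookupSketch {
  private CreateMethodLookupSketch() {}

  // True if the class declares a static create() or create(Map) method.
  static boolean hasCreateMethod(Class<?> type) {
    return hasStatic(type, "create") || hasStatic(type, "create", Map.class);
  }

  private static boolean hasStatic(Class<?> type, String name, Class<?>... params) {
    try {
      Method method = type.getDeclaredMethod(name, params);
      return Modifier.isStatic(method.getModifiers());
    } catch (NoSuchMethodException e) {
      return false; // no matching factory method on this class
    }
  }
}
```

With these stand-ins, hasCreateMethod accepts FactoryStyleProvider and rejects BuilderStyleProvider, which is the position the assume-role provider ends up in when configured through client.credentials-provider.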

Potential Design:
The fastest path would be to update the credentialsProvider(String accessKeyId, String secretAccessKey, String sessionToken) function of AwsProperties to create and return a StsAssumeRoleCredentialsProvider if this.clientAssumeRoleArn is set. AwsProperties would then duplicate the createCredentialsProvider function of AssumeRoleAwsClientFactory. The check could go between steps 2 and 3 of the restCredentialsProvider decision chain: before returning the default credentials chain, check clientAssumeRoleArn and, if it is set, return the assume-role credentials provider.

A more complete path would be to define a new interface (possibly named CanCreateAwsCredentialsProvider?) with a public function AwsCredentialsProvider createCredentialsProvider(). AssumeRoleAwsClientFactory could implement this interface (as could other client factories such as LakeFormationAwsClientFactory). When CLIENT_FACTORY is set to a class that implements the interface, instantiate the client factory and call the function between steps 2 and 3 of the restCredentialsProvider decision chain.
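
A rough sketch of how that interface hook might look. Only the interface name and its createCredentialsProvider() method come from the proposal above. CredentialsProviderStandIn (used in place of the SDK's AwsCredentialsProvider so the example compiles on its own), AssumeRoleFactoryStandIn, PlainFactoryStandIn, and FactoryHookSketch are hypothetical.

```java
// Hypothetical stand-in for the SDK's AwsCredentialsProvider.
interface CredentialsProviderStandIn {
  String name();
}

// Interface proposed in this issue (tentative name).
interface CanCreateAwsCredentialsProvider {
  CredentialsProviderStandIn createCredentialsProvider();
}

// Hypothetical stand-in for a client factory that knows how to assume a role.
final class AssumeRoleFactoryStandIn implements CanCreateAwsCredentialsProvider {
  private final String roleArn;

  AssumeRoleFactoryStandIn(String roleArn) {
    this.roleArn = roleArn;
  }

  @Override
  public CredentialsProviderStandIn createCredentialsProvider() {
    return () -> "assume-role:" + roleArn;
  }
}

// Hypothetical stand-in for a client factory that does not implement the interface.
final class PlainFactoryStandIn {}

final class FactoryHookSketch {
  private FactoryHookSketch() {}

  // The check that would sit between steps 2 and 3 of the decision chain:
  // ask the configured client factory for a provider if it can supply one,
  // otherwise fall through to the default chain.
  static CredentialsProviderStandIn resolve(Object clientFactory) {
    if (clientFactory instanceof CanCreateAwsCredentialsProvider) {
      return ((CanCreateAwsCredentialsProvider) clientFactory).createCredentialsProvider();
    }
    return () -> "default-chain";
  }
}
```

One possible benefit of this shape is that AwsProperties would not need to know anything about assume-role specifics; any client factory could opt in by implementing the interface.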

A third possible design: extract the four functions (sts, genSessionName, createCredentialsProvider, createAssumeRoleRequest) from AssumeRoleAwsClientFactory into a separate public utility class, and use that utility class from both AwsProperties and AssumeRoleAwsClientFactory. A set of additional rest.-prefixed properties for assume role could be added as well.

Query engine

Spark

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time


Labels: AWS, improvement
