12 changes: 12 additions & 0 deletions fluss-filesystems/fluss-fs-s3/pom.xml
@@ -34,6 +34,8 @@

<properties>
<fs.s3.aws.version>1.12.319</fs.s3.aws.version>
<!-- used by hadoop-aws 3.4.0 -->
<aws-java-sdk-v2.version>2.23.19</aws-java-sdk-v2.version>
</properties>

<dependencies>
@@ -233,8 +235,18 @@
<groupId>org.slf4j</groupId>
<artifactId>slf4j-reload4j</artifactId>
</exclusion>
<exclusion>
Contributor

https://hadoop.apache.org/release/3.4.1.html
"We have also introduced a lean tar which is a small tar file that does not contain the AWS SDK because the size of AWS SDK is itself 500 MB. This can ease usage for non AWS users. Even AWS users can add this jar explicitly if desired."

If we use 3.4.1, the fat jar won't be included.

Contributor Author

Thanks, this looks like a better solution. I'll test it and report back.

Contributor Author

@luoyuxia Hadoop 3.4.1 ships the lean build as a "tar" distribution, not as the "jar" artifact we usually depend on, so simply bumping the version number in pom.xml will not reduce the final package size.
I think we still need to exclude the fat JAR manually.

<groupId>software.amazon.awssdk</groupId>
<artifactId>bundle</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Hadoop-aws 3.4.0 depends on AWS SDK V2 (bundled) -->
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3-transfer-manager</artifactId>
<version>${aws-java-sdk-v2.version}</version>
</dependency>

<dependency>
<!-- Hadoop requires jaxb-api for javax.xml.bind.JAXBException -->
@@ -0,0 +1,32 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.fluss.fs.s3.exception;

/** Exception thrown when no credentials were found. */
public class InvalidCredentialsException extends RuntimeException {

public static final String E_NO_AWS_CREDENTIALS = "No AWS Credentials";

public InvalidCredentialsException(String credentialProvider) {
Contributor

I'm curious about the class-not-found issue. I also checked the v0.8 jar for S3, and it contains:

com/amazonaws/SdkClientException.class

The same is true of the jar built with this PR:

com/amazonaws/SdkClientException.class

So, it looks to me that the same problem will also happen in v0.8, right?
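
A check like the one described above can be scripted. The following is a minimal sketch, not part of this PR: `JarEntryCheck`, `containsEntry`, and `writeSampleJar` are hypothetical names, and the sample jar is generated on the fly only so that the example runs without a Fluss build.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Hypothetical helper (not part of Fluss): checks whether a jar contains a given entry. */
final class JarEntryCheck {

    /** Returns true if the jar at {@code jar} has an entry named {@code entryName}. */
    static boolean containsEntry(Path jar, String entryName) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    /** Builds a throwaway jar with a single empty entry so the sketch runs anywhere. */
    static Path writeSampleJar(String entryName) throws IOException {
        Path jar = Files.createTempFile("sample", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
                JarOutputStream jarOut = new JarOutputStream(out)) {
            jarOut.putNextEntry(new JarEntry(entryName));
            jarOut.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        String entry = "com/amazonaws/SdkClientException.class";
        // With an argument, inspect that jar; otherwise fall back to the generated sample.
        Path jar = args.length > 0 ? Paths.get(args[0]) : writeSampleJar(entry);
        System.out.println(entry + " present: " + containsEntry(jar, entry));
    }
}
```

Passing the path of a built S3 filesystem jar as the first argument would run the same check against a real artifact.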

Contributor Author

@sd4324530 sd4324530 Jan 21, 2026

Thank you so much for the reminder. I checked again carefully.
Fluss v0.8 depends on Hadoop 3.3.4.
I found that the inheritance chain of NoAwsCredentialsException changed in the upgrade from 3.3.4 to 3.4.0:

Hadoop 3.3.4:

classDiagram
direction RL
    class AmazonClientException
    class CredentialInitializationException
    class NoAuthWithAWSException
    class NoAwsCredentialsException
    
    NoAwsCredentialsException --|> NoAuthWithAWSException
    NoAuthWithAWSException --|> CredentialInitializationException
    CredentialInitializationException --|> AmazonClientException

Hadoop 3.4.0:

classDiagram
direction RL
    class SdkClientException
    class CredentialInitializationException
    class NoAuthWithAWSException
    class NoAwsCredentialsException
    
    NoAwsCredentialsException --|> NoAuthWithAWSException
    NoAuthWithAWSException --|> CredentialInitializationException
    CredentialInitializationException --|> SdkClientException

In Hadoop 3.3.4, AmazonClientException comes from aws-java-sdk-core-1.12.319.jar, which is only 1 MB in size.
See: https://github.com/apache/fluss/blob/release-0.8/fluss-filesystems/fluss-fs-s3/pom.xml#L188-L192
In Hadoop 3.4.0, SdkClientException comes from bundle-2.23.19.jar, which is 500+ MB in size.
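
One way to confirm a superclass chain like the ones above is to walk it with reflection. This is a rough sketch under stated assumptions: `SuperclassChain` is a hypothetical helper, and `java.io.FileNotFoundException` is only a stand-in so the example runs without Hadoop on the classpath. To inspect `org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException`, pass its name as an argument with the relevant jars on the classpath.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Fluss or Hadoop): lists a class and all its superclasses. */
final class SuperclassChain {

    /** Returns the fully qualified names from {@code type} up to java.lang.Object. */
    static List<String> of(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            chain.add(c.getName());
        }
        return chain;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Stand-in class by default; pass a class name to inspect something else.
        Class<?> type = args.length > 0 ? Class.forName(args[0]) : FileNotFoundException.class;
        of(type).forEach(System.out::println);
    }
}
```

Loading a class requires every superclass in its chain to be resolvable, which is why excluding the jar that provides the root of the chain can surface as a class-not-found error at the point where the subclass is first used.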

Contributor

Thank you for the explanation.

this(credentialProvider, null);
}

public InvalidCredentialsException(String credentialProvider, Throwable throwable) {
super(credentialProvider + ": " + E_NO_AWS_CREDENTIALS, throwable);
}
}
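
For illustration, the sketch below shows the message the new exception produces and that its only superclass dependency is `RuntimeException`. The exception is copied inline as a nested class so the example is self-contained; `InvalidCredentialsMessageDemo`, `describeFailure`, and the provider name `"ExampleProvider"` are hypothetical and do not appear in the PR.

```java
/** Hypothetical demo class; only the nested exception mirrors the code in this diff. */
final class InvalidCredentialsMessageDemo {

    /** Inline copy of the exception from the diff, so the sketch compiles on its own. */
    static class InvalidCredentialsException extends RuntimeException {

        public static final String E_NO_AWS_CREDENTIALS = "No AWS Credentials";

        public InvalidCredentialsException(String credentialProvider) {
            this(credentialProvider, null);
        }

        public InvalidCredentialsException(String credentialProvider, Throwable throwable) {
            super(credentialProvider + ": " + E_NO_AWS_CREDENTIALS, throwable);
        }
    }

    /** Throws and catches the exception, returning the message a caller would see. */
    static String describeFailure(String provider) {
        try {
            throw new InvalidCredentialsException(provider);
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "ExampleProvider" is a made-up provider name used only for this demo.
        System.out.println(describeFailure("ExampleProvider"));
        // The direct superclass is a JDK type, so no AWS SDK class is needed to load it.
        System.out.println(InvalidCredentialsException.class.getSuperclass().getName());
    }
}
```

Run as written, this prints `ExampleProvider: No AWS Credentials` followed by `java.lang.RuntimeException`.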
@@ -17,12 +17,12 @@

package org.apache.fluss.fs.s3.token;

import org.apache.fluss.fs.s3.exception.InvalidCredentialsException;
import org.apache.fluss.fs.token.Credentials;

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.BasicSessionCredentials;
import org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -46,7 +46,7 @@ public AWSCredentials getCredentials() {
Credentials credentials = S3DelegationTokenReceiver.getCredentials();

if (credentials == null) {
- throw new NoAwsCredentialsException(COMPONENT);
+ throw new InvalidCredentialsException(COMPONENT);
}
LOG.debug("Providing session credentials");
return new BasicSessionCredentials(
@@ -17,12 +17,12 @@

package org.apache.fluss.fs.s3.token;

import org.apache.fluss.fs.s3.exception.InvalidCredentialsException;
import org.apache.fluss.fs.token.Credentials;
import org.apache.fluss.fs.token.CredentialsJsonSerde;
import org.apache.fluss.fs.token.ObtainedSecurityToken;
import org.apache.fluss.fs.token.SecurityTokenReceiver;

import org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -63,7 +63,7 @@ public static void updateHadoopConfig(org.apache.hadoop.conf.Configuration hadoo
if (additionInfos == null) {
// if addition info is null, it also means we have not received any token,
// we throw InvalidCredentialsException
- throw new NoAwsCredentialsException(DynamicTemporaryAWSCredentialsProvider.COMPONENT);
+ throw new InvalidCredentialsException(DynamicTemporaryAWSCredentialsProvider.COMPONENT);
} else {
for (Map.Entry<String, String> entry : additionInfos.entrySet()) {
hadoopConfig.set(entry.getKey(), entry.getValue());
32 changes: 31 additions & 1 deletion fluss-filesystems/fluss-fs-s3/src/main/resources/META-INF/NOTICE
@@ -26,6 +26,8 @@ This project bundles the following dependencies under the Apache Software Licens
- io.dropwizard.metrics:metrics-core:3.2.4
- io.netty:netty-buffer:4.1.100.Final
- io.netty:netty-codec:4.1.100.Final
- io.netty:netty-codec-http2:4.1.100.Final
- io.netty:netty-codec-http:4.1.100.Final
- io.netty:netty-common:4.1.100.Final
- io.netty:netty-handler:4.1.100.Final
- io.netty:netty-resolver:4.1.100.Final
@@ -52,9 +54,37 @@ This project bundles the following dependencies under the Apache Software Licens
- org.apache.kerby:kerby-util:2.0.3
- org.bouncycastle:bcprov-jdk15on:1.70
- org.codehaus.jettison:jettison:1.5.4
- org.reactivestreams:reactive-streams:1.0.4
- org.wildfly.openssl:wildfly-openssl:1.1.3.Final
- org.xerial.snappy:snappy-java:1.1.10.4
- software.amazon.awssdk:bundle:2.23.19
- software.amazon.awssdk:annotations:2.23.19
- software.amazon.awssdk:apache-client:2.23.19
- software.amazon.awssdk:arns:2.23.19
- software.amazon.awssdk:auth:2.23.19
- software.amazon.awssdk:aws-core:2.23.19
- software.amazon.awssdk:aws-query-protocol:2.23.19
- software.amazon.awssdk:aws-xml-protocol:2.23.19
- software.amazon.awssdk:checksums-spi:2.23.19
- software.amazon.awssdk:checksums:2.23.19
- software.amazon.awssdk:crt-core:2.23.19
- software.amazon.awssdk:endpoints-spi:2.23.19
- software.amazon.awssdk:http-auth-aws:2.23.19
- software.amazon.awssdk:http-auth-spi:2.23.19
- software.amazon.awssdk:http-auth:2.23.19
- software.amazon.awssdk:http-client-spi:2.23.19
- software.amazon.awssdk:identity-spi:2.23.19
- software.amazon.awssdk:json-utils:2.23.19
- software.amazon.awssdk:metrics-spi:2.23.19
- software.amazon.awssdk:netty-nio-client:2.23.19
- software.amazon.awssdk:profiles:2.23.19
- software.amazon.awssdk:protocol-core:2.23.19
- software.amazon.awssdk:regions:2.23.19
- software.amazon.awssdk:s3:2.23.19
- software.amazon.awssdk:s3-transfer-manager:2.23.19
- software.amazon.awssdk:sdk-core:2.23.19
- software.amazon.awssdk:third-party-jackson-core:2.23.19
- software.amazon.awssdk:utils:2.23.19
- software.amazon.eventstream:eventstream:1.0.1
- software.amazon.ion:ion-java:1.0.2

This project bundles the following dependencies under BSD-2 License (https://opensource.org/licenses/BSD-2-Clause).