STSAssumeRoleWebIdentityCredentialsProvider doesn't work in a child process after forking #3749

@G-D-Petrov

Description

Describe the bug

After version 1.11.646, if the AWS SDK is initialized in a parent process that then forks, the STSAssumeRoleWebIdentityCredentialsProvider no longer works correctly in the child and fails to authenticate.
A workaround would be to reinitialize the AWS SDK API, but this can lead to crashes; see issue #2119.

This used to work correctly with version 1.11.591.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The child process should be able to access AWS.

Current Behavior

In the child process, the STSAssumeRoleWebIdentityCredentialsProvider fails to connect, so authentication fails.

Reproduction Steps

#include <aws/core/Aws.h>
#include <aws/core/auth/STSCredentialsProvider.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/ListBucketsRequest.h>

#include <csignal>
#include <cstdio>
#include <cstdlib>
#include <cstring>   // strsignal
#include <memory>    // std::make_shared
#include <sys/wait.h>
#include <unistd.h>

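static void alarm_handler(int) {
    // Use write(2) here: it is async-signal-safe, whereas fprintf is not.
    static const char msg[] = "  CHILD: timed out (hung), exiting with code 99\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    _exit(99);
}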

// Fork and run `fn` in the child. Parent waits and reports outcome.
static void run_in_child(const char* label, void (*fn)(Aws::S3::S3Client*), Aws::S3::S3Client* client) {
    printf("\n=== Test: %s ===\n", label);
    fflush(stdout);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return;
    }

    if (pid == 0) {
        // Child — set a 10s alarm to detect hangs
        signal(SIGALRM, alarm_handler);
        alarm(10);

        fn(client);

        _exit(0);
    }

    // Parent — wait for child
    int status = 0;
    waitpid(pid, &status, 0);

    if (WIFSIGNALED(status)) {
        printf("  RESULT: child killed by signal %d (%s)\n",
               WTERMSIG(status), strsignal(WTERMSIG(status)));
    } else if (WIFEXITED(status)) {
        int code = WEXITSTATUS(status);
        if (code == 0)
            printf("  RESULT: child exited normally (no crash — but check output above)\n");
        else if (code == 99)
            printf("  RESULT: child hung (killed by alarm)\n");
        else
            printf("  RESULT: child exited with code %d\n", code);
    }
}


// This is the specific provider that hangs after fork: it uses an internal
// AWSHttpResourceClient (CRT-based) that inherits dead thread/connection state
// from the parent. With IRSA env vars set, the constructor attempts an HTTP
// call to the STS endpoint using this broken HTTP client and hangs indefinitely.
static void test_sts_web_identity_provider(Aws::S3::S3Client*) {
    // Set IRSA env vars so the provider actually activates.
    // The token file doesn't need to exist — the hang occurs during the HTTP
    // client setup, before the token file is even read.
    setenv("AWS_WEB_IDENTITY_TOKEN_FILE", "/tmp/nonexistent_token_for_fork_test", 1);
    setenv("AWS_ROLE_ARN", "arn:aws:iam::123456789012:role/test-role", 1);
    setenv("AWS_ROLE_SESSION_NAME", "fork-test-session", 1);

    fprintf(stderr, "  CHILD: creating STSAssumeRoleWebIdentityCredentialsProvider after fork...\n");
    Aws::Auth::STSAssumeRoleWebIdentityCredentialsProvider provider;

    fprintf(stderr, "  CHILD: calling GetAWSCredentials()...\n");
    auto creds = provider.GetAWSCredentials();
    fprintf(stderr, "  CHILD: GetAWSCredentials() returned (key=%s)\n",
            creds.GetAWSAccessKeyId().empty() ? "<empty>" : "<non-empty>");
}

int main() {
    printf("AWS SDK C++ fork-safety reproduction\n");
    printf("PID: %d\n\n", getpid());

    // Initialize the SDK in the parent
    Aws::SDKOptions options;
    Aws::InitAPI(options);

    printf("SDK initialized. Creating S3Client in parent...\n");

    Aws::Client::ClientConfiguration config;
    config.region = "us-east-1";
    config.connectTimeoutMs = 1000;
    config.requestTimeoutMs = 1000;
    auto client = std::make_shared<Aws::S3::S3Client>(config);

    printf("Parent S3Client created. Running fork tests...\n");

    // Run each test scenario in a separate forked child
    run_in_child("STSAssumeRoleWebIdentityCredentialsProvider after fork", test_sts_web_identity_provider, nullptr);

    printf("\n=== All tests complete ===\n");
    printf("If any child crashed or hung, the SDK is not fork-safe.\n");

    client.reset();
    Aws::ShutdownAPI(options);

    return 0;
}
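
The comment in the reproduction attributes the hang to thread state that does not survive `fork()`. That mechanism can be shown without the SDK. The following is a minimal sketch, not the SDK's actual code: `BackgroundWorker`, `child_request_succeeds_after_fork`, and `run_demo` are hypothetical names, and the worker thread is only a stand-in for whatever background thread the HTTP client relies on. It assumes a POSIX system; a child forked from a multithreaded process contains only the thread that called `fork()`, so a request that depends on the parent's worker is never served there.

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a client that depends on a background thread.
// The thread "serves" requests by copying the request counter into `served`.
struct BackgroundWorker {
    std::atomic<int> requested{0};
    std::atomic<int> served{0};
    std::atomic<bool> stop{false};
    std::thread loop;

    void start() {
        loop = std::thread([this] {
            while (!stop.load()) {
                served.store(requested.load());
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    // Submit one request and poll for up to roughly timeout_ms milliseconds.
    bool request(int timeout_ms) {
        const int ticket = requested.fetch_add(1) + 1;
        for (int waited = 0; waited < timeout_ms; ++waited) {
            if (served.load() >= ticket) return true;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        return false;
    }

    void shutdown() {
        stop.store(true);
        if (loop.joinable()) loop.join();
    }
};

// Fork, issue one request in the child, and report whether it was served.
// Returns true only if the child exited with status 0.
bool child_request_succeeds_after_fork(BackgroundWorker& worker) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return false;
    }
    if (pid == 0) {
        // Only the forking thread exists in the child. The memory of
        // `worker` was copied, but its thread was not duplicated.
        _exit(worker.request(200) ? 0 : 1);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Returns 0 when the parent is served and the child is not.
int run_demo() {
    BackgroundWorker worker;
    worker.start();
    const bool parent_ok = worker.request(1000);
    const bool child_ok = child_request_succeeds_after_fork(worker);
    std::printf("parent served: %s, child served: %s\n",
                parent_ok ? "yes" : "no", child_ok ? "yes" : "no");
    worker.shutdown();
    return (parent_ok && !child_ok) ? 0 : 1;
}
```

Calling `run_demo()` from a `main` function is enough to try it. The sketch uses a bounded wait in place of the reproduction's `alarm(10)`, so the child returns on its own instead of hanging.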

Possible Solution

No response

Additional Information/Context

No response

AWS CPP SDK version used

1.11.646

Compiler and Version used

gcc (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0

Operating System and version

Ubuntu 24.04

Labels

bug: This issue is a bug.
response-requested: Waiting on additional info and feedback. Will move to "closing-soon" in 10 days.