AWS CRT sync client throws CancellationException when API timeouts configured #4820

@statelessness

Describe the bug

I recently upgraded my project to use the AWS CRT sync client, and I configured API timeouts as well. If an API call takes longer than the configured API timeout, a CancellationException is thrown. Specifically, this happens when trying to obtain a console embedding URL from QuickSight.
I don't know if it's relevant, but the Lambda function is using SnapStart.

Expected Behavior

I was expecting the QuickSight client to retry the API call.

Current Behavior

[2024-01-09 09:56:07.072] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> TRACE s.a.a.c.i.ExecutionInterceptorChain - Old: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=quicksight.us-east-1.amazonaws.com, encodedPath=/accounts/292326452532/embed-url/registered-user, headers=[Content-Length, Content-Type], queryParameters=[])
New: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=quicksight.us-east-1.amazonaws.com, encodedPath=/accounts/292326452532/embed-url/registered-user, headers=[Content-Length, Content-Type, X-Amzn-Trace-Id], queryParameters=[])

[2024-01-09 09:56:07.072] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG s.a.a.c.i.ExecutionInterceptorChain - Interceptor 'software.amazon.awssdk.services.quicksight.endpoints.internal.QuickSightRequestSetEndpointInterceptor@6d91790b' modified the message with its modifyHttpRequest method.

[2024-01-09 09:56:07.072] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> TRACE s.a.a.c.i.ExecutionInterceptorChain - Old: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=quicksight.us-east-1.amazonaws.com, encodedPath=/accounts/292326452532/embed-url/registered-user, headers=[Content-Length, Content-Type, X-Amzn-Trace-Id], queryParameters=[])
New: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=quicksight.us-east-1.amazonaws.com, encodedPath=/accounts/292326452532/embed-url/registered-user, headers=[Content-Length, Content-Type, X-Amzn-Trace-Id], queryParameters=[])

[2024-01-09 09:56:07.073] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG s.a.a.request - Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=quicksight.us-east-1.amazonaws.com, encodedPath=/accounts/292326452532/embed-url/registered-user, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent, X-Amzn-Trace-Id], queryParameters=[])

[2024-01-09 09:56:07.087] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG s.a.a.c.i.i.SdkLengthAwareInputStream - Specified InputStream length of 341 has been reached. Returning EOF.

[2024-01-09 09:56:07.088] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG s.a.a.a.s.Aws4Signer - AWS4 Canonical Request: POST
/accounts/292326452532/embed-url/registered-user

amz-sdk-invocation-id:822486e8-338b-173b-e148-761ee8ce818f
amz-sdk-request:attempt=1; max=3
content-length:341
content-type:application/json
host:quicksight.us-east-1.amazonaws.com
x-amz-date:20240109T095607Z
x-amz-security-token:IQoJb3JpZ2luX2VjEFIaCXVzLWVhc3QtMSJHMEUCIGvLdom4dsCU5KPOf4GQIkfLchqwEh9/I5nojebzjnWmAiEA7ShO0Jl/zJVPRsCskCvYteN2pQqgUcyy1opn81HV6hgqtAMI6///////////ARAAGgwyOTIzMjY0NTI1MzIiDApuwv3g81uqXUvAoCqIA/SAos+tP9a/xGjzN/yW2OalYirdXVnB39xrKuC0mkxO/mnFEtFluBZLNSkxf2w+Yv0cbLmGyfNnzRy716EANOeKHYVVnfbYVZsBSjJblTG9znTLx09MnoM9ZDY/LS9XHKQ9xGgCzXheYhmQsiS8JCNdr+qnt5WfWe/9T+LbHYrP4osgJxFdtksMjWWpvp0nm16DyJPS0m9L3tjOczqXeGF41Ghb6Oy9TBf7guPD7F7PMHDfiwZhG0Kc/zKmIeI08/D1bQNxqR7lMHU7Tip8q8zPECXUwCQo0y7Yud7DNNepvySEO49w0YWRw95IUqStu/7hqEicriIf3+k3WwYsiwqmtvK9YdJxJbp1B80B4KA//jnhdllQLFNyynorOKPGtN6XmBrL7DbZaWOskwePTqbrCwYsOrPaGoZ7S7LOT5shqMi5RrFoSXXFrQ5zlcCR2V5k2Nd9ItGxyv9y47QLFfxznULc8FB4RIKkzkA9RrRLjNp5DKLt5ZiN5/g/Q7emtQVgYiF716OgMLSw9KwGOp0BoOGA4xAp0ekPmwxJ5tw89lqvbBhkaowqvqtVtdkhafM5kVXW4gZhtVgAi0n4BgnNf9IvYiQwjrUEe4G/ehm275N1ZSPhHCM19JvadVkY0eikc6mlhVZ9F+3qniR2sOCl2k23U8kDivJwS9nAKhFCsx5mR6p+BmKTK8VABWtJZl9Oewsfw76/9jphNvJ8J6vODhW9xRaM0tTTMaj4tA==

amz-sdk-invocation-id;amz-sdk-request;content-length;content-type;host;x-amz-date;x-amz-security-token
5cac061697558a0f46aa01317090dbbe64a0b1748701040bb0d73a7b6732d831

[2024-01-09 09:56:07.088] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG s.a.a.a.s.Aws4Signer - AWS4 String to sign: AWS4-HMAC-SHA256
20240109T095607Z
20240109/us-east-1/quicksight/aws4_request
bdd745f378b13fe8ef430c629315aa0c4968340c5a76bd572395a6f103d4b440

[2024-01-09 09:56:07.088] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> TRACE s.a.a.a.s.Aws4Signer - Generating a new signing key as the signing key not available in the cache for the date: 1704794167087

[2024-01-09 09:56:08.578] <1eb30497-0a63-4843-aa9d-e0c5eefb70f8> DEBUG n.g.e.ExecutionStrategy - '1f23ea26-63e7-49ed-bf18-2097a2213d77', field '/consoleEmbeddingUrl' fetch threw exception
java.util.concurrent.CancellationException: null
at java.base/java.util.concurrent.CompletableFuture.cancel(Unknown Source)
at software.amazon.awssdk.http.crt.AwsCrtHttpClient$CrtHttpRequest.abort(AwsCrtHttpClient.java:139)
at software.amazon.awssdk.core.internal.http.timers.SyncTimeoutTask.run(SyncTimeoutTask.java:63)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
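Reading the stack trace, the timeout task appears to abort the in-flight request by cancelling a CompletableFuture, and the CancellationException then surfaces to the caller. The following is a JDK-only sketch of that pattern, not the SDK's actual code. The class and method names (`TimeoutCancelDemo`, `waitWithTimeout`) are made up for illustration.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// JDK-only illustration; none of these names come from the SDK.
final class TimeoutCancelDemo {

    // Waits on a future that never completes while a scheduled task cancels it
    // after timeoutMillis. Returns the exception seen by the waiting thread,
    // or null if the wait finished normally (which cannot happen here).
    static Throwable waitWithTimeout(long timeoutMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<String> response = new CompletableFuture<>();
        try {
            // Stand-in for the timeout task: abort the request by cancelling its future.
            timer.schedule(() -> { response.cancel(true); }, timeoutMillis, TimeUnit.MILLISECONDS);
            response.join(); // blocks until the future is completed or cancelled
            return null;
        } catch (CancellationException e) {
            return e;
        } finally {
            timer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Throwable seen = waitWithTimeout(100);
        System.out.println(seen.getClass().getName()); // prints java.util.concurrent.CancellationException
    }
}
```

If the SDK behaves the same way, the CancellationException reaches the caller unwrapped, so nothing in between translates it into a timeout exception that the retry logic could act on. This is my assumption from the stack trace only.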

Reproduction Steps

public String consoleEmbeddingUrl(final String userName, final Long tokenExpirationEpoch) {
    logger.debug("Generating console embedding url for username {}...", userName);
    final RegisteredUserEmbeddingExperienceConfiguration experienceConfiguration = RegisteredUserEmbeddingExperienceConfiguration.builder()
            .quickSightConsole(b -> b.initialPath(QUICKSIGHT_INITIAL_PATH)).build();
    final GenerateEmbedUrlForRegisteredUserRequest request = GenerateEmbedUrlForRegisteredUserRequest.builder()
            .allowedDomains(getAllowedDomains()).awsAccountId(awsAccountId).experienceConfiguration(experienceConfiguration)
            .sessionLifetimeInMinutes(QUICKSIGHT_MAX_SESSION_MINUTES).userArn(getUserArn(userName)).build();
    logger.trace("GenerateEmbedUrlForRegisteredUserRequest is {}", request);
    final GenerateEmbedUrlForRegisteredUserResponse response = quickSightClient.generateEmbedUrlForRegisteredUser(request);
    logger.debug("Generated console embedding url is {}", response.embedUrl());
    return response.embedUrl();
}

Possible Solution

I think the client should handle this exception internally and retry the call.
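Until then, a caller-side workaround could look roughly like the sketch below. `CancellationRetry` and `callWithRetry` are hypothetical names, not part of the AWS SDK, and the helper only retries on CancellationException. Note that the configured apiCallTimeout would apply to each wrapped call separately, not to the whole retry loop.

```java
import java.util.concurrent.CancellationException;
import java.util.function.Supplier;

// Hypothetical caller-side helper; not part of the AWS SDK.
final class CancellationRetry {

    // Runs the call up to maxAttempts times, retrying only when it fails with
    // CancellationException. If every attempt is cancelled, the last
    // CancellationException is rethrown.
    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        CancellationException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (CancellationException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated call that is cancelled twice before succeeding.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new CancellationException();
            }
            return "embed-url";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints embed-url after 3 attempts
    }
}
```

In my code this would wrap the call as `callWithRetry(() -> quickSightClient.generateEmbedUrlForRegisteredUser(request), 3)`.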

Additional Information/Context

My configuration for the QuickSight and HTTP clients is the following (part of a Dagger module):

private static final Duration DEFAULT_CONNECTION_TIMEOUT = Duration.ofSeconds(2);
private static final Duration DEFAULT_CONNECTION_MAX_IDLE_TIMEOUT = Duration.ofSeconds(60);
private static final int DEFAULT_MAX_CONCURRENCY = 100;
private static final long DEFAULT_MINIMUM_THROUGHPUT_IN_BPS = 32000L;
private static final Duration DEFAULT_MINIMUM_THROUGHPUT_TIMEOUT = Duration.ofSeconds(3);
private static final Duration DEFAULT_API_CALL_TIMEOUT = Duration.ofSeconds(5);
private static final Duration DEFAULT_API_CALL_ATTEMPT_TIMEOUT = Duration.ofMillis(1500);
private static final ClientOverrideConfiguration DEFAULT_CLIENT_OVERRIDE_CONFIGURATION =
        ClientOverrideConfiguration.builder()
                .apiCallAttemptTimeout(DEFAULT_API_CALL_ATTEMPT_TIMEOUT)
                .apiCallTimeout(DEFAULT_API_CALL_TIMEOUT)
                .build();

@Provides
@Singleton
static QuickSightClient provideQuickSightClient(final ServiceConfiguration serviceConfiguration,
                                                final SdkHttpClient sdkHttpClient) {
    return QuickSightClient.builder()
            .defaultsMode(DefaultsMode.STANDARD)
            .httpClient(sdkHttpClient)
            .overrideConfiguration(DEFAULT_CLIENT_OVERRIDE_CONFIGURATION)
            .region(Region.of(serviceConfiguration.getAwsRegion()))
            .build();
}

@Provides
@Singleton
static SdkHttpClient provideSdkHttpClient() {
    return AwsCrtHttpClient.builder()
            .maxConcurrency(DEFAULT_MAX_CONCURRENCY)
            .connectionTimeout(DEFAULT_CONNECTION_TIMEOUT)
            .connectionHealthConfiguration(builder -> builder
                    .minimumThroughputInBps(DEFAULT_MINIMUM_THROUGHPUT_IN_BPS)
                    .minimumThroughputTimeout(DEFAULT_MINIMUM_THROUGHPUT_TIMEOUT))
            .connectionMaxIdleTime(DEFAULT_CONNECTION_MAX_IDLE_TIMEOUT)
            .build();
}
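For reference, comparing the log timestamps from Current Behavior with these timeouts: the request was sent at 09:56:07.073 and the exception was logged at 09:56:08.578. The small check below (class name `TimeoutCheck` is just for illustration) computes the gap, which is close to the 1500 ms apiCallAttemptTimeout and well under the 5 s apiCallTimeout. This suggests, but does not prove, that the per-attempt timeout is the one that fired.

```java
import java.time.Duration;
import java.time.LocalTime;

// Illustrative helper for comparing two log timestamps; not part of the project.
final class TimeoutCheck {

    // Milliseconds between two HH:mm:ss.SSS timestamps on the same day.
    static long elapsedMillis(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end)).toMillis();
    }

    public static void main(String[] args) {
        // "Sending Request" log line vs. the CancellationException log line.
        long elapsed = elapsedMillis("09:56:07.073", "09:56:08.578");
        System.out.println(elapsed + " ms"); // prints 1505 ms
    }
}
```

The canonical request also shows `amz-sdk-request:attempt=1; max=3`, so the exception was thrown on the first of three allowed attempts.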

AWS Java SDK version used

2.22.9

JDK version used

Java 11

Operating System and version

AWS Lambda with the Java 11 runtime

Labels

bug, p2
