Intermittent issue on sdk v2 - Unable to load credentials from service endpoint #3448

Closed
@striker50

Description

Describe the bug

I am running a Spark application on an EMR 5.30.1 cluster and see intermittent credential access failures after upgrading the AWS SDK from 1.11.297 to 2.17.11.
We upgraded from v1 to v2 because v1 doesn't have stable support for configuring custom VPC endpoints, as recommended in aws/aws-sdk-java#2135 (comment).

After moving to SDK v2, VPC endpoint access works fine, but SQS sendMessage() calls fail intermittently: the SDK cannot load credentials because the connection to the credentials endpoint times out.
Following https://docs.amazonaws.cn/en_us/sdk-for-java/latest/developer-guide/migration-client-credentials.html, I also enabled asynchronous credential refresh when initializing the SQS client (see the code under Current Behavior), but the issue still occurs.

Please advise on how to fix the intermittent credential access issue on AWS SDK v2. We get the credentials from InstanceProfileCredentialsProvider.

Expected Behavior

Consistent credential access for the SqsClient when sending messages.

Current Behavior

Code snippet:

URI endpointURI = new URI(sqsVPCEndpoint);
InstanceProfileCredentialsProvider provider = InstanceProfileCredentialsProvider.builder()
        .asyncCredentialUpdateEnabled(true)
        .build();
SqsClient sqs = SqsClient.builder()
        .region(Region.of(awsRegion))
        .endpointOverride(endpointURI)
        .credentialsProvider(provider)
        .build();
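
Since the failures are intermittent, one stopgap that could be tried while the root cause is investigated is to retry credential resolution with a short backoff. The sketch below is not part of the AWS SDK and is not a confirmed fix for this issue: `RetryingSupplier` and `getWithRetry` are hypothetical names, and the class uses only the JDK so it can run on its own.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper (not an AWS SDK class): retries a call that may fail transiently.
public final class RetryingSupplier {
    private RetryingSupplier() {}

    /**
     * Calls the supplier up to maxAttempts times. After each failed attempt except
     * the last, sleeps for the current delay and then doubles it (exponential backoff).
     * If every attempt fails, rethrows the exception from the final attempt.
     */
    public static <T> T getWithRetry(Supplier<T> supplier, int maxAttempts, Duration initialDelay) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMillis = initialDelay.toMillis();
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return supplier.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and give up immediately.
                    Thread.currentThread().interrupt();
                    throw e;
                }
                delayMillis *= 2;
            }
        }
        throw last;
    }
}
```

Assuming `AwsCredentialsProvider` can be supplied as a lambda over `resolveCredentials()`, the provider above could be wrapped as `AwsCredentialsProvider retrying = () -> RetryingSupplier.getWithRetry(provider::resolveCredentials, 3, Duration.ofMillis(200));` and passed to `credentialsProvider(retrying)`. The attempt count and delay are illustrative values, not recommendations.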

Error message:
java.util.concurrent.ExecutionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from service endpoint.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    ......
    at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from service endpoint.
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98)
    at software.amazon.awssdk.auth.credentials.HttpCredentialsProvider.refreshCredentials(HttpCredentialsProvider.java:110)
    at software.amazon.awssdk.utils.cache.CachedSupplier.refreshCache(CachedSupplier.java:132)
    at software.amazon.awssdk.utils.cache.CachedSupplier.get(CachedSupplier.java:89)
    at java.util.Optional.map(Optional.java:215)
    at software.amazon.awssdk.auth.credentials.HttpCredentialsProvider.resolveCredentials(HttpCredentialsProvider.java:146)
    at software.amazon.awssdk.awscore.client.handler.AwsClientHandlerUtils.createExecutionContext(AwsClientHandlerUtils.java:79)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.createExecutionContext(AwsSyncClientHandler.java:68)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:99)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:169)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:95)
    at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:55)
    at software.amazon.awssdk.services.sqs.DefaultSqsClient.sendMessage(DefaultSqsClient.java:1528)
    .......
    .......
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:607)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
    at sun.net.www.http.HttpClient.New(HttpClient.java:339)
    at sun.net.www.http.HttpClient.New(HttpClient.java:357)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1228)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1207)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
    at software.amazon.awssdk.regions.internal.util.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:45)
    at software.amazon.awssdk.regions.util.HttpResourcesUtils.readResource(HttpResourcesUtils.java:112)
    at software.amazon.awssdk.regions.util.HttpResourcesUtils.readResource(HttpResourcesUtils.java:91)
    at software.amazon.awssdk.auth.credentials.HttpCredentialsProvider.refreshCredentials(HttpCredentialsProvider.java:79)
    ... 21 more
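
The trace nests the root cause (a connect timeout) two levels below the ExecutionException that the caller sees. To log or count these timeout failures separately from other credential errors, the cause chain can be walked at the call site. The following is a minimal JDK-only sketch; `CauseChain` and `hasCause` are hypothetical names, not SDK classes.

```java
// Hypothetical helper (not an AWS SDK class): inspects an exception's cause chain.
public final class CauseChain {
    private CauseChain() {}

    /**
     * Returns true if the throwable itself or any of its causes is an instance of
     * the given type. The depth limit guards against a cyclic cause chain.
     */
    public static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 50; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return true;
            }
        }
        return false;
    }
}
```

For the failure above, `CauseChain.hasCause(e, java.net.SocketTimeoutException.class)` in the catch block around the `Future.get()` call would identify the connect-timeout case.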

Reproduction Steps

Code snippet for AWS SDK v2:

URI endpointURI = new URI(sqsVPCEndpoint);
InstanceProfileCredentialsProvider provider = InstanceProfileCredentialsProvider.builder()
        .asyncCredentialUpdateEnabled(true)
        .build();
SqsClient sqs = SqsClient.builder()
        .region(Region.of(awsRegion))
        .endpointOverride(endpointURI)
        .credentialsProvider(provider)
        .build();

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.17.11

JDK version used

1.8

Operating System and version

EMR clusters
