Fail to retrieve token for high latency connection #2365

Closed
craigsmithmsp opened this issue Jun 17, 2020 · 9 comments
Labels
feature-request A feature should be added or improved.

Comments

@craigsmithmsp

If I connect over a high-latency satellite connection, the AWS SDK cannot retrieve a token. This problem started with 1.11.678. I have not found a configuration option to increase the timeout for the underlying operation. Can one be added?

In my case, I have a simple Spring Boot application using Spring Cloud with AWS SQS. By default, that pulls in version 1.11.415 of the SDK. We had trouble with connections not being closed properly and needed to upgrade the AWS SDK to prevent an open-files leak. The upgrade fixed the leak but introduced the token retrieval issue.

Stack trace

2020-06-17 09:08:22.518 level=WARN thread="pool-1-thread-16" c.a.i.InstanceMetadataServiceResourceFetcher - Fail to retrieve token
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)
at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:69)
at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsEndpoint(InstanceMetadataServiceCredentialsFetcher.java:58)
at com.amazonaws.auth.InstanceMetadataServiceCredentialsFetcher.getCredentialsResponse(InstanceMetadataServiceCredentialsFetcher.java:46)
at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:112)
at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:68)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:166)
at com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper.getCredentials(EC2ContainerCredentialsProviderWrapper.java:75)
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1225)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1246)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
at com.amazonaws.services.sqs.AmazonSQSClient.doInvoke(AmazonSQSClient.java:2207)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2174)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:2163)
at com.amazonaws.services.sqs.AmazonSQSClient.executeReceiveMessage(AmazonSQSClient.java:1607)
at com.amazonaws.services.sqs.AmazonSQSAsyncClient$14.call(AmazonSQSAsyncClient.java:1055)
at com.amazonaws.services.sqs.AmazonSQSAsyncClient$14.call(AmazonSQSAsyncClient.java:1049)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Network is unreachable: connect
at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1199)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:52)
at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:80)
... 30 common frames omitted

Environment

  • AWS Java SDK version used: 1.11.791
  • JDK version used: 1.8
  • Operating System and version: Windows 10.0.18363
@craigsmithmsp craigsmithmsp added guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels Jun 17, 2020
@debora-ito
Member

Hi @craigsmithmsp, the problem started in 1.11.678 because that is when the new Instance Metadata Service v2 was released. We have seen various reports of increased latency on the service side (like #2276 and aws/aws-sdk-java-v2#1667).

Unfortunately, it is not possible to change the underlying connectionTimeout. I can mark this as a feature request if you'd like.
You can also try adding custom retry logic, since the SDK won't retry IMDS credentials fetching.
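
For anyone wondering what "custom retry logic" could look like, here is a minimal sketch, not an SDK feature. `RetryingFetcher` and `fetchWithRetry` are hypothetical names made up for illustration. The helper retries any call that fails with a `RuntimeException`, doubling the wait between attempts.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of the AWS SDK): retries a call with exponential backoff. */
final class RetryingFetcher {

    private RetryingFetcher() {}

    /**
     * Calls fetch.get() up to maxAttempts times, sleeping between failed attempts.
     * The sleep starts at initialBackoffMillis and doubles after every failure.
     */
    static <T> T fetchWithRetry(Supplier<T> fetch, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no attempts left, rethrow below
                }
                try {
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and give up early with the last failure.
                    Thread.currentThread().interrupt();
                    throw e;
                }
                backoff *= 2;
            }
        }
        throw last;
    }
}
```

Assuming the credentials call throws when it fails, one could wrap it as `RetryingFetcher.fetchWithRetry(() -> provider.getCredentials(), 5, 500L)`, where `provider` is whatever `AWSCredentialsProvider` the application already uses. Note that retrying only helps with intermittent failures; it will not help if every attempt exceeds the timeout.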

@debora-ito debora-ito added response-requested Waiting on additional info or feedback. Will move to "closing-soon" in 5 days. service-api This issue is due to a problem in a service API, not the SDK implementation. and removed needs-triage This issue or PR still needs to be triaged. labels Jun 20, 2020
@craigsmithmsp
Author

Thank you, @debora-ito. Please mark it as a feature request. Spring Cloud does retry repeatedly, but without success. I have noticed on our EC2 instances that we sometimes get the error on startup, but there the retry resolves it quite reliably.

@github-actions github-actions bot removed the response-requested Waiting on additional info or feedback. Will move to "closing-soon" in 5 days. label Jun 20, 2020
@debora-ito debora-ito added feature-request A feature should be added or improved. and removed guidance Question that needs advice or information. service-api This issue is due to a problem in a service API, not the SDK implementation. labels Jun 22, 2020
@ahoehma

ahoehma commented Jul 7, 2020

Hi all, I am seeing the same log message right now after updating the AWS SDK to 1.11.807.
Just for my understanding: it's not a real problem, right?
I have a local Spring Boot service running that fetches data from S3, and it works even though I see this log message.
Would it be OK to set the log level for this logger to ERROR?

@rehevkor5

It appears that ConnectionUtils has hard-coded connect & read timeouts set to 1s: https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/internal/ConnectionUtils.java#L41

The Python SDK appears to obey an environment variable AWS_METADATA_SERVICE_TIMEOUT https://boto3.amazonaws.com/v1/documentation/api/1.9.42/guide/configuration.html#environment-variable-configuration but the Java SDK doesn't appear to have anything like that.
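
To illustrate what a fixed connect and read timeout means at the `HttpURLConnection` level (the class that appears in the stack trace above), here is a rough sketch. `TimeoutConnectionSketch` is a hypothetical name and this is not the SDK's `ConnectionUtils` code; it only shows where a configurable value would plug in.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical sketch: opens an HTTP connection with a caller-supplied timeout. */
final class TimeoutConnectionSketch {

    private TimeoutConnectionSketch() {}

    /**
     * Prepares (but does not yet connect) an HttpURLConnection whose connect
     * and read timeouts are both set to timeoutMillis.
     */
    static HttpURLConnection open(String endpoint, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setConnectTimeout(timeoutMillis); // max time to establish the TCP connection
        conn.setReadTimeout(timeoutMillis);    // max time to wait for response data
        return conn;
    }
}
```

With a hard-coded value of 1000 ms, a link with a long round-trip time can exceed the limit before any data arrives, which is why a knob like the Python SDK's environment variable would help here.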

@ffeltrinelli

Hi! I talked about this issue and described our custom solution in this article.

@ghost

ghost commented Nov 27, 2022

Any update on this issue? I'm facing the same problem.

@sparrc

sparrc commented Mar 30, 2023

> Unfortunately is not possible to change the underlying connectionTimeout, I can mark this as a feature request if you'd like.
> You can also try to add a custom retry logic since the SDK won't retry IMDS credentials fetching.

Can this issue be closed? Looking at the latest code, the Java SDK has read AWS_METADATA_SERVICE_TIMEOUT since 1.12.40, so the timeout is now configurable:

private static int readTimeoutMillisConfiguration() {
    String stringTimeout = System.getenv(SDKGlobalConfiguration.AWS_METADATA_SERVICE_TIMEOUT_ENV_VAR);
    if (StringUtils.isNullOrEmpty(stringTimeout)) {
        return DEFAULT_TIMEOUT_MILLIS;
    }
    // To match the CLI behavior, we need to support both integers and doubles. We try int first so that we can get exact
    // values, and fall back to double if it doesn't seem to be an int.
    try {
        int timeoutSeconds = Integer.parseInt(stringTimeout);
        return timeoutSeconds * 1000;
    } catch (NumberFormatException e) {
        try {
            double timeoutSeconds = Double.parseDouble(stringTimeout);
            return toIntExact(Math.round(timeoutSeconds * 1000));
        } catch (NumberFormatException ignored) {
            throw new IllegalStateException(SDKGlobalConfiguration.AWS_METADATA_SERVICE_TIMEOUT_ENV_VAR + " environment "
                                            + "variable value does not appear to be an integer or a double: " +
                                            stringTimeout);
        }
    }
}
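
To make the seconds-to-milliseconds mapping concrete, here is a standalone sketch that mirrors the parsing logic above but takes the raw string as a parameter instead of reading the environment. `MetadataTimeoutSketch` and `toTimeoutMillis` are hypothetical names, and the 1000 ms default is an assumption based on the 1s hard-coded value mentioned earlier in this thread.

```java
/** Hypothetical standalone version of the parsing shown above, for experimentation only. */
final class MetadataTimeoutSketch {

    // Assumption: default of 1 second, per the hard-coded 1s timeout discussed in this thread.
    static final int DEFAULT_TIMEOUT_MILLIS = 1000;

    private MetadataTimeoutSketch() {}

    /** Converts a timeout given in seconds (integer or decimal text) to milliseconds. */
    static int toTimeoutMillis(String raw) {
        if (raw == null || raw.isEmpty()) {
            return DEFAULT_TIMEOUT_MILLIS;
        }
        try {
            // Try an exact integer number of seconds first.
            return Integer.parseInt(raw) * 1000;
        } catch (NumberFormatException e) {
            try {
                // Fall back to fractional seconds, rounded to the nearest millisecond.
                return Math.toIntExact(Math.round(Double.parseDouble(raw) * 1000));
            } catch (NumberFormatException ignored) {
                throw new IllegalStateException(
                        "value does not appear to be an integer or a double: " + raw);
            }
        }
    }
}
```

Under these assumptions, `toTimeoutMillis("5")` yields 5000 and `toTimeoutMillis("2.5")` yields 2500, so the environment variable is expressed in seconds, for example `AWS_METADATA_SERVICE_TIMEOUT=5` set before the JVM starts.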

@debora-ito
Member

Yes, thank you @sparrc.

The environment variable AWS_METADATA_SERVICE_TIMEOUT is available in v1.

Note that under normal conditions (that is, not on a high-latency connection), you should not need to increase this timeout. If IMDSv2 credential fetching times out constantly, that may be a symptom of a network problem, and you should identify the actual root cause of the timeouts.

Closing this.
