Hi there,
I've been in touch with AWS support, and they encouraged me to post my findings here. We're struggling with an application where an AmazonS3Client instance suddenly starts returning ClientExecutionTimeoutException.
Our application polls a path in S3 for changes every 100 ms. At each interval, it lists objects (listObjectsV2), gets them (getObject), and deletes them (deleteObject). It is a long-lived application that should be able to run for weeks and months.
I have the relevant source code in this gist: https://gist.github.com/hawkaa/31335a77e6b2dae4f828225940d78a59
Every now and then, roughly every 5-10 days, the client suddenly starts throwing ClientExecutionTimeoutException after the configured timeout, and it keeps doing so for every subsequent request. Because the requests are executed serially, throughput drops from 10 requests per second to one request every 25 seconds (which is our timeout). The stack trace looks like this:
03:17:23.506 [pool-2-thread-7] ERROR c.s.s.d.dataplatform.probe.Scheduler - Uncaught exception
com.amazonaws.http.timers.client.ClientExecutionTimeoutException: Client execution did not complete before the specified timeout configuration.
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleInterruptedException(AmazonHttpClient.java:788)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:665)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:647)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:511)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4221)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4168)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4162)
    at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:821)
Restarting the application immediately solves the problem, which indicates that the problem is in the code and not with S3. My main suspect is the AmazonHttpClient. It seems to end up in a deadlocked state, or to have some full buffers, causing new requests to time out.
It could also be that we do not clean up the response objects properly, so that some buffers fill up, but since the application successfully completes 8-9M requests (which is what we have in ~10 days) before failing, I find that unlikely.
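For reference, this is roughly the cleanup pattern I mean. It is only a sketch and not our production code: FetchedObject and ObjectReader are made-up names, with FetchedObject standing in for the SDK's S3Object, which as far as I understand holds on to its HTTP connection until the content is fully read or the object is closed.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a fetched object that keeps a pooled
// connection busy until it is closed (the role S3Object plays).
interface FetchedObject extends Closeable {
    InputStream content();
}

final class ObjectReader {
    private ObjectReader() {}

    // Reads the whole body into memory. try-with-resources closes both the
    // stream and the object, even if reading fails halfway through.
    static byte[] readFully(FetchedObject object) {
        try (FetchedObject o = object; InputStream in = o.content()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If any code path skips the close, I would expect connections to leak slowly rather than all at once, which might fit a failure that only shows up after millions of requests.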
Our planned fix is to periodically rebuild the AmazonS3Client instance, but we haven't put this code into production yet.
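A minimal sketch of what that workaround could look like, assuming all requests go through a single holder. RefreshingClient and its parameters are hypothetical names, not SDK classes. In our case the factory would presumably build the AmazonS3Client and the disposer would call its shutdown() method.

```java
import java.util.function.Consumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder that hands out a client and replaces it with a
// freshly built one once the current instance is older than maxAgeMillis.
final class RefreshingClient<T> {
    private final Supplier<T> factory;     // builds a new client
    private final long maxAgeMillis;       // rebuild interval
    private final LongSupplier clock;      // current time in milliseconds
    private final Consumer<T> disposer;    // releases the old client

    private T current;
    private long createdAt;

    RefreshingClient(Supplier<T> factory, long maxAgeMillis,
                     LongSupplier clock, Consumer<T> disposer) {
        this.factory = factory;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.disposer = disposer;
    }

    // Returns the current client, rebuilding it first if it is too old.
    synchronized T get() {
        long now = clock.getAsLong();
        if (current == null || now - createdAt >= maxAgeMillis) {
            T old = current;
            current = factory.get();
            createdAt = now;
            if (old != null) {
                disposer.accept(old);
            }
        }
        return current;
    }
}
```

The polling loop would call get() before each cycle, passing something like System::currentTimeMillis as the clock. Disposing the old client right away is only safe here because our requests run serially, so nothing is in flight when the swap happens.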
Do you think this could be a bug in the SDK? Let me know if you have more questions.
Håkon