
S3 multipart upload unable to recover from server-side failures #5415

@steveloughran


Describe the bug

This is covered in HADOOP-19221.

S3 multipart upload fails to recover from server-side failures.
It repeats the attempted POST but does not supply enough data, so the S3 store returns 400:
"Your socket connection to the server was not read from or written to within the timeout period"

This failure has been observed in a live system.
We have also been able to replicate it ourselves through fault injection, converting the 200 response into a 500.

Once the failure is injected, all subsequent retries fail with 400 + RequestTimeout responses until the retry limit is reached.

This is a regression from the V1 SDK, whose upload requests supported a filename and offset as well as data size. It was clearly able to repeat a request by seeking to the original offset and streaming up the data again.
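For contrast, a minimal sketch of the V1-era pattern (the helper method and its parameters are illustrative, not taken from our code):

import java.io.File;

import com.amazonaws.services.s3.model.UploadPartRequest;

class V1UploadPartExample {
    // The V1 request names the file, offset and length, so the SDK can
    // reopen the file and seek back to the offset when it retries the part.
    static UploadPartRequest buildPartRequest(String bucket, String key,
            String uploadId, int partNumber, File localFile,
            long offset, long size) {
        return new UploadPartRequest()
                .withBucketName(bucket)
                .withKey(key)
                .withUploadId(uploadId)
                .withPartNumber(partNumber)
                .withFile(localFile)
                .withFileOffset(offset)
                .withPartSize(size);
    }
}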

The new design, while elegant and flexible, seems unable to cope with failures for a foundational use case "upload part of a file". This is unfortunate.

This is a critical enough failure that we have worked around it ourselves by implementing our own content providers for file sources and others; the fault injection used to replicate the failure allows us to be confident that we have fixed it in our own code.

We have not attempted fault injection through the TransferManager on the basis that there is nothing we can do to fix it. We are simply going to document the error and recommend against using it to upload data from the local file system.

Expected Behavior

Retrying the upload of a part in a multipart upload should succeed.

Current Behavior

After the first failure is returned (500 etc.), all retries fail with a 400 error and a message that not enough data was supplied.

org.apache.hadoop.fs.s3a.AWSBadRequestException: upload part #1 upload ID KylDNXbmiZlZE5JI2aKVcVeA66lyRaUZWmKv5K3QfGcjgqjpMAsnbRkJRug8jf3kdjXuL_zVt09813gtOMS9CSxwTDbfGR4xOLPUmZZ9suEJGh.iolwJgMnSXk5KFxnG on test/testCommitOperations[array-wrap-true]: software.amazon.awssdk.services.s3.model.S3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: S3, Status Code: 400, Request ID: TT17CRYF6HJH2G0Y, Extended Request ID: d7VKx+iXtShSdp/uVQhrDAhC7dOGbZiZE6ul1I/k19RRDc6wRHyKSkBJU82ta22ns+J7QbbL5XY=):RequestTimeout: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: S3, Status Code: 400, Request ID: TT17CRYF6HJH2G0Y, Extended Request ID: d7VKx+iXtShSdp/uVQhrDAhC7dOGbZiZE6ul1I/k19RRDc6wRHyKSkBJU82ta22ns+J7QbbL5XY=)
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:260)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:205)
at org.apache.hadoop.fs.s3a.WriteOperationHelper.uploadPart(WriteOperationHelper.java:589)
at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.uploadFileDataAsInputStream(CommitOperations.java:686)
at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.uploadFileToPendingCommit(CommitOperations.java:586)
at org.apache.hadoop.fs.s3a.commit.magic.ITestUploadRecovery.testCommitOperations(ITestUploadRecovery.java:280)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: S3, Status Code: 400, Request ID: TT17CRYF6HJH2G0Y, Extended Request ID: d7VKx+iXtShSdp/uVQhrDAhC7dOGbZiZE6ul1I/k19RRDc6wRHyKSkBJU82ta22ns+J7QbbL5XY=)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50)
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
at software.amazon.awssdk.services.s3.DefaultS3Client.uploadPart(DefaultS3Client.java:11515)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$uploadPart$16(S3AFileSystem.java:3345)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfSupplier(IOStatisticsBinding.java:651)
at org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:3342)
at org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$9(WriteOperationHelper.java:594)
at org.apache.hadoop.fs.store.audit.AuditingFunctions.lambda$withinAuditSpan$0(AuditingFunctions.java:62)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
... 24 more
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: null (Service: S3, Status Code: 500, Request ID: TZ0558DJES2Z0HF3, Extended Request ID: NZRAAV8C5lcYmt1knZyj3jQbUxvEQFgOdOI+Od/xwx+jZVTLOTZyXb87acVfI6SoE5IybofBiDY=)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: S3, Status Code: 400, Request ID: FRGQYV48RNTQDADX, Extended Request ID: r7drlnxjjdIbZCfmCg5i5Y/a622hxSMO+JK7idhL49zJOvLvCwwDUPxbgW8rbxAWTvoN76cCnp4=)

Reproduction Steps

Steps to replicate:

  • Supply the request body from an existing input stream, via RequestBody.fromInputStream(stream, size).
  • Submit a POST request such that the request body is successfully uploaded.
  • Have the far end return a 500 response, which triggers a retry within the SDK (see the interceptor sketch below).
  • Wait about 30 seconds.
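As an illustration of the fault-injection step, here is a sketch of an ExecutionInterceptor that rewrites the first 200 response into a 500. This is not our actual test code and the class name is made up, but modifyHttpResponse() is the SDK's own hook and runs inside the retry loop:

import java.util.concurrent.atomic.AtomicBoolean;

import software.amazon.awssdk.core.interceptor.Context;
import software.amazon.awssdk.core.interceptor.ExecutionAttributes;
import software.amazon.awssdk.core.interceptor.ExecutionInterceptor;
import software.amazon.awssdk.http.SdkHttpResponse;

// Rewrites the first 200 response into a 500, forcing the SDK onto its
// retry path after the request body stream has already been consumed.
public class FaultInjectionInterceptor implements ExecutionInterceptor {
    private final AtomicBoolean injected = new AtomicBoolean(false);

    @Override
    public SdkHttpResponse modifyHttpResponse(Context.ModifyHttpResponse context,
            ExecutionAttributes executionAttributes) {
        SdkHttpResponse response = context.httpResponse();
        if (response.statusCode() == 200 && injected.compareAndSet(false, true)) {
            return response.toBuilder().statusCode(500).build();
        }
        return response;
    }
}

Registering the interceptor through ClientOverrideConfiguration.builder().addExecutionInterceptor(...) applies it to every request the client issues.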

Fragment of the code which generates this:

// Stream the part data from the local file; this stream can be read only once.
InputStream fileStream = Files.newInputStream(localFile.toPath());
...
UploadPartRequest part = createUploadRequest().build();
// fromInputStream() cannot rewind the stream, so a retry has no data to resend.
RequestBody body = RequestBody.fromInputStream(fileStream, size);
UploadPartResponse response = s3Client.uploadPart(part, body);

Possible Solution

Implement a ContentProvider which can open a file at a given offset, supplying a specified number of bytes, and on a retry repeat that operation. This is what we have done, restoring the behaviour of the V1 SDK.
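A minimal sketch of such a provider, assuming the SDK's ContentStreamProvider interface and RequestBody.fromContentProvider(); the class names here are ours, not the SDK's:

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import software.amazon.awssdk.http.ContentStreamProvider;

// Reopens the file at the same offset for every attempt, so a retry
// streams the full part again instead of an exhausted InputStream.
final class FileSectionContentProvider implements ContentStreamProvider {
    private final Path file;
    private final long offset;
    private final long length;
    private InputStream current;

    FileSectionContentProvider(Path file, long offset, long length) {
        this.file = file;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public InputStream newStream() {
        try {
            // Close any stream from a previous attempt before opening a new one.
            if (current != null) {
                current.close();
            }
            FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)
                    .position(offset);
            current = new BoundedStream(Channels.newInputStream(channel), length);
            return current;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Caps reads at the declared part length so the body matches Content-Length.
    private static final class BoundedStream extends FilterInputStream {
        private long remaining;

        BoundedStream(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int b = in.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (remaining <= 0) {
                return -1;
            }
            int n = in.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }
    }
}

The upload site then passes the provider instead of a raw stream:

RequestBody body = RequestBody.fromContentProvider(
    new FileSectionContentProvider(localFile.toPath(), 0, size),
    size, "application/octet-stream");
UploadPartResponse response = s3Client.uploadPart(part, body);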

There is clearly also something failing silently in the SDK on its second and subsequent attempts. The client does not raise any error about having an insufficient quantity of data; it simply reattempts the POST without supplying enough bytes to match the Content-Length header.
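A plain-Java demonstration of the underlying hazard, independent of the SDK: a single-use InputStream is exhausted by the first attempt, so a naive retry has no body bytes left to send against the original Content-Length.

import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class StreamReplayDemo {
    public static void main(String[] args) throws Exception {
        InputStream stream = new ByteArrayInputStream(new byte[]{1, 2, 3, 4});
        byte[] buffer = new byte[8];
        System.out.println(stream.read(buffer)); // 4: the first attempt reads all the data
        System.out.println(stream.read(buffer)); // -1: the retry finds the stream exhausted
    }
}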

Additional Information/Context

No response

AWS Java SDK version used

2.25.53

JDK version used

openjdk version "1.8.0_362"
OpenJDK Runtime Environment (Zulu 8.68.0.21-CA-macos-aarch64) (build 1.8.0_362-b09)
OpenJDK 64-Bit Server VM (Zulu 8.68.0.21-CA-macos-aarch64) (build 25.362-b09, mixed mode)

Operating System and version

macOS (Darwin Kernel Version 23.5.0: Wed May 1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000, arm64)


Labels

bug (This issue is a bug.), needs-triage (This issue or PR still needs to be triaged.)
