Skip to content

Bug fix for S3AsyncClient.putObject hangs if there is a connection reset while uploading of objects #3522

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

joviegas
Copy link
Contributor

@joviegas joviegas commented Oct 27, 2022

Motivation and Context

  • Bug fix for S3AsyncClient.putObject hangs if there is a connection reset while uploading of objects

Root Cause

  • PR# 1481 added logic to skip RunError steps in ResponseHandler.#notifyIfResponseNotCompleted.
  • In the case of PutObject call, the putObject request sends Http100Continue
  • When Http100Continue is sent HttpResponseStatus CONTINUE = newStatus(100, "Continue") is received , which actually ignores the body, sends the lastByte and immediately starts reading the request stream.
  • If there was a connection reset because of PR# 1481 it only checked that lastByte was set and thus skipped the Run Error steps.
  • Since it skipped RunError the error was not propagated and it got stuck.

Modifications

  • Added a new Flag to indicate the HttpResponseStatus.CONTINUE(HTTP 100 Status code) is recieved.
  • This flag is used in combination to lastByte flag such that We need to report error even if its last byte and if it was part of 100 status code response.

Testing

  • Tested by sending a huge object in S3Asycn.putObject and then disconnecting from VPN in between transfer. Expected outcome is it throws TimeOut Exceptions rather than waiting infinitely.
  • Added J units for this case by mocking server disconnects when transfer is in progress.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@joviegas joviegas requested a review from a team as a code owner October 27, 2022 14:16
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 2 Code Smells

74.1% 74.1% Coverage
0.0% 0.0% Duplication

IOException err = new IOException(NettyUtils.closedChannelMessage(handlerCtx.channel()));
runAndLogError(handlerCtx.channel(), () -> "Fail to execute SdkAsyncHttpResponseHandler#onError",
() -> requestCtx.handler().onError(err));
executeFuture(handlerCtx).completeExceptionally(err);
runAndLogError(handlerCtx.channel(), () -> "Could not release channel", () -> closeAndRelease(handlerCtx));
} else if (!Boolean.TRUE.equals(responseCompleted)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is a bit confusing. Does it make sense to rearrange it with !Boolean.TRUE.equals(responseCompleted) in an outer if clause and then just a simple if/else checking isLastByteWithout100Response nested within?

if (!Boolean.TRUE.equals(responseCompleted)) {
    if (!isLastByteWithout100Continue) {
            IOException err = new IOException(NettyUtils.closedChannelMessage(handlerCtx.channel()));
            runAndLogError(handlerCtx.channel(), () -> "Fail to execute SdkAsyncHttpResponseHandler#onError",
                           () -> requestCtx.handler().onError(err));
            executeFuture(handlerCtx).completeExceptionally(err);
            runAndLogError(handlerCtx.channel(), () -> "Could not release channel", () -> closeAndRelease(handlerCtx));
        } else {
            log.trace(handlerCtx.channel(),
                      () -> "Run error skipped because lastHttpContentReceived is "
                            + handlerCtx.channel().attr(LAST_HTTP_CONTENT_RECEIVED_KEY).get() + " and 100ContinueMessage is "
                            + handlerCtx.channel().attr(RESPONSE_100_CONTINUE_MESSAGE).get());
        }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not required in new changes

@@ -463,18 +464,28 @@ public void cancel() {
private void notifyIfResponseNotCompleted(ChannelHandlerContext handlerCtx) {
RequestContext requestCtx = handlerCtx.channel().attr(REQUEST_CONTEXT_KEY).get();
Boolean responseCompleted = handlerCtx.channel().attr(RESPONSE_COMPLETE_KEY).get();
Boolean lastHttpContentReceived = handlerCtx.channel().attr(LAST_HTTP_CONTENT_RECEIVED_KEY).get();
boolean isLastByteWithout100Continue = isLastByteWithout100Response(handlerCtx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more explanatory name for this boolean? I think the channel key name is correct but here it'd be great if you'd know why a 100 continue disables the lastbyte check. isLastByteForResponseBody? (not sure if this is 100% correct)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated with new PR to remove LastHttpContentHandler.

@@ -137,6 +160,20 @@ private Throwable captureException() {
throw new AssertionError("Call did not fail as expected.");
}

private Throwable captureExceptionWithHttpContinueResponse() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Consider adding the future as a parameter to captureException and keep it to one method

Comment on lines +211 to +228
.responseHandler(new SdkAsyncHttpResponseHandler() {
private SdkHttpResponse headers;

@Override
public void onHeaders(SdkHttpResponse headers) {
this.headers = headers;
}

@Override
public void onStream(Publisher<ByteBuffer> stream) {
Flowable.fromPublisher(stream).forEach(b -> {
});
}

@Override
public void onError(Throwable error) {
}
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could be simplified by extracting the response handler

@@ -196,6 +197,7 @@ private void configureChannel() {
channel.attr(REQUEST_CONTEXT_KEY).set(context);
channel.attr(RESPONSE_COMPLETE_KEY).set(false);
channel.attr(LAST_HTTP_CONTENT_RECEIVED_KEY).set(false);
channel.attr(RESPONSE_100_CONTINUE_MESSAGE).set(false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to mark the completion of request in HttpStreamHandler and get rid of LastHttpContentHandler? That way, we don't need to handle 100 continue message this classe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised #3535 to change the logic where LAST_HTTP_CONTENT_RECEIVED_KEY is set

@joviegas
Copy link
Contributor Author

joviegas commented Nov 3, 2022

Closing since I have raised new PR #3535

@joviegas joviegas closed this Nov 3, 2022
@joviegas joviegas deleted the joviegas/last_byte branch December 20, 2022 00:38
aws-sdk-java-automation added a commit that referenced this pull request Dec 16, 2024
…40713e5ab

Pull request: release <- staging/cccb42d8-dd62-4d2f-8e1c-cc140713e5ab
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants