Skip to content

HTTPClientResponseDelegate is not notified of some errors (connection related?). #242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kuwerty opened this issue Jun 13, 2020 · 12 comments · Fixed by #245
Closed

HTTPClientResponseDelegate is not notified of some errors (connection related?). #242

kuwerty opened this issue Jun 13, 2020 · 12 comments · Fixed by #245
Assignees
Labels
kind/bug Feature doesn't work as expected.
Milestone

Comments

@kuwerty
Copy link

kuwerty commented Jun 13, 2020

Tested against commit 8add6b8 on Ubuntu 16.04 with Swift version 5.2.4 (swift-5.2.4-RELEASE)

I've been debugging an issue where an HTTPClientResponseDelegate does not get notified of some errors via the didReceiveError(task:...) method.

My expectation was that HTTPClientResponseDelegate would see any error throughout the entire lifecycle of the request, not just some subset of them.

I've pasted a test case below that demonstrates the issue with remoteConnectionClosed errors but I've seen it with connectTimeout too.

I can appreciate that the error was returned in Task.futureResult but it took quite a bit of effort to work that out (no matter how obvious it is in retrospect). However if the current behaviour is intended then I would still push that the documentation for didReceiveError is updated to emphasize it only sees some subset of errors and not "any network-related error" as it states now.

Thank you!

    func testDelegateConnectionFailError() {
        class TestDelegate: HTTPClientResponseDelegate {
            typealias Response = Void

            var error : Error?

            func didFinishRequest(task: HTTPClient.Task<Response>) throws {}

            func didReceiveError(task: HTTPClient.Task<Response>, _ error: Error) -> EventLoopFuture<Void> {
                self.error = error

                return task.eventLoop.makeSucceededFuture(())
            }
        }

        let localHTTPBin = HTTPBin(refusesConnections: true)
        let localClient = HTTPClient(eventLoopGroupProvider: .shared(self.clientGroup))
        defer {
            XCTAssertNoThrow(try localClient.syncShutdown())
            XCTAssertNoThrow(try localHTTPBin.shutdown())
        }

        let delegate = TestDelegate()

        do {
            let request = try HTTPClient.Request(url: "http://localhost:\(localHTTPBin.port)/get")
            try localClient.execute(request: request, delegate: delegate).wait()
            XCTFail("Shouldn't succeed")
        } catch {
            print("error: \(error)")
            print("delegate.error: \(String(describing: delegate.error))")
            XCTAssert(delegate.error != nil)
        }
    }

error: HTTPClientError.remoteConnectionClosed
delegate.error: nil
/data/async-http-client/Tests/AsyncHTTPClientTests/HTTPClientTests.swift:1569: error: HTTPClientTests.foo : XCTAssertTrue failed -
@weissi weissi added kind/bug Feature doesn't work as expected. ⚠️ release blocker labels Jun 13, 2020
@weissi weissi added this to the 1.2.0 milestone Jun 13, 2020
@weissi
Copy link
Contributor

weissi commented Jun 13, 2020

I do share your expectation and I agree the delegate should get all errors. This is quite a serious bug so this needs fixing before 1.2.0.

Thank you so much for putting together the test for this!

@artemredkin artemredkin self-assigned this Jun 13, 2020
@artemredkin
Copy link
Collaborator

artemredkin commented Jun 13, 2020

I think the error is not in the client, but in the delegate, it should be:

func didReceiveError(task: HTTPClient.Task<Response>, _ error: Error) {
    self.error = error
}

and not:

func didReceiveError(task: HTTPClient.Task<Response>, _ error: Error) -> EventLoopFuture<Void> {
}

Delegate is method is not called because its not overriding it, it seems having default implementation here can hide errors...

@artemredkin artemredkin removed kind/bug Feature doesn't work as expected. ⚠️ release blocker labels Jun 13, 2020
@artemredkin
Copy link
Collaborator

@weissi do you think we should drop default implementations in 2.0.0?

@kuwerty
Copy link
Author

kuwerty commented Jun 13, 2020

Oops. You're absolutely right, I messed that test up completely. Apologies!

I've managed to reproduce the same thing with a corrected delegate. I think this is much closer to the actual failure mode we see in our real application.

What we see in our application is a connectTimeout error when talking to S3. This happens rarely, maybe every 10k requests or so. It isn't unusual behaviour for S3, we've see the same thing on a range of clients/languages and normally you just retry the request and it succeeds.

So what I've done is written a stupid HTTP server in C with some sleeps that are designed to provoke the connectTimeout error in NIO.

With the correct didReceiveError function signature you can see the first few requests return readTimeout (you'd expect those errors given the sleep in the server). You can then see a bunch of requests where the futureResult error is connectTimeout but the delegate did not see any error.

error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:connectTimeout(NIO.TimeAmount(nanoseconds: 10000000000)) vs delegate:nil
error:connectTimeout(NIO.TimeAmount(nanoseconds: 10000000000)) vs delegate:nil
error:HTTPClientError.readTimeout vs delegate:Optional(HTTPClientError.readTimeout)
error:connectTimeout(NIO.TimeAmount(nanoseconds: 10000000000)) vs delegate:nil

Revised test code:


    func testDelegateConnectionFailError() {
        class TestDelegate: HTTPClientResponseDelegate {
            typealias Response = Void

            var error : Error?

            func didReceiveHead(task: HTTPClient.Task<Response>, _ head: HTTPResponseHead) -> EventLoopFuture<Void> {
                return task.eventLoop.makeSucceededFuture(())
            }

            func didFinishRequest(task: HTTPClient.Task<Response>) throws {}

            func didReceiveError(task: HTTPClient.Task<Response>, _ error: Error) {
                self.error = error
            }
        }


        let timeout = AsyncHTTPClient.HTTPClient.Configuration.Timeout(connect: .seconds(3), read: .seconds(3))

        let configuration = HTTPClient.Configuration(timeout: timeout)

        let localClient = HTTPClient(eventLoopGroupProvider: .shared(self.clientGroup),
                                     configuration: configuration)

        defer {
            XCTAssertNoThrow(try localClient.syncShutdown())
        }

        func single(_ url:String) -> HTTPClient.Task<Void> {
            let delegate = TestDelegate()

            // let request = try HTTPClient.Request(url: "http://localhost:\(localHTTPBin.port)/get")
            // let request = try HTTPClient.Request(url: "http://www.google.com:81")
            let request = try! HTTPClient.Request(url: url)
            let task = localClient.execute(request: request, delegate: delegate)

            task.futureResult.whenComplete { result in
                switch result {
                    case .failure(let error):
                        print("error:\(error) vs delegate:\(String(describing: delegate.error))")

                        XCTAssert(delegate.error != nil)

                    case .success:
                        break
                }
            }

            return task
        }

        var tasks = [HTTPClient.Task<Void>]()

        for _ in 0..<10 {
            let task = single("http://localhost:8080")

            tasks.append(task)
        }

        for task in tasks {
            try? task.wait()
        }
    }


Server code:


#include <stdio.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>
#define PORT 8080
#define SA struct sockaddr

int main()
{
  int sockfd, connfd;
  struct sockaddr_in servaddr = {};

  // socket create and verification
  sockfd = socket(AF_INET, SOCK_STREAM, 0);
  if(sockfd == -1) {
    printf("socket creation failed...\n");
    exit(0);
  }
  
  int tr=1;
  if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &tr, sizeof(int)) == -1) {
      perror("setsockopt");
      exit(1);
  }

  // assign IP, PORT
  servaddr.sin_family = AF_INET;
  servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
  servaddr.sin_port = htons(PORT);

  // Binding newly created socket to given IP and verification
  if ((bind(sockfd, (SA*)&servaddr, sizeof(servaddr))) != 0) {
    printf("Failed to bind socket\n");
    exit(0);
  }

  // Now server is ready to listen and verification
 if ((listen(sockfd, 4)) != 0) {
   printf("Listen failed\n");
   exit(0);
 }

  printf("Server listening...\n");

  while(1) {
    // Accept the data packet from client and verification 
    struct sockaddr_in cli = {};
    socklen_t len;
    connfd = accept(sockfd, (SA*)&cli, &len); 
    if (connfd < 0) { 
        printf("server acccept failed...\n"); 
        exit(0); 
    } 
    else
        printf("server acccept the client...\n"); 

    sleep(4);

    // Function for chatting between client and server 
    const char * msg = "HTTP/1.0 200 OK\nContent-Length:5\n\nHello";
    write(connfd, msg, strlen(msg));

    close(connfd);

    sleep(4);
  }
}


@weissi
Copy link
Contributor

weissi commented Jun 13, 2020

@weissi do you think we should drop default implementations in 2.0.0?

Hmm, I think Swift should get its act together and offer a implements/override func for implementing protocol requirements: https://bugs.swift.org/browse/SR-13007 :). It's quite annoying having to implement all methods...

@artemredkin
Copy link
Collaborator

Ah, that does indeed look like bug, I think I even can see there reason for it.

@artemredkin
Copy link
Collaborator

Good catch, @kuwerty, thank you!

@kuwerty
Copy link
Author

kuwerty commented Jun 14, 2020

That's awesome, thanks for the speedy response!

@weissi
Copy link
Contributor

weissi commented Jun 14, 2020

Cool, this server can be super straightforwardly implemented in NIO too if that helps for a test case. Happy to write the server code

@kuwerty
Copy link
Author

kuwerty commented Jun 16, 2020

@weissi are you looking for a test case that provokes the connectTimeout error specifically? I took a stab at that in NIO, using HTTPBin, but I just couldn't get it to result in the same error. If you have a good idea on how to do that then go ahead ;-)

The PR from @artemredkin forces a connection refused by shutting down HTTPBin early, it looks like the same class of error with the same behaviour in the delegate.

@weissi
Copy link
Contributor

weissi commented Jun 16, 2020

@kuwerty I don't think HTTPBin can do it (because it's a web server) but this is a pretty direct translation of your C code:

import NIO

let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
defer {
    try! group.syncShutdownGracefully()
}

final class DelayReadByFourSecondsHandler: ChannelDuplexHandler {
    typealias InboundIn = Channel
    typealias OutboundIn = Never

    private var numberOfReadsSeen = 0
    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        print("server accepted connection from \(self.unwrapInboundIn(data).remoteAddress!)")
        context.fireChannelRead(data)
    }

    func read(context: ChannelHandlerContext) {
        defer {
            self.numberOfReadsSeen += 1
        }
        if self.numberOfReadsSeen == 0 {
            context.read() // First read goes through straight away
        } else {
            // All subsequent reads need to wait 8s (4 + 4).
            context.eventLoop.scheduleTask(in: .seconds(8)) {
                context.read()
            }
        }
    }
}

final class SendBackCannedResponseHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
    typealias OutboundOut = ByteBuffer

    func channelActive(context: ChannelHandlerContext) {
        // Four seconds after we accepted a connection, send back some bytes.
        context.eventLoop.scheduleTask(in: .seconds(4)) {
            let buffer = context.channel.allocator.buffer(string: "HTTP/1.0 200 OK\nContent-Length:5\n\nHello")
            context.writeAndFlush(self.wrapOutboundOut(buffer)).whenComplete { _ in
                context.channel.close(promise: nil)
            }
        }
    }
}

let server = try ServerBootstrap(group: group)
    .serverChannelOption(ChannelOptions.backlog, value: 4) // Backlog 4
    .serverChannelOption(ChannelOptions.maxMessagesPerRead, value: 1) // Accept only 1 connection at a time.
    .serverChannelInitializer { channel in
        // In the server channel (where we accept connections), we want to delay the accepting.
        channel.pipeline.addHandler(DelayReadByFourSecondsHandler())
    }
    .childChannelInitializer { channel in
        // In the client channel (the actual TCP connection), we want to send back a canned response.
        channel.pipeline.addHandler(SendBackCannedResponseHandler())
    }
    .bind(to: SocketAddress(ipAddress: "0.0.0.0", port: 8080))
    .wait()
print("Server up and running at \(server.localAddress!)")

try server.closeFuture.wait() // wait for the server to close (never).

into NIO. Could you see if that reproduces your problem in the same way?

@kuwerty
Copy link
Author

kuwerty commented Jun 17, 2020

@weissi That was super helpful. I've got a test that fails before artemredkin's fix and succeeds after: 80d03b4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Feature doesn't work as expected.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants