Skip to content

Allow immediate request failure on connection error #625

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

dnadoba
Copy link
Collaborator

@dnadoba dnadoba commented Sep 6, 2022

Motivation

In integration tests we often want to inspect the connection error thrown during connection establishment for a request. Currently the only way to get at the connection error is to wait until the request timeout is expired. This is because of the connection retry behaviour backed into AHC which cannot be turned off. AHC will try to create new connections on connection failure with an exponential backoff. It stores the last connection and will fail the request if the request times out.

Modification

Adds a new internal flag called retryConnectionEstablishment to HTTPClient.Configuration.ConnectionPool.
If false, requests will fail immediately after a connection could not be established. Otherwise the old behavior is used which is still the default.

Set retryConnectionEstablishment to false to speedup test from ~40s to ~1s in HTTPClientNIOTSTests.swift. More tests will follow.

Result

Allows fast and reliable test execution.

@dnadoba dnadoba added the 🔨 semver/patch No public API change. label Sep 6, 2022
@Lukasa Lukasa requested a review from fabianfett September 6, 2022 16:45
`AsyncAwaitEndToEndTests.testImmediateDeadline` failed in CI. Hard to debug without more information. Doesn’t fail locally with 1000 repetitions.
Comment on lines 262 to 268
guard self.retryConnectionEstablishment else {
return .init(
request: self.failAllRequests(reason: error),
connection: .none
)
}

Copy link
Member

@fabianfett fabianfett Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird! If we always fail all requests, when we a connection attempt failed, we should never reschedule a connection attempt. And this means we should never reach this point. Since we never schedule a backoff timer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitingForConnectivity is called if a NWConnection transitions into the waiting state which has nothing to do with our own connection retry logic. This means that it will be called if a NWConnection "fails" instead of failedToCreateNewConnection because the connection will do its own retry thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a code comment, that this is only triggered in the Network.framework flow?

If we don't want to retry connection establishment, NW.framework should not go into the waiting state, should it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the code again. We have a different flag (networkFrameworkWaitForConnectivity) for disabling Network.frameworks waiting behaviour.

Copy link
Member

@fabianfett fabianfett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a bit more...

@@ -219,6 +223,17 @@ extension HTTPConnectionPool {

switch self.lifecycleState {
case .running:
guard self.retryConnectionEstablishment else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using the guard here it might be nicer to go with where clause in the switch...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A where constraint only applies to the last case if multiple cases are defined e.g.

case .running, .shuttingDown where self.retryConnectionEstablishment == true:

the where constrain only applies to .shuttingDown but not .running. We had an issue in AHC which we debugged together and decided we should stop using where clauses in switch cases altogether because it is not obvious.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, however here we only have one case here. Anyway, not a hill I want to die on. If we want to stick with an inline if/guard, I would prefer this to be an if self.retryConnectionEstablishment {} and then move the retry logic into the if.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of the if? Guard at least guarantees that we return, if can fall through which we don't want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider it nicer to read. But that's personal...

@@ -401,13 +405,38 @@ extension HTTPConnectionPool {
self.failedConsecutiveConnectionAttempts += 1
self.lastConnectFailure = error

guard self.retryConnectionEstablishment else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we check the current running state here? Like in the http1 case? This seems weird. I think we should add this here. If the pool is closing, we should not backoff anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agree this is weird but has nothing to do with this change as we don't check the state at all. We should tackle this in a separate PR as we likely want to add test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnadoba dnadoba requested a review from fabianfett October 10, 2022 09:28
Copy link
Member

@fabianfett fabianfett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need TODOs for the incorrect http2 shutdown behavior, and we should change handling in waiting for NW.framework and/or document it better.

@@ -219,6 +223,17 @@ extension HTTPConnectionPool {

switch self.lifecycleState {
case .running:
guard self.retryConnectionEstablishment else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, however here we only have one case here. Anyway, not a hill I want to die on. If we want to stick with an inline if/guard, I would prefer this to be an if self.retryConnectionEstablishment {} and then move the retry logic into the if.

Comment on lines 262 to 268
guard self.retryConnectionEstablishment else {
return .init(
request: self.failAllRequests(reason: error),
connection: .none
)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a code comment, that this is only triggered in the Network.framework flow?

If we don't want to retry connection establishment, NW.framework should not go into the waiting state, should it?

Copy link
Member

@fabianfett fabianfett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing through!

@dnadoba dnadoba merged commit f17a47e into swift-server:main Oct 10, 2022
@dnadoba dnadoba deleted the dn-allow-disabling-connection-retry branch October 10, 2022 12:34
@dnadoba dnadoba restored the dn-allow-disabling-connection-retry branch October 10, 2022 12:34
@dnadoba dnadoba deleted the dn-allow-disabling-connection-retry branch October 10, 2022 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants