Description
To simulate a network failure, we rebooted both the primary and replica nodes in an Azure Cache for Redis instance and found that the library reacts differently depending on the host the application is deployed to.
Application
- .NET 5.0 app (uses a factory and lazy implementation for Redis Connection)
- StackExchange.Redis 2.2.62
Expected Result
- Both nodes go down at the same time (or within a small time window).
- The application will report StackExchange.Redis.RedisConnectionException exceptions.
- The nodes will restart and be available approximately 1 minute after they go down.
- The library will reconnect approximately 1 minute after the nodes went down.
Windows & Docker on Windows Result
The application reconnects approximately 1 minute after the nodes went down as expected.
Error:
StackExchange.Redis.RedisConnectionException: No connection is active/available to service this operation: SET N4BDN; It was not possible to connect to the redis server(s). There was an authentication failure; check that passwords (or client certificates) are configured correctly. ConnectTimeout, mc: 1/1/0, mgr: 10 of 10 available, clientName: 02cbef6fa5b6, IOCP: (Busy=0,Free=1000,Min=200,Max=1000), WORKER: (Busy=1,Free=32766,Min=200,Max=32767), v: 2.2.62.27853
---> StackExchange.Redis.RedisConnectionException: It was not possible to connect to the redis server(s). There was an authentication failure; check that passwords (or client certificates) are configured correctly. ConnectTimeout
--- End of inner exception stack trace ---
at StackExchange.Redis.ConnectionMultiplexer.ThrowFailed[T](TaskCompletionSource`1 source, Exception unthrownException) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 2802
--- End of stack trace from previous location ---
Load Test Result
Linux Result
The application throws StackExchange.Redis.RedisTimeoutException errors and does not reconnect for 15 minutes.
Error:
StackExchange.Redis.RedisTimeoutException: Timeout awaiting response (outbound=0KiB, inbound=0KiB, 5570ms elapsed, timeout is 5000ms), command=SET, next: SET FAO1X, inst: 0, qu: 0, qs: 12, aw: False, rs: ReadAsync, ws: Idle, in: 0, serverEndpoint: <instancename>.redis.cache.windows.net:6380, mc: 1/1/0, mgr: 10 of 10 available, clientName: SandboxHost-637654330433879470, IOCP: (Busy=0,Free=1000,Min=200,Max=1000), WORKER: (Busy=2,Free=32765,Min=200,Max=32767), v: 2.2.62.27853 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
Load Test Result
Observations
- When running on a Linux server, you can update the sysctl setting net.ipv4.tcp_retries2. This setting limits how many times an unacknowledged TCP segment is retransmitted, and so decides the total time before a connection failure is declared. After lowering this setting to 5, I found that the application threw the correct type of error, StackExchange.Redis.RedisConnectionException, and reconnected approximately 1 minute after the nodes went down. The downside to making this change is that it is a TCP setting for the whole server, so if you have multiple applications running on that server, they are all affected.
- Installing Docker on the Linux server, updating the sysctl setting net.ipv4.tcp_retries2 to 5, and running the application as a container did not make it reconnect quickly. Updating the setting had no impact on when the application reconnected; it still reconnected after 15 minutes.
- Following the Best Practices guide, you should implement a ForceReconnect method to handle these types of scenarios. The documentation also says: "Don't call ForceReconnect for Timeouts, just for RedisConnectionExceptions or SocketExceptions".
- In this situation, when the application is running on Linux, it throws TimeoutExceptions, for which the documentation says not to call the ForceReconnect code.
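The 15-minute delay on Linux is close to the figure the Linux kernel documentation gives for the default net.ipv4.tcp_retries2 value of 15. Below is a minimal Python sketch of that arithmetic. It assumes the kernel's minimum and maximum retransmission timeouts of 200 ms and 120 s and pure exponential backoff. The function name is illustrative, and real connections derive their retransmission timeout from the measured round-trip time, so treat the result as a rough lower bound rather than an exact figure.

```python
# Hypothetical lower bound on how long Linux keeps retransmitting on an
# established TCP connection before declaring it dead.
# Assumptions: initial RTO = TCP_RTO_MIN (200 ms), RTO doubles after each
# retransmission, capped at TCP_RTO_MAX (120 s).

TCP_RTO_MIN = 0.2    # seconds
TCP_RTO_MAX = 120.0  # seconds


def hypothetical_timeout(tcp_retries2: int) -> float:
    """Sum the backoff intervals for N retransmissions plus the final wait."""
    total = 0.0
    rto = TCP_RTO_MIN
    # N retransmissions means the connection is killed at the (N+1)th RTO.
    for _ in range(tcp_retries2 + 1):
        total += rto
        rto = min(rto * 2, TCP_RTO_MAX)
    return total


if __name__ == "__main__":
    for retries in (15, 5):
        seconds = hypothetical_timeout(retries)
        print(f"tcp_retries2={retries}: {seconds:.1f} s ({seconds / 60:.1f} min)")
```

Under these assumptions the default of 15 gives about 924.6 seconds (roughly 15.4 minutes), and a value of 5 gives about 12.6 seconds. This is consistent with both the 15-minute stall and the much faster failure detection after lowering the setting.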
Questions
- Is this something that can be handled or improved in the StackExchange.Redis library?
- Are there Best Practice TCP Settings that should be used when running on Linux?
- Should the Best Practice be to call ForceReconnect on TimeoutExceptions when running on Linux and also when you encounter RedisConnectionExceptions?