Fix MOVED errors by not randomly selecting another node #1001
Conversation
@vmihailenco thoughts?
True, and go-redis will follow the MOVED/ASK error and send the request to the right node on the next retry. What you are proposing makes sense, but
It fixes it by not creating artificial MOVED errors. Networking in AWS is pretty flaky, and we were getting consistent retryable errors that would lead to forced MOVED errors for no reason at all. This obviously won't fix the retryable errors (i/o timeouts), but it will prevent us from wasting more round trips to Redis. Moreover, without this fix the end behaviour isn't any different when a node is timing out everything: the node may still keep timing out, but a random hop is performed that just delays the inevitable return to the same original node.
The code seems to already lazily reload the state if any error has occurred, so I don't think that's the case. Moreover, I don't think it's ideal to depend on randomly selecting nodes to detect that a node is down and the topology has changed. Thoughts?
I am experiencing this problem as well, and I believe this would fix the issue. @vmihailenco I believe this PR is not about fixing timeouts in the Redis client; it is about making retries in a clustered Redis actually hit the nodes that store the information you're trying to operate on. We might as well speed up this process instead of wasting retries asking around for where the information is.
@vmihailenco we've hit this issue again in production. Is there any way we can get this merged?
Just to make this super clear. Current behavior:
Suggested behavior:
I don't like this change because:
At scale, cloud networks are generally unreliable, so with the current behaviour we consistently get incorrect MOVED errors, and this gets pretty bad. The suggested behaviour is to retry on the same node, but with a potentially different connection. We currently set the retry count to 3 because 8 is way too many attempts; essentially, the best behaviour is to fail fast. The current behaviour doesn't actually give any benefit other than causing MOVED errors. MOVED errors should never happen unless the cluster has failed over, and that's when you should refresh the cluster topology.
Until context support recently landed, this was the only way to ensure minimal blocking on Redis calls. A larger timeout is not acceptable for our latency requirements, which are the reason we're using Redis in the first place.
Just because a connection has timed out doesn't mean the topology has changed. The topology should only change when a MOVED error is received (or when the slots are refetched at an interval). Trying random nodes does not help with the topology case at all; it only causes artificial MOVED errors and puts more pressure on the cluster by making needless redirects. The reason I say needless is that for writes there are no other nodes in the cluster that can serve the request other than the current node, so it's best to fail fast in this case. If the topology has changed (the current node has indeed failed), then on the next state refresh you'll have the new cluster slots. You're optimizing for a rare case (failover) by making the common case (networking blips) worse.

The same applies to reads, except that there you can retry another replica instead of hitting the same node. This is more correct behaviour than hitting a random node because there's a higher chance that another replica of the same slot can serve the request. Hitting random nodes for reads just causes more network hops and more MOVED errors. If a replica has failed, trying another replica for the same slot is the best path forward.

tl;dr: depending on network timeouts to detect failure leads to more issues than it solves, especially in a cloud environment at scale.
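The fail-fast policy argued for above could be sketched roughly like this: keep retries on the same node, surface timeouts to the caller instead of hopping elsewhere. This is a simplified stand-in with a hypothetical `do` closure and sentinel timeout error, not go-redis's actual retry loop:

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout stands in for an i/o timeout (hypothetical sentinel;
// the real client inspects net.Error instead).
var errTimeout = errors.New("i/o timeout")

// retrySameNode retries a command on the same node up to maxRetries
// extra times instead of hopping to a random node. Timeouts surface
// to the caller (fail fast) rather than being converted into
// artificial MOVED redirects by asking the wrong node.
func retrySameNode(maxRetries int, do func() error) error {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = do(); err == nil {
			return nil
		}
		if !errors.Is(err, errTimeout) {
			return err // non-retryable: report immediately
		}
	}
	return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
}
```

A retry budget of 3 (as mentioned above) would mean at most four attempts on the same node before the error reaches the caller.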
PTAL at #1056 |
After some investigation we noticed a lot of MOVED errors that always seemed to happen when we received i/o timeouts. The original request selected the correct master node (for writes) or the correct master/slave (for reads); however, if an i/o timeout or other retryable error is received, a random node will be used after two attempts on the original node. This seems wrong, as a random node will never be suitable to serve the request: Redis will almost always send back MOVED errors because the random node is either not a master, not the right master, or not the right slave.
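For reference, a MOVED reply from the server names the hash slot and the node that owns it (e.g. `MOVED 3999 127.0.0.1:6381`), which is what the client follows on redirect. A minimal parser for such a reply might look like this (hypothetical helper names, not go-redis's API):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// movedError holds the slot and address parsed from a Redis MOVED
// reply such as "MOVED 3999 127.0.0.1:6381".
type movedError struct {
	Slot int
	Addr string
}

// parseMoved splits a MOVED reply into its slot number and node address.
func parseMoved(reply string) (movedError, error) {
	parts := strings.Split(reply, " ")
	if len(parts) != 3 || parts[0] != "MOVED" {
		return movedError{}, fmt.Errorf("not a MOVED reply: %q", reply)
	}
	slot, err := strconv.Atoi(parts[1])
	if err != nil {
		return movedError{}, fmt.Errorf("bad slot in MOVED reply: %w", err)
	}
	return movedError{Slot: slot, Addr: parts[2]}, nil
}
```

Every redirect the random-node behaviour provokes is one of these replies plus an extra round trip to the node it points at.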
For writes, you cannot attempt any node other than the currently selected master node. The best path here is probably to respect the redirect configuration and keep retrying on that node.
For reads we can check if `ReadOnly` is set and retry on another slave; otherwise we have to behave the same as for writes.

Let me know if something along these lines would be acceptable and I can add some tests.