-
Notifications
You must be signed in to change notification settings - Fork 0
Thread safe sentinel #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In the current version, the sentinel code tries to close all connections immediately after discovering there is a new master. This is a problem in multi-threaded environment, because neither `ConnectionPool.disconnect` nor `Connection.disconnect` are thread-safe. If you call `SentinelConnectionPool.disconnect` after master failover, that will close all connections that are potentially used from other threads, causing all kinds of errors. This change avoids that behavior by adding acquire/release checks, so connections that don't belong the current master are never returned to the pool and they are closed instead.
38a0157
to
bbf553c
Compare
self.get_master_address() | ||
return False | ||
if self.is_master: | ||
if self.master_address != (connection.host, connection.port): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't a similar check be made for the case when self.is_master
is False
? (i.e. you called slave_for
before and the slave you are connected to changed to master
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slave_for
can actually also give you a master connection (only as a fallback, but it still can), so I consider slave_for
as "give me a connection to any redis server, preferably slave" and don't think we need to disconnect such connections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay then.
def _check_connection(self, connection): | ||
if connection.to_be_disconnected: | ||
connection.disconnect() | ||
self.get_master_address() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This resolves the master every every time when a connection is returned to the pool, which might be several times (one for each connection in the pool) after a master failover. This could probably be avoided if we kept track of the "generation" (count how many times the failover has happened) and not doing the refresh if the connection is not from the last generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is, is it a problem? During a failover we do 2*n resolves instead of n.
There is currently no good place to watch for the master change. This to_be_disconnected
flag is basically just a hint.
The correct solution would be to subscribe to the sentinel pub/sub, watch for master changes and do the generation change immediately after that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't cause any problems.
redis#909