Thread safe sentinel #1


Merged
@lalinsky merged 1 commit into exponea:2.10.6-exponea from thread-safe-sentinel on Oct 4, 2017

Conversation

@lalinsky commented Oct 1, 2017

@lalinsky requested review from capkovic, vetyy and ms7s on October 1, 2017 08:27
In the current version, the sentinel code tries to close all
connections immediately after discovering there is a new master.
This is a problem in a multi-threaded environment, because
neither `ConnectionPool.disconnect` nor `Connection.disconnect` is
thread-safe. If you call `SentinelConnectionPool.disconnect` after a
master failover, it will close connections that may still be in use
by other threads, causing all kinds of errors.

This change avoids that behavior by adding acquire/release checks, so
connections that don't belong to the current master are never returned
to the pool; they are closed instead.
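
For illustration, a minimal sketch of the release-time check described above. Only the `to_be_disconnected` flag and the general shape of `SentinelConnectionPool` come from this PR; the class and method names below are otherwise made up and do not reproduce the actual diff.

class SketchConnection(object):
    """Stand-in for a redis connection; only the fields the check needs."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.to_be_disconnected = False  # set when the pool learns of a new master

    def disconnect(self):
        pass  # would close the underlying socket


class SketchSentinelPool(object):
    """Stand-in for SentinelConnectionPool: instead of disconnecting every
    connection when the master changes (not thread-safe), stale connections
    are closed lazily by the thread that releases them."""

    def __init__(self):
        self.is_master = True
        self.master_address = None
        self._available = []

    def on_new_master(self, address):
        # Just record the new address; never touch in-use connections here.
        self.master_address = address

    def release(self, connection):
        stale = (connection.to_be_disconnected or
                 (self.is_master and
                  self.master_address != (connection.host, connection.port)))
        if stale:
            # Closed by the owning thread, never returned to the pool.
            connection.disconnect()
            return
        self._available.append(connection)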
@lalinsky force-pushed the thread-safe-sentinel branch from 38a0157 to bbf553c on October 2, 2017 15:26
        self.get_master_address()
        return False
    if self.is_master:
        if self.master_address != (connection.host, connection.port):

Shouldn't a similar check be made for the case when self.is_master is False? (i.e. you called slave_for before and the slave you are connected to changed to master)

@lalinsky (Author)

slave_for can actually also give you a master connection (only as a fallback, but it still can), so I consider slave_for as "give me a connection to any redis server, preferably a slave" and don't think we need to disconnect such connections.


Ah, okay then.
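
For context on the slave_for fallback mentioned above, a short usage sketch; the sentinel address, service name and key below are made up.

from redis.sentinel import Sentinel

# Made-up sentinel address and service name, for illustration only.
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.5)

# slave_for prefers a replica, but can fall back to the master if no
# replica is available -- "any redis server, preferably slave".
conn = sentinel.slave_for('mymaster', socket_timeout=0.5)
print(conn.get('some-key'))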

def _check_connection(self, connection):
    if connection.to_be_disconnected:
        connection.disconnect()
        self.get_master_address()

This resolves the master every time a connection is returned to the pool, which might be several times (once for each connection in the pool) after a master failover. This could probably be avoided if we kept track of the "generation" (count how many times the failover has happened) and skipped the refresh if the connection is not from the last generation.
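
A rough sketch of that generation idea; everything here is illustrative (the PR itself only has the to_be_disconnected hint, and none of the names below exist in it).

class GenerationSketch(object):
    """Count failovers and re-resolve the master once per failover,
    instead of once per connection returned to the pool."""

    def __init__(self):
        self.generation = 0            # bumped on every detected failover
        self._resolved_generation = 0  # last generation the master was resolved for

    def note_failover(self):
        self.generation += 1

    def release(self, connection):
        if connection.generation != self.generation:
            # Stale connection from before the failover: close it, and
            # refresh the master address only if we haven't done so yet
            # for the current generation.
            connection.disconnect()
            if self._resolved_generation != self.generation:
                self.get_master_address()
                self._resolved_generation = self.generation
            return
        self.return_to_pool(connection)

    def get_master_address(self):
        pass  # would ask the sentinels for the current master

    def return_to_pool(self, connection):
        pass  # would put the connection back on the free list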

@lalinsky (Author)

The question is, is it a problem? During a failover we do 2*n resolves instead of n.

There is currently no good place to watch for the master change. This to_be_disconnected flag is basically just a hint.

The correct solution would be to subscribe to the sentinel pub/sub, watch for master changes, and bump the generation immediately when that happens.
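
For reference, a minimal sketch of that pub/sub approach using the standard Sentinel +switch-master notification. The sentinel address is made up, and in a real pool this loop would run in a background thread and bump the generation instead of printing.

import redis

# Made-up sentinel address, for illustration only.
sentinel_client = redis.StrictRedis(host='localhost', port=26379,
                                    decode_responses=True)

pubsub = sentinel_client.pubsub()
pubsub.subscribe('+switch-master')

for message in pubsub.listen():
    if message['type'] != 'message':
        continue
    # Payload format: "<master-name> <old-ip> <old-port> <new-ip> <new-port>"
    name, old_ip, old_port, new_ip, new_port = message['data'].split()
    print('master %s moved from %s:%s to %s:%s'
          % (name, old_ip, old_port, new_ip, new_port))
    # Here the pool would mark existing connections stale / bump the generation.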


Shouldn't cause any problems.

@lalinsky merged commit bbf553c into exponea:2.10.6-exponea on Oct 4, 2017