Commit 387150a

haaawk and piodul authored
Merge pull request #1114 from haaawk/stream_ids_fix
Stop reusing stream ids of requests that have timed out due to client-side timeout (#1114)

* ResponseFuture: do not return the stream ID on client timeout

When a timeout occurs, the ResponseFuture associated with the query returns its stream ID to the associated connection's free stream ID pool - so that the stream ID can be immediately reused by another query.

However, that is incorrect and dangerous. If query A times out before it receives a response from the cluster, a different query B might be issued on the same connection and stream. If the response for query A arrives earlier than the response for query B, the first one might be misinterpreted as the response for query B.

This commit changes the logic so that stream IDs are not returned on timeout - now, they are only returned after receiving a response.

* Connection: fix tracking of in_flight requests

This commit fixes tracking of in_flight requests. Before it, in case of a client-side timeout, the stream ID was not returned to the pool, but the in_flight counter was decremented anyway. This counter is used to determine if there is a need to wait for stream IDs to be freed - without this patch, it could happen that the driver thought that it could initiate another request due to the in_flight counter being low, but there weren't any free stream IDs to allocate, so an assertion was triggered and the connection was defuncted and opened again.

Now, requests timed out on the client side are tracked in the orphaned_request_ids field, and the in_flight counter is decremented only after the response is received.

* Connection: notify owning pool about released orphaned streams

Before this patch, the following situation could occur:

1. On a single connection, multiple requests are spawned up to the maximum concurrency,
2. We want to issue more requests but we need to wait on a condition variable because requests spawned in 1. took all stream IDs and we need to wait until some of them are freed,
3. All requests from point 1. time out on the client side - we cannot free their stream IDs until the database node responds,
4. Responses for requests issued in point 1. arrive, but the Connection class has no access to the condition variable mentioned in point 2., so no requests from point 2. are admitted,
5. Requests from point 2. waiting on the condition variable time out even though stream IDs are available.

This commit adds an _on_orphaned_stream_released field to the Connection class, and now it notifies the owning pool in case a timed out request receives a late response and a stream ID is freed by calling the _on_orphaned_stream_released callback.

* HostConnection: implement replacing overloaded connections

In a situation of very high overload or poor networking conditions, it might happen that there is a large number of outstanding requests on a single connection. Each request reserves a stream ID which cannot be reused until a response for it arrives, even if the request already timed out on the client side. Because the pool of available stream IDs for a single connection is limited, such a situation might cause the set of free stream IDs to shrink to a very small size (including zero), which will drastically reduce the available concurrency on the connection, or even render it unusable for some time.

In order to prevent this, the following strategy is adopted: when the number of orphaned stream IDs reaches a certain threshold (e.g. 75% of all available stream IDs), the connection becomes marked as overloaded. Meanwhile, a new connection is opened - when it becomes available, it replaces the old one, and the old connection is moved to "trash" where it waits until all its outstanding requests either respond or time out.

This feature is implemented for HostConnection but not for HostConnectionPool, which means that it will only work for clusters which use protocol v3 or newer.

This fix is heavily inspired by the fix for JAVA-1519.

Co-authored-by: Piotr Dulikowski <[email protected]>
1 parent 1759428 commit 387150a
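
To make the hazard from the commit message concrete, here is a minimal toy model of one connection that multiplexes queries over stream IDs. It is only a sketch: `ToyConnection`, `run`, and the `reuse_on_timeout` flag are hypothetical names invented for this illustration and are not part of the driver.

```python
from collections import deque


class ToyConnection:
    """Hypothetical model of one connection multiplexing queries over stream IDs."""

    def __init__(self, num_streams, reuse_on_timeout):
        self.free_ids = deque(range(num_streams))
        self.pending = {}        # stream ID -> name of the query waiting on it
        self.orphaned = set()    # timed-out stream IDs still owed a response
        self.reuse_on_timeout = reuse_on_timeout

    def send(self, query):
        stream_id = self.free_ids.popleft()   # raises IndexError if none is free
        self.pending[stream_id] = query
        return stream_id

    def client_timeout(self, stream_id):
        del self.pending[stream_id]
        if self.reuse_on_timeout:
            self.free_ids.appendleft(stream_id)   # old behaviour: reusable at once
        else:
            self.orphaned.add(stream_id)          # new behaviour: hold until answered

    def receive(self, stream_id, payload):
        """Return (query, payload) for the query that consumes the response, or None."""
        if stream_id in self.orphaned:
            self.orphaned.remove(stream_id)
            self.free_ids.append(stream_id)       # only now is the ID safe to reuse
            return None
        query = self.pending.pop(stream_id)
        self.free_ids.append(stream_id)
        return query, payload


def run(reuse_on_timeout):
    conn = ToyConnection(num_streams=1, reuse_on_timeout=reuse_on_timeout)
    stream_a = conn.send("A")
    conn.client_timeout(stream_a)
    try:
        conn.send("B")            # succeeds only if A's stream ID was handed back
    except IndexError:
        pass                      # no free stream: B has to wait
    return conn.receive(stream_a, "rows for A")   # late response to query A
```

In this model, `run(True)` hands the rows meant for query A to query B, while `run(False)` discards the late response and only then frees the stream ID.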

6 files changed: +158 -30 lines changed
cassandra/cluster.py

Lines changed: 9 additions & 2 deletions
@@ -4361,10 +4361,17 @@ def _on_timeout(self, _attempts=0):

             pool = self.session._pools.get(self._current_host)
             if pool and not pool.is_shutdown:
+                # Do not return the stream ID to the pool yet. We cannot reuse it
+                # because the node might still be processing the query and will
+                # return a late response to that query - if we used such stream
+                # before the response to the previous query has arrived, the new
+                # query could get a response from the old query
                 with self._connection.lock:
-                    self._connection.request_ids.append(self._req_id)
+                    self._connection.orphaned_request_ids.add(self._req_id)
+                    if len(self._connection.orphaned_request_ids) >= self._connection.orphaned_threshold:
+                        self._connection.orphaned_threshold_reached = True

-                pool.return_connection(self._connection)
+                pool.return_connection(self._connection, stream_was_orphaned=True)

         errors = self._errors
         if not errors:
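
The timeout path above only records the stream ID as orphaned and compares the orphan count with `orphaned_threshold`. The sketch below reproduces that bookkeeping on a stand-in object to show when the flag flips. `FakeConnection`, `mark_orphaned`, and the `max_in_flight` value of `2 ** 15` are assumptions made for the example, not the driver's own classes.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in holding only the fields the timeout path touches."""

    max_in_flight = 2 ** 15                      # assumed value for this example
    orphaned_threshold = 3 * max_in_flight // 4  # same formula as in the diff

    def __init__(self):
        self.lock = threading.Lock()
        self.orphaned_request_ids = set()
        self.orphaned_threshold_reached = False


def mark_orphaned(conn, req_id):
    """Record a timed-out stream ID and flag the connection once the threshold is hit."""
    with conn.lock:
        conn.orphaned_request_ids.add(req_id)
        if len(conn.orphaned_request_ids) >= conn.orphaned_threshold:
            conn.orphaned_threshold_reached = True


conn = FakeConnection()
for req_id in range(conn.orphaned_threshold - 1):
    mark_orphaned(conn, req_id)
below_threshold = conn.orphaned_threshold_reached   # one orphan short of the limit

mark_orphaned(conn, conn.orphaned_threshold - 1)
at_threshold = conn.orphaned_threshold_reached      # limit reached
```

With the assumed `max_in_flight`, the threshold works out to 24576 orphaned streams, which is the 75% mentioned in the commit message.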

cassandra/connection.py

Lines changed: 31 additions & 1 deletion
@@ -690,6 +690,7 @@ class Connection(object):

     # The current number of operations that are in flight. More precisely,
     # the number of request IDs that are currently in use.
+    # This includes orphaned requests.
     in_flight = 0

     # Max concurrent requests allowed per connection. This is set optimistically high, allowing
@@ -707,6 +708,20 @@ class Connection(object):
     # request_ids set
     highest_request_id = 0

+    # Tracks the request IDs which are no longer waited on (timed out), but
+    # cannot be reused yet because the node might still send a response
+    # on this stream
+    orphaned_request_ids = None
+
+    # Set to true if the orphaned stream ID count cross configured threshold
+    # and the connection will be replaced
+    orphaned_threshold_reached = False
+
+    # If the number of orphaned streams reaches this threshold, this connection
+    # will become marked and will be replaced with a new connection by the
+    # owning pool (currently, only HostConnection supports this)
+    orphaned_threshold = 3 * max_in_flight // 4
+
     is_defunct = False
     is_closed = False
     lock = None
@@ -733,6 +748,8 @@ class Connection(object):

     _is_checksumming_enabled = False

+    _on_orphaned_stream_released = None
+
     @property
     def _iobuf(self):
         # backward compatibility, to avoid any change in the reactors
@@ -742,7 +759,7 @@ def __init__(self, host='127.0.0.1', port=9042, authenticator=None,
                  ssl_options=None, sockopts=None, compression=True,
                  cql_version=None, protocol_version=ProtocolVersion.MAX_SUPPORTED, is_control_connection=False,
                  user_type_map=None, connect_timeout=None, allow_beta_protocol_version=False, no_compact=False,
-                 ssl_context=None):
+                 ssl_context=None, on_orphaned_stream_released=None):

         # TODO next major rename host to endpoint and remove port kwarg.
         self.endpoint = host if isinstance(host, EndPoint) else DefaultEndPoint(host, port)
@@ -764,6 +781,8 @@ def __init__(self, host='127.0.0.1', port=9042, authenticator=None,
         self._io_buffer = _ConnectionIOBuffer(self)
         self._continuous_paging_sessions = {}
         self._socket_writable = True
+        self.orphaned_request_ids = set()
+        self._on_orphaned_stream_released = on_orphaned_stream_released

         if ssl_options:
             self._check_hostname = bool(self.ssl_options.pop('check_hostname', False))
@@ -1188,11 +1207,22 @@ def process_msg(self, header, body):
                 decoder = paging_session.decoder
                 result_metadata = None
             else:
+                need_notify_of_release = False
+                with self.lock:
+                    if stream_id in self.orphaned_request_ids:
+                        self.in_flight -= 1
+                        self.orphaned_request_ids.remove(stream_id)
+                        need_notify_of_release = True
+                if need_notify_of_release and self._on_orphaned_stream_released:
+                    self._on_orphaned_stream_released()
+
                 try:
                     callback, decoder, result_metadata = self._requests.pop(stream_id)
                 # This can only happen if the stream_id was
                 # removed due to an OperationTimedOut
                 except KeyError:
+                    with self.lock:
+                        self.request_ids.append(stream_id)
                     return

         try:
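
The `process_msg` change above releases an orphaned stream in two steps: the counters are adjusted under the connection lock, and the pool callback is invoked only after the lock is dropped. A simplified sketch of that ordering follows. `StreamTracker` and `handle_response` are hypothetical names, and the real method also decodes the message, which is left out here.

```python
import threading


class StreamTracker:
    """Hypothetical, simplified model of the orphan bookkeeping in process_msg."""

    def __init__(self, on_orphaned_stream_released=None):
        self.lock = threading.Lock()
        self.in_flight = 0
        self.request_ids = []              # free stream IDs
        self.orphaned_request_ids = set()  # timed out, response still pending
        self._requests = {}                # stream ID -> callback
        self._on_orphaned_stream_released = on_orphaned_stream_released

    def handle_response(self, stream_id):
        """Return the callback waiting on stream_id, or None for a late response."""
        released = False
        with self.lock:
            if stream_id in self.orphaned_request_ids:
                self.in_flight -= 1
                self.orphaned_request_ids.remove(stream_id)
                released = True
        # Call out only after dropping the lock, so the pool can take its own locks.
        if released and self._on_orphaned_stream_released:
            self._on_orphaned_stream_released()

        try:
            return self._requests.pop(stream_id)
        except KeyError:
            # Nobody is waiting any more: the ID is finally safe to reuse.
            with self.lock:
                self.request_ids.append(stream_id)
            return None


notifications = []
tracker = StreamTracker(on_orphaned_stream_released=lambda: notifications.append("released"))
tracker.in_flight = 1
tracker.orphaned_request_ids.add(7)   # the request on stream 7 timed out client-side

late = tracker.handle_response(7)     # the node answers after the timeout
```

After the late response, stream 7 is back in `request_ids`, `in_flight` is zero, and the callback has fired exactly once.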

cassandra/pool.py

Lines changed: 80 additions & 17 deletions
@@ -390,6 +390,10 @@ def __init__(self, host, host_distance, session):
         # this is used in conjunction with the connection streams. Not using the connection lock because the connection can be replaced in the lifetime of the pool.
         self._stream_available_condition = Condition(self._lock)
         self._is_replacing = False
+        # Contains connections which shouldn't be used anymore
+        # and are waiting until all requests time out or complete
+        # so that we can dispose of them.
+        self._trash = set()

         if host_distance == HostDistance.IGNORED:
             log.debug("Not opening connection to ignored host %s", self.host)
@@ -399,42 +403,59 @@ def __init__(self, host, host_distance, session):
             return

         log.debug("Initializing connection for host %s", self.host)
-        self._connection = session.cluster.connection_factory(host.endpoint)
+        self._connection = session.cluster.connection_factory(host.endpoint, on_orphaned_stream_released=self.on_orphaned_stream_released)
         self._keyspace = session.keyspace
         if self._keyspace:
             self._connection.set_keyspace_blocking(self._keyspace)
         log.debug("Finished initializing connection for host %s", self.host)

-    def borrow_connection(self, timeout):
+    def _get_connection(self):
         if self.is_shutdown:
             raise ConnectionException(
                 "Pool for %s is shutdown" % (self.host,), self.host)

         conn = self._connection
         if not conn:
             raise NoConnectionsAvailable()
+        return conn
+
+    def borrow_connection(self, timeout):
+        conn = self._get_connection()
+        if conn.orphaned_threshold_reached:
+            with self._lock:
+                if not self._is_replacing:
+                    self._is_replacing = True
+                    self._session.submit(self._replace, conn)
+                    log.debug(
+                        "Connection to host %s reached orphaned stream limit, replacing...",
+                        self.host
+                    )

         start = time.time()
         remaining = timeout
         while True:
             with conn.lock:
-                if conn.in_flight < conn.max_request_id:
+                if not (conn.orphaned_threshold_reached and conn.is_closed) and conn.in_flight < conn.max_request_id:
                     conn.in_flight += 1
                     return conn, conn.get_request_id()
             if timeout is not None:
                 remaining = timeout - time.time() + start
                 if remaining < 0:
                     break
             with self._stream_available_condition:
-                self._stream_available_condition.wait(remaining)
+                if conn.orphaned_threshold_reached and conn.is_closed:
+                    conn = self._get_connection()
+                else:
+                    self._stream_available_condition.wait(remaining)

         raise NoConnectionsAvailable("All request IDs are currently in use")

-    def return_connection(self, connection):
-        with connection.lock:
-            connection.in_flight -= 1
-        with self._stream_available_condition:
-            self._stream_available_condition.notify()
+    def return_connection(self, connection, stream_was_orphaned=False):
+        if not stream_was_orphaned:
+            with connection.lock:
+                connection.in_flight -= 1
+            with self._stream_available_condition:
+                self._stream_available_condition.notify()

         if connection.is_defunct or connection.is_closed:
             if connection.signaled_error and not self.shutdown_on_error:
@@ -461,6 +482,24 @@ def return_connection(self, connection):
                         return
                     self._is_replacing = True
                     self._session.submit(self._replace, connection)
+        else:
+            if connection in self._trash:
+                with connection.lock:
+                    if connection.in_flight == len(connection.orphaned_request_ids):
+                        with self._lock:
+                            if connection in self._trash:
+                                self._trash.remove(connection)
+                                log.debug("Closing trashed connection (%s) to %s", id(connection), self.host)
+                                connection.close()
+                return
+
+    def on_orphaned_stream_released(self):
+        """
+        Called when a response for an orphaned stream (timed out on the client
+        side) was received.
+        """
+        with self._stream_available_condition:
+            self._stream_available_condition.notify()

     def _replace(self, connection):
         with self._lock:
@@ -469,17 +508,23 @@ def _replace(self, connection):

         log.debug("Replacing connection (%s) to %s", id(connection), self.host)
         try:
-            conn = self._session.cluster.connection_factory(self.host.endpoint)
+            conn = self._session.cluster.connection_factory(self.host.endpoint, on_orphaned_stream_released=self.on_orphaned_stream_released)
             if self._keyspace:
                 conn.set_keyspace_blocking(self._keyspace)
             self._connection = conn
         except Exception:
             log.warning("Failed reconnecting %s. Retrying." % (self.host.endpoint,))
             self._session.submit(self._replace, connection)
         else:
-            with self._lock:
-                self._is_replacing = False
-                self._stream_available_condition.notify()
+            with connection.lock:
+                with self._lock:
+                    if connection.orphaned_threshold_reached:
+                        if connection.in_flight == len(connection.orphaned_request_ids):
+                            connection.close()
+                        else:
+                            self._trash.add(connection)
+                    self._is_replacing = False
+                    self._stream_available_condition.notify()

     def shutdown(self):
         with self._lock:
@@ -493,6 +538,16 @@ def shutdown(self):
             self._connection.close()
             self._connection = None

+        trash_conns = None
+        with self._lock:
+            if self._trash:
+                trash_conns = self._trash
+                self._trash = set()
+
+        if trash_conns is not None:
+            for conn in trash_conns:
+                conn.close()
+
     def _set_keyspace_for_all_conns(self, keyspace, callback):
         if self.is_shutdown or not self._connection:
             return
@@ -548,7 +603,7 @@ def __init__(self, host, host_distance, session):

         log.debug("Initializing new connection pool for host %s", self.host)
         core_conns = session.cluster.get_core_connections_per_host(host_distance)
-        self._connections = [session.cluster.connection_factory(host.endpoint)
+        self._connections = [session.cluster.connection_factory(host.endpoint, on_orphaned_stream_released=self.on_orphaned_stream_released)
                              for i in range(core_conns)]

         self._keyspace = session.keyspace
@@ -652,7 +707,7 @@ def _add_conn_if_under_max(self):

         log.debug("Going to open new connection to host %s", self.host)
         try:
-            conn = self._session.cluster.connection_factory(self.host.endpoint)
+            conn = self._session.cluster.connection_factory(self.host.endpoint, on_orphaned_stream_released=self.on_orphaned_stream_released)
             if self._keyspace:
                 conn.set_keyspace_blocking(self._session.keyspace)
             self._next_trash_allowed_at = time.time() + _MIN_TRASH_INTERVAL
@@ -712,9 +767,10 @@ def _wait_for_conn(self, timeout):

         raise NoConnectionsAvailable()

-    def return_connection(self, connection):
+    def return_connection(self, connection, stream_was_orphaned=False):
         with connection.lock:
-            connection.in_flight -= 1
+            if not stream_was_orphaned:
+                connection.in_flight -= 1
             in_flight = connection.in_flight

         if connection.is_defunct or connection.is_closed:
@@ -750,6 +806,13 @@ def return_connection(self, connection):
             else:
                 self._signal_available_conn()

+    def on_orphaned_stream_released(self):
+        """
+        Called when a response for an orphaned stream (timed out on the client
+        side) was received.
+        """
+        self._signal_available_conn()
+
     def _maybe_trash_connection(self, connection):
         core_conns = self._session.cluster.get_core_connections_per_host(self.host_distance)
         did_trash = False
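
The `on_orphaned_stream_released` hooks in this file exist so that a borrower blocked on the pool's condition variable is woken when a late response frees a stream. The following sketch shows that wake-up in isolation. `TinyPool` and its `_free_streams` counter are hypothetical simplifications: in the driver the free stream IDs live on the connection, not on the pool.

```python
import threading


class TinyPool:
    """Hypothetical pool: borrowers wait on a condition until a stream is free."""

    def __init__(self, free_streams):
        self._lock = threading.Lock()
        self._stream_available_condition = threading.Condition(self._lock)
        self._free_streams = free_streams

    def borrow(self, timeout):
        """Take one stream; return False if none became free within the timeout."""
        with self._stream_available_condition:
            got_one = self._stream_available_condition.wait_for(
                lambda: self._free_streams > 0, timeout)
            if got_one:
                self._free_streams -= 1
            return got_one

    def on_orphaned_stream_released(self):
        """Called by the connection when a late response frees an orphaned stream."""
        with self._stream_available_condition:
            self._free_streams += 1
            self._stream_available_condition.notify()


pool = TinyPool(free_streams=0)
results = []
waiter = threading.Thread(target=lambda: results.append(pool.borrow(timeout=5.0)))
waiter.start()
pool.on_orphaned_stream_released()   # a late response arrives and wakes the waiter
waiter.join()
```

Without the notification, the waiter in this sketch would sleep for the full timeout even though a stream had become available, which is the situation described in step 5 of the commit message.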

tests/unit/.noseids

29.4 KB
Binary file not shown.

tests/unit/test_host_connection_pool.py

Lines changed: 10 additions & 10 deletions
@@ -26,7 +26,6 @@
 from cassandra.pool import Host, NoConnectionsAvailable
 from cassandra.policies import HostDistance, SimpleConvictionPolicy

-
 class _PoolTests(unittest.TestCase):
     PoolImpl = None
     uses_single_connection = None
@@ -45,7 +44,7 @@ def test_borrow_and_return(self):
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         c, request_id = pool.borrow_connection(timeout=0.01)
         self.assertIs(c, conn)
@@ -64,7 +63,7 @@ def test_failed_wait_for_connection(self):
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         self.assertEqual(1, conn.in_flight)
@@ -82,7 +81,7 @@ def test_successful_wait_for_connection(self):
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         self.assertEqual(1, conn.in_flight)
@@ -110,7 +109,7 @@ def test_spawn_when_at_max(self):
         session.cluster.get_max_connections_per_host.return_value = 2

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         self.assertEqual(1, conn.in_flight)
@@ -133,7 +132,7 @@ def test_return_defunct_connection(self):
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         conn.is_defunct = True
@@ -148,11 +147,12 @@ def test_return_defunct_connection_on_down_host(self):
         host = Mock(spec=Host, address='ip1')
         session = self.make_session()
         conn = NonCallableMagicMock(spec=Connection, in_flight=0, is_defunct=False, is_closed=False,
-                                    max_request_id=100, signaled_error=False)
+                                    max_request_id=100, signaled_error=False,
+                                    orphaned_threshold_reached=False)
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         conn.is_defunct = True
@@ -169,11 +169,11 @@ def test_return_closed_connection(self):
         host = Mock(spec=Host, address='ip1')
         session = self.make_session()
         conn = NonCallableMagicMock(spec=Connection, in_flight=0, is_defunct=False, is_closed=True, max_request_id=100,
-                                    signaled_error=False)
+                                    signaled_error=False, orphaned_threshold_reached=False)
         session.cluster.connection_factory.return_value = conn

         pool = self.PoolImpl(host, HostDistance.LOCAL, session)
-        session.cluster.connection_factory.assert_called_once_with(host.endpoint)
+        session.cluster.connection_factory.assert_called_once_with(host.endpoint, on_orphaned_stream_released=pool.on_orphaned_stream_released)

         pool.borrow_connection(timeout=0.01)
         conn.is_closed = True
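
The updated assertions pass `pool.on_orphaned_stream_released` as the expected keyword argument. That works because `unittest.mock` compares call arguments with `==`, and two bound-method objects taken from the same instance compare equal. A small self-contained sketch, with a hypothetical `TinyPool` standing in for the pool classes:

```python
from unittest.mock import Mock


class TinyPool:
    """Hypothetical pool that hands its callback to the connection factory."""

    def __init__(self, endpoint, connection_factory):
        self.connection = connection_factory(
            endpoint, on_orphaned_stream_released=self.on_orphaned_stream_released)

    def on_orphaned_stream_released(self):
        pass


factory = Mock()
pool = TinyPool("ip1", factory)

# Each attribute access builds a fresh bound-method object, but bound methods
# compare equal when they wrap the same function on the same instance.
equal = pool.on_orphaned_stream_released == pool.on_orphaned_stream_released

# Raises AssertionError if the factory was called differently or more than once.
factory.assert_called_once_with(
    "ip1", on_orphaned_stream_released=pool.on_orphaned_stream_released)
```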
