ResultSet: handle empty non-final pages on ResultSet iteration #1110

ultrabug · 2021-08-30T10:06:31Z

This commit fixes the case where, while iterating over a ResultSet,
the driver aborts the iteration when the server returns an empty page
even though more pages are available.

The Python driver is affected by the same problem as JAVA-2934.
This fix is similar to apache/cassandra-java-driver#1544.
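
To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is a toy model, not the driver's code: `PagedResults`, its `skip_empty_pages` flag, and the in-memory `pages` list are hypothetical names invented for illustration, and only the control flow is meant to resemble paged iteration.

```python
class PagedResults:
    """Toy stand-in for a paged result set (hypothetical, not the driver's ResultSet)."""

    def __init__(self, pages, skip_empty_pages):
        self._pages = list(pages)  # pages not yet fetched, each a list of rows
        self._skip_empty_pages = skip_empty_pages
        self._page_iter = iter([])

    @property
    def has_more_pages(self):
        return bool(self._pages)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._page_iter)
        except StopIteration:
            if not self.has_more_pages:
                raise
        # The current page is exhausted but more pages remain: fetch the next one.
        self._page_iter = iter(self._pages.pop(0))
        if self._skip_empty_pages:
            # Fixed behaviour: replay the same logic, so an empty page
            # simply triggers another fetch.
            return self.__next__()
        # Buggy behaviour: an empty page raises StopIteration right here,
        # ending the iteration although pages are still pending.
        return next(self._page_iter)


pages = [[1, 2], [], [3]]  # the middle page is empty but not final
buggy = list(PagedResults(pages, skip_empty_pages=False))
fixed = list(PagedResults(pages, skip_empty_pages=True))
```

In this toy model the buggy variant never reaches the row on the third page, while the fixed variant returns every row.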

ultrabug · 2021-08-30T16:24:50Z

I just added tests to the commit to demonstrate the problem and the fix.

absurdfarce · 2021-09-01T16:24:52Z

tests/unit/test_resultset.py

+        rs = ResultSet(response_future, [])
+        itr = iter(rs)
+        self.assertListEqual(list(itr), expected)
+


Nice test case... a very effective demonstration of the problem!

absurdfarce · 2021-09-01T16:30:13Z

cassandra/cluster.py

@@ -5141,6 +5141,7 @@ def next(self):
        if not self.response_future._continuous_paging_session:
            self.fetch_next_page()
            self._page_iter = iter(self._current_rows)
+            return self.next()


I'm just a bit worried that this complicates next() a bit more than is necessary. The intermingling of self.next() and the global next(iter) function here isn't as clear as one might like. Since our main goal is to ensure that we get a non-empty page out of fetch_next_page(), why not test that explicitly?

    if not self.response_future._continuous_paging_session:
        # Some servers may return empty pages (Scylla is known to do so
        # in at least some cases) so make sure we have something non-empty
        # before we return.
        while True:
            self.fetch_next_page()
            if self._current_rows:
                break
        self._page_iter = iter(self._current_rows)

Your code is effectively doing the same thing; it just does so by recursively calling self.next(). I'm wondering if we can avoid the recursion entirely (and be clearer about our intent) by just handling the empty page case here.

What do you think?
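
For comparison, the two strategies under discussion can be sketched side by side in a toy model. Everything here is hypothetical: `ToyPager`, `next_recursive`, `next_loop`, and `drain` are illustrative names, and the extra `not self.has_more_pages` check in the loop is an assumption added so that the toy loop terminates when the final page is empty. It is not part of the suggestion above.

```python
class ToyPager:
    """Hypothetical pager; attribute names only loosely mirror the snippet above."""

    def __init__(self, pages):
        self._pending = list(pages)  # pages not yet fetched
        self._current_rows = []
        self._page_iter = iter(self._current_rows)

    @property
    def has_more_pages(self):
        return bool(self._pending)

    def fetch_next_page(self):
        self._current_rows = self._pending.pop(0) if self._pending else []

    def next_recursive(self):
        try:
            return next(self._page_iter)
        except StopIteration:
            if not self.has_more_pages:
                raise
        self.fetch_next_page()
        self._page_iter = iter(self._current_rows)
        # An empty page is handled by the StopIteration branch on re-entry.
        return self.next_recursive()

    def next_loop(self):
        try:
            return next(self._page_iter)
        except StopIteration:
            if not self.has_more_pages:
                raise
        # Keep fetching until a page has rows or nothing is left to fetch
        # (the second condition is this sketch's own termination guard).
        while True:
            self.fetch_next_page()
            if self._current_rows or not self.has_more_pages:
                break
        self._page_iter = iter(self._current_rows)
        return next(self._page_iter)


def drain(pager, step):
    """Call step(pager) until it raises StopIteration; return the rows seen."""
    rows = []
    while True:
        try:
            rows.append(step(pager))
        except StopIteration:
            return rows


pages = [[], ["a"], [], [], ["b", "c"], []]
recursive_rows = drain(ToyPager(pages), ToyPager.next_recursive)
loop_rows = drain(ToyPager(pages), ToyPager.next_loop)
```

Under these assumptions both variants yield the same rows. The difference is where the empty-page handling lives: the recursive form reuses the existing StopIteration branch, while the loop form states the intent explicitly but needs its own exit condition.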

ultrabug · 2021-09-01

Hi @absurdfarce

IMHO the right place to handle this is right where I've put it in the code logic because it makes it clearer that "in that case", we need to "replay the same logic of handling the iteration on pages".

As such, I feel the recursion is clean and less confusing than trying to be smart in a function whose name has nothing to do with what needs to be done in the situation that we're covering here.

Once again, that's just my opinion of course and as you can see, that one line is all it takes ;)

absurdfarce · 2021-09-15

Apologies for the delay in getting back to you @ultrabug. I'm trying to juggle a number of things at once, ideally without dropping any of them on the floor. :)

I take your point about the minimal number of changes required to support your fix. I'm not 💯 sure I agree with doing it this way but I can certainly see the benefit of your approach. How about something of a compromise: perhaps you could add a simple comment in there noting that (a) empty pages are possible in some impls and (b) if the page we just fetched happens to be empty we'll do the right thing when we recurse? I think if that were there it would go a long way towards addressing my concern about clarity of the code.

absurdfarce · 2021-09-17

@ultrabug After thinking about this some more, there's no reason to hold up accepting this PR for docs. What you have here is a good change and I certainly appreciate your work so far (including the aforementioned nice test!). I'll merge this in now and sort out the documentation question later.

absurdfarce · 2021-09-01T16:31:46Z

@ultrabug Thanks for the pull request! I had a question on the PR itself but I like what you have here, especially the nice unit test to clearly demonstrate the problem.

Have you signed the Contributor License Agreement for contributions to DataStax open source projects? If not you can find it at https://cla.datastax.com. Thanks!

ultrabug · 2021-09-01T20:38:23Z

@aboudreault

Have you signed the Contributor License Agreement for contributions to DataStax open source projects

I just did, yes. Thank you!

ultrabug · 2021-09-18T10:01:08Z

Thank you @absurdfarce, I see you added the comment as well 👍 Sorry for the late reply.

…to sync_with_upstream

* 'master' of https://github.com/datastax/python-driver:
  Merge pull request datastax#1126 from eamanu/fix-typos
  PYTHON-1294: Upgrade importlib-metadata to a much newer version
  Add tests for recent addition of execution profile support to cassandra.concurrent
  Merge pull request datastax#1122 from andy-slac/concurrent-execution-profiles
  Merge pull request datastax#1119 from datastax/python-1290
  Merge pull request datastax#1117 from datastax/remove_unittest2
  Removing file unexpectedly included in previous PR
  Merge pull request datastax#1114 from haaawk/stream_ids_fix
  Merge pull request datastax#1116 from Orenef11/fix_default_argument_value
  Comment update following off of datastax#1110
  Merge pull request datastax#1103 from numberly/fix_empty_paging
  Merge pull request datastax#1103 from psarna/fix_deprecation_in_tracing
  Fixes to the Travis build. (datastax#1111)

…to sync_with_upstream_2

* 'master' of https://github.com/datastax/python-driver:
  Merge pull request datastax#1126 from eamanu/fix-typos
  PYTHON-1294: Upgrade importlib-metadata to a much newer version
  Add tests for recent addition of execution profile support to cassandra.concurrent
  Merge pull request datastax#1122 from andy-slac/concurrent-execution-profiles
  Merge pull request datastax#1119 from datastax/python-1290
  Merge pull request datastax#1117 from datastax/remove_unittest2
  Removing file unexpectedly included in previous PR
  Merge pull request datastax#1114 from haaawk/stream_ids_fix
  Merge pull request datastax#1116 from Orenef11/fix_default_argument_value
  Comment update following off of datastax#1110
  Merge pull request datastax#1103 from numberly/fix_empty_paging
  Merge pull request datastax#1103 from psarna/fix_deprecation_in_tracing
  Fixes to the Travis build. (datastax#1111)

absurdfarce reviewed Sep 1, 2021

absurdfarce merged commit 1d9077d into datastax:master Sep 17, 2021

absurdfarce added a commit that referenced this pull request Sep 17, 2021

Comment update following off of #1110

12a8adc

ultrabug mentioned this pull request Oct 12, 2021

If ALLOW FILTERING excludes a long string of rows, the scan can stop prematurely scylladb/scylladb#8203

Closed
