Skip to content

ResultSet: handle empty non-final pages on ResultSet iteration #1110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 17, 2021
Merged

ResultSet: handle empty non-final pages on ResultSet iteration #1110

merged 1 commit into from
Sep 17, 2021

Conversation

ultrabug
Copy link
Contributor

This commit provides a fix to the situation when iterating on a
ResultSet, the driver aborts the iteration if the server returns
an empty page even if there are next pages available.

Python driver is affected by the same problem as JAVA-2934
This fix is similar to apache/cassandra-java-driver#1544

This commit provides a fix to the situation when iterating on a
ResultSet, the driver aborts the iteration if the server returns
an empty page even if there are next pages available.

Python driver is affected by the same problem as JAVA-2934
This fix is similar to apache/cassandra-java-driver#1544
@ultrabug
Copy link
Contributor Author

I just added tests to the commit to demonstrate the problem and the fix.

rs = ResultSet(response_future, [])
itr = iter(rs)
self.assertListEqual(list(itr), expected)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice test case... a very effective demonstration of the problem!

@@ -5141,6 +5141,7 @@ def next(self):
if not self.response_future._continuous_paging_session:
self.fetch_next_page()
self._page_iter = iter(self._current_rows)
return self.next()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just a bit worried that this complicates next() a bit more than is necessary. The intermingling of self.next() and the global next(iter) functions here aren't as clear as one might like. Since our main goal is to ensure that we get a non-empty page out of fetch_next_page() why not test that explicitly?

        if not self.response_future._continuous_paging_session:
            # Some servers may return empty pages (Scylla is known to do so
            # in at least some cases) so make sure we have something non-empty
            # before we return.
            while True:
                self.fetch_next_page()
                if self._current_rows:
                    break
            self._page_iter = iter(self._current_rows)

Your code is effectively doing the same thing, it's just doing it by recursively calling self.next() to do so. I'm wondering if we can avoid the recursion entirely (and be clearer about our intent) by just handling the empty page case here.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @absurdfarce

IMHO the right place to handle this is right where I've put it in the code logic because it makes it clearer that "in that case", we need to "replay the same logic of handling the iteration on pages".

As such, I feel the recursion is clean and less confusing than trying to be smart in a function which name has nothing to do with what needs to be done in the situation that we're covering here.

Once again, that's just my opinion of course and as you can see, that one line is all it takes ;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in getting back to you @ultrabug. I'm trying to juggle a number of things at once, ideally without dropping any of them on the floor. :)

I take your point about the minimal number of changes required to support your fix. I'm not 💯 sure I agree with doing it this way but I can certainly see the benefit of your approach. How about something of a compromise: perhaps you could add a simple comment in there noting that (a) empty pages are possible in some impls and (b) if the page we just fetched happens to be empty we'll do the right thing when we recurse? I think if that were there it would go a long way towards addressing my concern about clarity of the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ultrabug After thinking about this some more there's no reason to hold up accepting this PR for docs. What you have here is a good change and I certainly appreciate your work so far (including the aforementioned nice test!). I'll merge this in now and sort out the documentation question later.

@absurdfarce
Copy link
Collaborator

@ultrabug Thanks for the pull request! I had a question on the PR itself but I like what you have here, especially the nice unit test to clearly demonstrate the problem.

Have you signed the Contributor License Agreement for contributions to DataStax open source projects? If not you can find it at https://cla.datastax.com. Thanks!

@ultrabug
Copy link
Contributor Author

ultrabug commented Sep 1, 2021

@aboudreault

Have you signed the Contributor License Agreement for contributions to DataStax open source projects

I just did, yes. Thank you!

@absurdfarce absurdfarce merged commit 1d9077d into datastax:master Sep 17, 2021
absurdfarce added a commit that referenced this pull request Sep 17, 2021
@ultrabug
Copy link
Contributor Author

Thank you @absurdfarce , I see you added the comment as well 👍 sorry for the late reply

fruch added a commit to fruch/python-driver that referenced this pull request Aug 2, 2022
…to sync_with_upstream

* 'master' of https://github.com/datastax/python-driver:
  Merge pull request datastax#1126 from eamanu/fix-typos
  PYTHON-1294: Upgrade importlib-metadata to a much newer version
  Add tests for recent addition of execution profile support to cassandra.concurrent
  Merge pull request datastax#1122 from andy-slac/concurrent-execution-profiles
  Merge pull request datastax#1119 from datastax/python-1290
  Merge pull request datastax#1117 from datastax/remove_unittest2
  Removing file unexpectedly included in previous PR
  Merge pull request datastax#1114 from haaawk/stream_ids_fix
  Merge pull request datastax#1116 from Orenef11/fix_default_argument_value
  Comment update following off of datastax#1110
  Merge pull request datastax#1103 from numberly/fix_empty_paging
  Merge pull request datastax#1103 from psarna/fix_deprecation_in_tracing
  Fixes to the Travis build. (datastax#1111)
fruch added a commit to fruch/python-driver that referenced this pull request Aug 2, 2022
…to sync_with_upstream_2

* 'master' of https://github.com/datastax/python-driver:
  Merge pull request datastax#1126 from eamanu/fix-typos
  PYTHON-1294: Upgrade importlib-metadata to a much newer version
  Add tests for recent addition of execution profile support to cassandra.concurrent
  Merge pull request datastax#1122 from andy-slac/concurrent-execution-profiles
  Merge pull request datastax#1119 from datastax/python-1290
  Merge pull request datastax#1117 from datastax/remove_unittest2
  Removing file unexpectedly included in previous PR
  Merge pull request datastax#1114 from haaawk/stream_ids_fix
  Merge pull request datastax#1116 from Orenef11/fix_default_argument_value
  Comment update following off of datastax#1110
  Merge pull request datastax#1103 from numberly/fix_empty_paging
  Merge pull request datastax#1103 from psarna/fix_deprecation_in_tracing
  Fixes to the Travis build. (datastax#1111)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants