-
Notifications
You must be signed in to change notification settings - Fork 63
refactor: Remove never_skip_nulls param from window node def #2273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| THEN NULL | ||
| ELSE COALESCE(LOGICAL_AND(`bool_col`) OVER (PARTITION BY `string_col`), TRUE) | ||
| END AS `bfcol_2` | ||
| COALESCE(LOGICAL_AND(`bool_col`) OVER (PARTITION BY `string_col`), TRUE) AS `bfcol_2` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a breaking change? The coalesce means that it's never return NULL, only True, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its breaking in terms of expression semantics changing such that null guarding isn't part of window op definition. However, its not breaking in that behavior from user-facing api surfaces is preserved (these snapshot tests build trees at low-level so do change in behavior)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these low-level tests are in the "change detector test" category and provide more harm than value. https://testing.googleblog.com/2015/01/testing-on-toilet-change-detector-tests.html
Let's remove them. I think there is utility in golden SQL tests, but only when it reflects the end user APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value of this approach depends on how we define the end-user APIs and the change detector test. If we assume users only care about query results, the 'golden SQL' might seem unnecessary. However, user feedback indicates otherwise; users frequently inspect the generated SQL via job IDs. Consequently, query readiness and performance are significant priorities. The golden SQL demonstrates that window operation performance improves without additional NULL checking. Although the final results remain the same, this performance optimization is meaningful to the end user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a textbook example of a change detector test.
Trevor extracted logic out of an op and put it in a rewriter. Such changes shouldn't affect our golden SQL tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The golden SQL demonstrates that window operation performance improves without additional NULL checking
The semantics changed, not just the performance. These ops no longer ever return NULL. That is a big red flag for a refactor PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, there are no cases from public APIs to apply a window_spec to the AllOps. Hence, we are going to raise an error so that golden tests shouldn't test the unused codes. #2285
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕