Conversation

@TrevorBergeron
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Nov 15, 2025
@TrevorBergeron TrevorBergeron marked this pull request as ready for review November 15, 2025 01:23
@TrevorBergeron TrevorBergeron requested review from a team as code owners November 15, 2025 01:23
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Nov 17, 2025
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: l Pull request size is large. labels Nov 18, 2025
- THEN NULL
- ELSE COALESCE(LOGICAL_AND(`bool_col`) OVER (PARTITION BY `string_col`), TRUE)
- END AS `bfcol_2`
+ COALESCE(LOGICAL_AND(`bool_col`) OVER (PARTITION BY `string_col`), TRUE) AS `bfcol_2`
Collaborator

Is this a breaking change? The COALESCE means that it never returns NULL, only TRUE or FALSE, right?
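To make the question concrete, here is a minimal Python sketch of the row-level semantics of the two SQL forms in the diff. It models `LOGICAL_AND` as BigQuery documents it (NULL inputs are ignored; the result is NULL when every input is NULL). The `WHEN` condition of the removed `CASE` is not visible in the snippet, so the sketch assumes it was a per-row `bool_col IS NULL` check; the function names are illustrative only.

```python
from typing import List, Optional


def logical_and(values: List[Optional[bool]]) -> Optional[bool]:
    """Model of LOGICAL_AND: ignore NULLs, return NULL if nothing is left."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    return all(non_null)


def unguarded(partition: List[Optional[bool]]) -> List[Optional[bool]]:
    """COALESCE(LOGICAL_AND(col) OVER (PARTITION BY ...), TRUE), per row."""
    agg = logical_and(partition)
    result = True if agg is None else agg
    return [result for _ in partition]


def guarded(partition: List[Optional[bool]]) -> List[Optional[bool]]:
    """ASSUMED old form: CASE WHEN col IS NULL THEN NULL ELSE <unguarded> END."""
    return [
        None if value is None else result
        for value, result in zip(partition, unguarded(partition))
    ]


partition = [True, None, True]
print(unguarded(partition))  # no NULLs in the output
print(guarded(partition))    # NULL input rows stay NULL
```

Under that assumption, the two forms differ only on rows whose input is NULL.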

Contributor Author

It's breaking in the sense that the expression semantics change: null guarding is no longer part of the window op definition. However, it's not breaking in that the behavior of user-facing API surfaces is preserved (these snapshot tests build trees at a low level, so they do show a change in behavior).
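A toy sketch of that distinction, assuming a hypothetical two-node expression tree: before the refactor the null guard would be emitted by the op's own compilation; afterwards the op compiles to the bare aggregate and a separate rewriter pass adds the guard on the user-facing path. `WindowAll`, `NullGuard`, `rewrite_null_guard`, and the `IS NULL` condition are all invented for illustration and are not the library's actual classes or SQL.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class WindowAll:
    """Hypothetical window op node: logical AND of `column` per partition."""
    column: str
    partition_by: str


@dataclass(frozen=True)
class NullGuard:
    """Hypothetical wrapper node: yield NULL wherever `column` is NULL."""
    column: str
    inner: WindowAll


def compile_expr(node: Union[WindowAll, NullGuard]) -> str:
    """Compile a node to SQL; the op itself no longer knows about the guard."""
    if isinstance(node, NullGuard):
        # ASSUMPTION: the guard condition is a per-row IS NULL check.
        return (
            f"CASE WHEN `{node.column}` IS NULL THEN NULL "
            f"ELSE {compile_expr(node.inner)} END"
        )
    return (
        f"COALESCE(LOGICAL_AND(`{node.column}`) "
        f"OVER (PARTITION BY `{node.partition_by}`), TRUE)"
    )


def rewrite_null_guard(node: WindowAll) -> NullGuard:
    """Rewriter pass that the user-facing path applies before compiling."""
    return NullGuard(node.column, node)


low_level = compile_expr(WindowAll("bool_col", "string_col"))
user_facing = compile_expr(rewrite_null_guard(WindowAll("bool_col", "string_col")))
print(low_level)
print(user_facing)
```

A snapshot test that compiles `WindowAll` directly sees the unguarded SQL, while a test driven through the rewriter still sees the guarded form.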

Collaborator

I think these low-level tests fall into the "change detector test" category and do more harm than good. https://testing.googleblog.com/2015/01/testing-on-toilet-change-detector-tests.html

Let's remove them. I think there is utility in golden SQL tests, but only when it reflects the end user APIs.

CC @chelsea-lin @jialuoo

Contributor

The value of this approach depends on how we define the end-user APIs and the change detector test. If we assume users only care about query results, the 'golden SQL' might seem unnecessary. However, user feedback indicates otherwise; users frequently inspect the generated SQL via job IDs. Consequently, query readiness and performance are significant priorities. The golden SQL demonstrates that window operation performance improves without additional NULL checking. Although the final results remain the same, this performance optimization is meaningful to the end user.

Collaborator

This is a textbook example of a change detector test.

Trevor extracted logic out of an op and put it in a rewriter. Such changes shouldn't affect our golden SQL tests.

Collaborator

@tswast tswast Nov 20, 2025

> The golden SQL demonstrates that window operation performance improves without additional NULL checking

The semantics changed, not just the performance. These ops no longer ever return NULL. That is a big red flag for a refactor PR.

Contributor

@chelsea-lin chelsea-lin Nov 20, 2025

Yeah, no public API applies a window_spec to the AllOps. Hence, we are going to raise an error so that the golden tests don't exercise unused code. #2285
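One way such a guard could look, as a rough sketch only: the compile function rejects a window specification outright, so no golden test can pin SQL for a path that public APIs never reach. The function name, its parameters, the exception type, and the unwindowed SQL are assumptions made for this example; the actual change is in #2285.

```python
from typing import Optional


def compile_all_op(column: str, window_partition: Optional[str] = None) -> str:
    """Hypothetical compiler for an 'all' aggregation."""
    if window_partition is not None:
        # ASSUMPTION: the exception type is illustrative, not taken from #2285.
        raise NotImplementedError(
            "'all' with a window spec is not reachable from public APIs"
        )
    # ASSUMPTION: unwindowed form mirrors the COALESCE(..., TRUE) in the diff.
    return f"COALESCE(LOGICAL_AND(`{column}`), TRUE)"


print(compile_all_op("bool_col"))
```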

@TrevorBergeron TrevorBergeron merged commit 6e73d77 into main Nov 19, 2025
22 of 24 checks passed
@TrevorBergeron TrevorBergeron deleted the window_null_skip_refactor branch November 19, 2025 22:03
Labels

api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: m Pull request size is medium.

4 participants