-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ESQL: Rework isNull
#118101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Rework isNull
#118101
Conversation
This reworks `Expressions#isNull` so it only matches the `null` literal. This doesn't super change the way ESQL works because we already rewrite things that `fold` into `null` into the `null` literal. It's just that, now, `isNull` won't return `true` for things that *fold* to null - only things that have *already* folded to null. This is important because `fold` can be quite expensive so we're better off keeping the results of it when possible. Which is what the constant folding rules *do*.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
So my statement:
Is how I believe we work. But I think folks who own various bits of code should double check me. |
| // P.S. this could be done inside the Aggregate but this place better centralizes the logic | ||
| if (e instanceof AggregateFunction agg) { | ||
| if (Expressions.isNull(agg.filter())) { | ||
| if (Expressions.isGuaranteedNull(agg.filter())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure the changes to this file are fine because, if we don't fold to null immediately, we will fold constants, which will yield literal null, which will then fold null here.
| @Override | ||
| public Object fold() { | ||
| if (Expressions.isNull(value) || list.stream().allMatch(Expressions::isNull)) { | ||
| if (Expressions.isGuaranteedNull(value) || list.stream().allMatch(Expressions::isGuaranteedNull)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about this one. I feel like it's tricky to be sure this is right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to try removing this line from ESQL entirely to see what happens.... Learning!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one test fails if I zap this - testFoldNullListInToLocalRelation - which I don't really have the background to know. I don't think we should remove this, but I want to make sure this is still running properly in this way.
| return filter.child(); | ||
| } | ||
| if (FALSE.equals(condition) || Expressions.isNull(condition)) { | ||
| if (FALSE.equals(condition) || Expressions.isGuaranteedNull(condition)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure these are fine because we'll have rewritten all null valued expressions into literal nulls by the time we make it here. But I'd love is someone could confirm that we have a test for the plan bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And that someone could be me....
|
I am a bit surprised that the PR passes :-))). I'll have a deeper look, though. There are many usecases that we do not sort-of support now that are related to aggregations - see #112392. That PR is a daring attempt at fixing the missing bits, but aggregations folding (as a concept) needs the happen otherwise that PR introduces way too many workarounds just to make some use cases work. I will close it and create a sort of meta issue for constant folding instead. Leaving that PR aside, the change here with |
Yeah! I hate that. But I think the alternative is throwing away a bunch of work after folding. Sometimes allocating a lot of stuff. |
costin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
astefan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given some more thought to this. Conceptually, isNull was doing too much I think by actually temporarily folding the value and checking it. Whereas the operation of the expression being folded (and actually being replaced as null in the entire tree) was happening elsewhere (Literal,of(e) via ConstantFolding or right here in FoldNull).
bpintea
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering what prompted the change. Repeatedly folding (and throwing away the results) is wasteful, but not that many rules get to run before isNull that do fold.
But at this scale, maybe placing ConstantFolding (and maybe PartiallyFoldCase) ahead of FoldNull in the optimiser would also make sense -- maybe push these three to be the first rules?
| * into a {@link Literal} containing {@code null} which will return | ||
| * {@code true} from here. | ||
| */ | ||
| public static boolean isGuaranteedNull(Expression e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: isGuaranteedNull makes me wonder what is "null" and more relevant when is it not guaranteed :)
I'd find something like isNullTypeOrLiteral clearer, but just a nitty nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I think I'm going to keep it because everything feels not so great. Maybe valueIsGuaranteedNull or something? I'm not sure that makes it more descriptive.
|
Seeing approvals, I'm going to merge on this so I can build on it. If anyone can come up with any fun extra test cases, let's make followup. And
I think that's a great idea! Should I just try it myself? |
💚 Backport successful
|
This reworks `Expressions#isNull` so it only matches the `null` literal. This doesn't super change the way ESQL works because we already rewrite things that `fold` into `null` into the `null` literal. It's just that, now, `isNull` won't return `true` for things that *fold* to null - only things that have *already* folded to null. This is important because `fold` can be quite expensive so we're better off keeping the results of it when possible. Which is what the constant folding rules *do*.
This reworks `Expressions#isNull` so it only matches the `null` literal. This doesn't super change the way ESQL works because we already rewrite things that `fold` into `null` into the `null` literal. It's just that, now, `isNull` won't return `true` for things that *fold* to null - only things that have *already* folded to null. This is important because `fold` can be quite expensive so we're better off keeping the results of it when possible. Which is what the constant folding rules *do*.
This reworks
Expressions#isNullso it only matches thenullliteral. This doesn't super change the way ESQL works because we already rewrite things thatfoldintonullinto thenullliteral. It's just that, now,isNullwon't returntruefor things that fold to null - only things that have already folded to null.This is important because
foldcan be quite expensive so we're betteroff keeping the results of it when possible. Which is what the constant
folding rules do.