-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Open
Labels
discussionProject DiscussionProject Discussion
Description
Describe the bug
Referenced discussion: https://the-asf.slack.com/archives/C01QUFS30TD/p1745875862723149
Given this table:
> create table d1 (ul_node_id string);
It appears that sorts are being removed from inner expressions. For example, here is a query with a sort expression:
> explain SELECT "ul_node_id", encode(ul_node_id, 'base64') FROM d1 ORDER BY encode(ul_node_id, 'base64') ASC NULLS LAST;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ SortExec │ |
| | │ -------------------- │ |
| | │ encode(d1.ul_node_id,Utf8(│ |
| | │ "base64"))@1 ASC NULLS │ |
| | │ LAST │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ encode(d1.ul_node_id,Utf8(│ |
| | │ "base64")): │ |
| | │ encode(ul_node_id, base64)│ |
| | │ │ |
| | │ ul_node_id: │ |
| | │ ul_node_id │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 0 │ |
| | │ format: memory │ |
| | │ rows: 0 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
If I wrap it in a no-op CTE the SortExec
stage disappears:
> explain WITH t0 AS (SELECT "ul_node_id", encode(ul_node_id, 'base64') FROM d1 ORDER BY encode(ul_node_id, 'base64') ASC NULLS LAST) SELECT * FROM t0;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ encode(d1.ul_node_id,Utf8(│ |
| | │ "base64")): │ |
| | │ encode(ul_node_id, base64)│ |
| | │ │ |
| | │ ul_node_id: │ |
| | │ ul_node_id │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 0 │ |
| | │ format: memory │ |
| | │ rows: 0 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Manually hoisting out the ORDER BY from the CTE causes it to appear in the plan:
> explain WITH t0 AS (SELECT "ul_node_id", encode(ul_node_id, 'base64') FROM d1) SELECT * from t0 ORDER BY encode(ul_node_id, 'base64') ASC NULLS LAST;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ SortExec │ |
| | │ -------------------- │ |
| | │ encode(ul_node_id@0, │ |
| | │ base64) ASC NULLS │ |
| | │ LAST │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ encode(d1.ul_node_id,Utf8(│ |
| | │ "base64")): │ |
| | │ encode(ul_node_id, base64)│ |
| | │ │ |
| | │ ul_node_id: │ |
| | │ ul_node_id │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 0 │ |
| | │ format: memory │ |
| | │ rows: 0 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
To Reproduce
No response
Expected behavior
No response
Additional context
No response
Metadata
Metadata
Assignees
Labels
discussionProject DiscussionProject Discussion