@AngersZhuuuu
Contributor

What changes were proposed in this pull request?

CACHE TABLE with a CTE does not work, for two reasons:

  1. In the current code, the CTE in CacheTableAsSelect is inlined.
  2. CTERelationRef and CTERelationDef do not normalize the CTE id in doCanonicalize.

As a result, a later query cannot be matched against the cached plan.
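
To illustrate the second reason, here is a minimal, self-contained sketch in plain Scala with no Spark dependency. The `Plan`, `CteDef`, `CteRef`, `WithCtes`, and `CteCanonicalizer` names are hypothetical stand-ins, not Spark's actual classes or the code in this PR. The sketch assumes only that CTE ids are allocated afresh each time a query is analyzed, so two plans built from the same SQL differ in those ids until they are remapped to positional indices.

```scala
// Toy model only: every type below is a hypothetical stand-in, not a Spark class.
sealed trait Plan
case class Scan(table: String) extends Plan
case class CteRef(cteId: Long) extends Plan                     // reference to a CTE definition
case class UnionAll(left: Plan, right: Plan) extends Plan
case class CteDef(id: Long, child: Plan)                        // definition carrying a per-analysis id
case class WithCtes(defs: Seq[CteDef], body: Plan) extends Plan // may be nested in a definition or body

object CteCanonicalizer {
  // Collect definition ids in a deterministic, structure-driven order (nested blocks included).
  private def collectDefIds(plan: Plan): Seq[Long] = plan match {
    case WithCtes(defs, body) =>
      defs.flatMap(d => d.id +: collectDefIds(d.child)) ++ collectDefIds(body)
    case UnionAll(l, r) => collectDefIds(l) ++ collectDefIds(r)
    case _              => Seq.empty
  }

  // Rewrite definition ids and references through the given id mapping.
  private def rewrite(plan: Plan, index: Map[Long, Long]): Plan = plan match {
    case CteRef(id)     => CteRef(index.getOrElse(id, id))
    case UnionAll(l, r) => UnionAll(rewrite(l, index), rewrite(r, index))
    case WithCtes(defs, body) =>
      WithCtes(
        defs.map(d => CteDef(index.getOrElse(d.id, d.id), rewrite(d.child, index))),
        rewrite(body, index))
    case s: Scan => s
  }

  // Map each id to its position so the result no longer depends on id allocation.
  def canonicalize(plan: Plan): Plan = {
    val index = collectDefIds(plan).zipWithIndex.map { case (id, i) => id -> i.toLong }.toMap
    rewrite(plan, index)
  }
}
```

In this toy model, two plans that differ only in their allocated ids compare unequal as written but become equal after `CteCanonicalizer.canonicalize`, which is the property a cache lookup would rely on.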

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

Yes. CACHE TABLE with a CTE works after this PR.

How was this patch tested?

Added UT

Was this patch authored or co-authored using generative AI tooling?

No

@AngersZhuuuu
Contributor Author

ping @cloud-fan

@github-actions github-actions bot added the SQL label Jan 17, 2024
@AngersZhuuuu AngersZhuuuu changed the title [SPARK-46741][SQL] Cache Table with CET won't work [SPARK-46741][SQL] Cache Table with CTE won't work Jan 17, 2024
@AngersZhuuuu
Contributor Author

How about the current version? @cloud-fan

_.containsAnyPattern(CTE, PLAN_EXPRESSION)) {
case ref: CTERelationRef =>
ref.copy(cteId = defIndex(ref.cteId).toLong)

Contributor

can we make sure there is no nested WithCTE?

Contributor Author

> can we make sure there is no nested WithCTE?

From the CTESubstitution code and its PR (#37751), there won't be a nested WithCTE. cc @maryannxue

Contributor

@cloud-fan cloud-fan Jan 19, 2024

Shall we add an assert to guarantee this assumption?

Contributor Author

Done

Contributor Author

@cloud-fan I found a nested WithCTE case and added it to cte.sql. I also changed the method here to support nested CTEs. Please take another look.

@AngersZhuuuu
Contributor Author

ping @cloud-fan @yaooqinn

@HyukjinKwon
Member

What was the behaviour before? It would be great to show the result before and after.

@AngersZhuuuu
Contributor Author

> what was behaviour before? Would be great to show the result before/after

For this query in cache.sql:

EXPLAIN EXTENDED SELECT * FROM cache_nested_cte_table

Before this PR, the query plan does not match the cached table cache_nested_cte_table, so the query is executed again. After this PR, it matches the InMemoryRelation.
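
The before/after difference can be sketched with a toy cache in plain Scala. `ToyPlan` and `ToyCache` are hypothetical names, not Spark's cache manager API, and the sketch assumes that the cache lookup compares a normalized form of the plan and that the only difference between the cached plan and the new query's plan is the CTE id assigned during analysis.

```scala
// Toy model only: hypothetical names, not Spark's cache manager API.
case class ToyPlan(query: String, cteId: Long)

final class ToyCache(normalizeIds: Boolean) {
  private val entries = scala.collection.mutable.Map.empty[ToyPlan, String]

  // With normalization on, the per-analysis CTE id is erased from the lookup key.
  private def key(plan: ToyPlan): ToyPlan =
    if (normalizeIds) plan.copy(cteId = 0L) else plan

  def cache(plan: ToyPlan, data: String): Unit = entries.update(key(plan), data)

  def lookup(plan: ToyPlan): Option[String] = entries.get(key(plan))
}
```

With `normalizeIds = false`, a plan cached under one id is not found when the same query is looked up under another id, which mirrors the "executes again" behaviour. With `normalizeIds = true`, the lookup finds the cached entry.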

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Oct 12, 2024
@github-actions github-actions bot closed this Oct 13, 2024
cloud-fan pushed a commit that referenced this pull request Dec 18, 2025
### What changes were proposed in this pull request?
Reopen #44767
CACHE TABLE with a CTE does not work, for two reasons:
  1. In the current code, the CTE in CacheTableAsSelect is inlined.
  2. CTERelationRef and CTERelationDef do not normalize the CTE id in doCanonicalize.
As a result, a later query cannot be matched against the cached plan.

### Why are the changes needed?
Fix Bug

### Does this PR introduce _any_ user-facing change?
Yes. CACHE TABLE with a CTE works after this PR.

For the final query added to `cache.sql`:
`EXPLAIN EXTENDED SELECT * FROM cache_nested_cte_table;`

Before this PR, the plan is as shown below and the cache is not used.
<img width="1067" height="584" alt="Screenshot 2025-12-05 11 22 05" src="https://github.com/user-attachments/assets/045df794-38e2-47d9-848e-cfc3c7525671" />

After this PR:
<img width="1279" height="824" alt="Screenshot 2025-12-05 11 32 38" src="https://github.com/user-attachments/assets/86f5ab33-67c6-44d0-b5d8-4bec51a2d5b7" />

### How was this patch tested?
Added UT

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #53333 from AngersZhuuuu/SPARK-46741.

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request Dec 18, 2025
(cherry picked from commit 8f69679)
Signed-off-by: Wenchen Fan <[email protected]>