Skip to content

Conversation

@ulysses-you
Copy link
Contributor

backport #37250 into branch-3.1

What changes were proposed in this pull request?

Correct the EliminateSorts follows:

  • If the upper sort is global then we can remove the global or local sort recursively.
  • If the upper sort is local then we can only remove the local sort recursively.

Why are the changes needed?

If a global sort below locol sort, we should not remove the global sort becuase the output partitioning can be affected.

This issue is going to worse since we pull out the V1 Write sort to logcial side.

Does this PR introduce any user-facing change?

yes, bug fix

How was this patch tested?

add test

…cal sort

 Correct the `EliminateSorts` follows:

- If the upper sort is global then we can remove the global or local sort recursively.
- If the upper sort is local then we can only remove the local sort recursively.

If a global sort below locol sort, we should not remove the global sort becuase the output partitioning can be affected.

This issue is going to worse since we pull out the V1 Write sort to logcial side.

yes, bug fix

add test

Closes apache#37250 from ulysses-you/remove-sort.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@ulysses-you
Copy link
Contributor Author

cc @cloud-fan ready for branch-3.1

@cloud-fan
Copy link
Contributor

to avoid any perf regression, can we include the global sort -> repartition optimization?

### What changes were proposed in this pull request?

Optimize Global sort to RepartitionByExpression, for example:
```
Sort local             Sort local
  Sort global    =>      RepartitionByExpression
```

### Why are the changes needed?

If a global sort below a local sort, the only meaningful thing is it's distribution. So this pr optimizes that global sort to RepartitionByExpression to save a local sort.

### Does this PR introduce _any_ user-facing change?

no, only improve performance

### How was this patch tested?

add test

Closes apache#37330 from ulysses-you/optimize-sort.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@ulysses-you
Copy link
Contributor Author

@cloud-fan done

@cloud-fan
Copy link
Contributor

cloud-fan commented Aug 3, 2022

pycodestyle checks failed:

This PR does not change any python code, thus the failure is unrelated. Thanks, merging to 3.1!

cloud-fan pushed a commit that referenced this pull request Aug 3, 2022
…he local sort

backport #37250 into branch-3.1

### What changes were proposed in this pull request?

 Correct the `EliminateSorts` follows:

- If the upper sort is global then we can remove the global or local sort recursively.
- If the upper sort is local then we can only remove the local sort recursively.

### Why are the changes needed?

If a global sort below locol sort, we should not remove the global sort becuase the output partitioning can be affected.

This issue is going to worse since we pull out the V1 Write sort to logcial side.

### Does this PR introduce _any_ user-facing change?

yes, bug fix

### How was this patch tested?

add test

Closes #37276 from ulysses-you/SPARK-39835-3.1.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan cloud-fan closed this Aug 3, 2022
@ulysses-you ulysses-you deleted the SPARK-39835-3.1 branch August 3, 2022 03:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants