Skip to content

Conversation

@ulysses-you
Copy link
Contributor

@ulysses-you ulysses-you commented Dec 12, 2022

What changes were proposed in this pull request?

This pr fixs two issues when cache enable AQE

  1. Make AQE propagate the SQL metrics which is in InMemoryTableScan
    We do not propagate SQL metrics if the current execution id is not mapping to current query execution. The AdaptiveSparkPlanExec in InMemoryTableScan does not have its own query execution id. So we missed that SQL metrics.
    A simaple case is:
          ...
           |
  AdaptiveSparkPlanExec (queryexecution 0, no execution id)
           |
  InMemoryTableScanExec
           |
          ...
           |
  AdaptiveSparkPlanExec (queryexecution 1, execution id 0)
  1. Update final plan if contains InMemoryTableScan
    We only update final plan if contains subquery to avoid unnecessary SparkListenerSQLAdaptiveExecutionUpdate event, however, the AdaptiveSparkPlanExec missed updating if the final stage incude InMemoryTableScan.
    A simaple case is:
          ...
           |
  AdaptiveSparkPlanExec
           |
  InMemoryTableScanExec
           |
      ProjectExec
           |
  AdaptiveSparkPlanExec

Why are the changes needed?

Correct the plan and metrics if cache enable AQE. And make Spark UI work.

Does this PR introduce any user-facing change?

yes, after this pr, the Spark UI show the correct plan and metrics with AQE cache.

How was this patch tested?

add test and manually test

@github-actions github-actions bot added the SQL label Dec 12, 2022
@ulysses-you
Copy link
Contributor Author

@erenavsarogullari
Copy link
Member

Thanks @ulysses-you for this fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this test check metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added one more assert with SparkListenerSQLAdaptiveSQLMetricUpdates

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AdaptiveSparkPlan nodes are being injected for following use-cases:
1- Parent Query level as root node of SparkPlan,
2- AQE under InMemoryRelation,
3- SubQueries.
Does it makes sense to have UT also including both subQuery + AQE under IMR cases?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HashAggregateExec nodes metrics were coming as empty before:
https://issues.apache.org/jira/secure/attachment/13052914/DAG%20when%20AQE%3DON%20and%20AQECachedDFSupport%3DON%20without%20fix.png

Does it make sense to verify also HashAggregateExec metric(s) (coming before InMemoryRelation nodes) to support robustness? For example: HashAggregateExec - number of output rows does not change per test run.

@erenavsarogullari
Copy link
Member

Hi @ulysses-you and @cloud-fan,
Hope you are fine.
Do we plan to merge this patch? Thanks.

@ulysses-you
Copy link
Contributor Author

close this in favor of #39624

@ulysses-you ulysses-you deleted the aqe-cache2 branch March 13, 2023 05:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants