[SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines #32044

HyukjinKwon · 2021-04-03T12:15:49Z

What changes were proposed in this pull request?

#32015 added a way to run benchmarks much more easily in the same GitHub Actions build. This PR updates the benchmark results using that mechanism.

Note that, based on my observations, GitHub Actions appears to use four types of CPU:

Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz

From my quick research, they seem to perform roughly similarly.

I couldn't find enough information about the Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz, but its performance seems roughly similar given the numbers.

So this shouldn't be a big deal, especially given that this approach is much easier, encourages contributors to run benchmarks more often, and guarantees the same number of cores and the same memory with the same software.

Why are the changes needed?

To have a consistent baseline for the benchmarks.

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

It was generated from:

SparkQA · 2021-04-03T13:39:46Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41459/

SparkQA · 2021-04-03T14:06:56Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41459/

SparkQA · 2021-04-03T15:11:55Z

Test build #136883 has finished for PR 32044 at commit 33f2ebe.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

MaxGekk · 2021-04-03T14:42:18Z

sql/core/benchmarks/WideSchemaBenchmark-results.txt

-2500 select expressions                             211            214           4          0.0   210927791.0       0.0X
+1 select expressions                                  1              2           0          0.0     1296117.0       1.0X
+100 select expressions                                9             11           1          0.0     8808690.0       0.1X
+2500 select expressions                             422            426           5          0.0   421632363.0       0.0X


Is this a 2x regression?
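To make the comparison concrete, here is a minimal sketch that computes the slowdown factor from the removed and added `2500 select expressions` rows quoted above. It assumes the six trailing columns follow the usual layout of these result files, with the first of them being the best time in milliseconds. The helper names `best_time_ms` and `slowdown` are hypothetical and not part of Spark.

```python
OLD_ROW = "-2500 select expressions                             211            214           4          0.0   210927791.0       0.0X"
NEW_ROW = "+2500 select expressions                             422            426           5          0.0   421632363.0       0.0X"


def best_time_ms(row: str) -> float:
    """Extract the best time from one result row of a diff.

    Assumption: each row ends with six columns, the first of which is
    the best time in milliseconds. Counting from the end avoids
    confusing a numeric case name such as "2500" with a measurement.
    """
    fields = row.lstrip("+-").split()
    return float(fields[-6])


def slowdown(old_row: str, new_row: str) -> float:
    """Return how many times slower the new row is than the old one."""
    return best_time_ms(new_row) / best_time_ms(old_row)


print(f"{slowdown(OLD_ROW, NEW_ROW):.1f}x slower")
```

A factor above 1 means the new run took longer. Because the old and new numbers come from different machines, the factor alone does not show that the code itself regressed.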

MaxGekk · 2021-04-03T14:58:47Z

sql/core/benchmarks/CSVBenchmark-results.txt

+Select 1000 columns                               96330          99161         NaN          0.0       96329.7       1.0X
+Select 100 columns                                41414          42672        1556          0.0       41414.1       2.3X
+Select one column                                 35365          36113         662          0.0       35365.4       2.7X
+count()                                           18845          18867          26          0.1       18845.0       5.1X


A 2x regression.
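As a reading aid for the rows quoted above, the sketch below reproduces the last column from the first numeric column. It assumes that the first row is the baseline (as its `1.0X` suggests) and that each relative value is the baseline's best time divided by that row's best time. The function name `relative_column` is hypothetical, not a Spark API.

```python
ROWS = [
    "+Select 1000 columns                               96330          99161         NaN          0.0       96329.7       1.0X",
    "+Select 100 columns                                41414          42672        1556          0.0       41414.1       2.3X",
    "+Select one column                                 35365          36113         662          0.0       35365.4       2.7X",
    "+count()                                           18845          18867          26          0.1       18845.0       5.1X",
]


def relative_column(rows: list[str]) -> list[str]:
    """Recompute the trailing 'N.NX' column from the best times.

    Assumption: the sixth field from the end is the best time in
    milliseconds, and the first row is the baseline.
    """
    best = [float(row.lstrip("+-").split()[-6]) for row in rows]
    baseline = best[0]
    return [f"{baseline / t:.1f}X" for t in best]


for row, rel in zip(ROWS, relative_column(ROWS)):
    # Print the recomputed value next to the value recorded in the file.
    print(rel, "recorded:", row.split()[-1])
```

The relative column only compares cases within one run, so judging a regression across runs still requires the corresponding rows from the previous results file.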

MaxGekk · 2021-04-03T20:01:55Z

+1, LGTM. The PR updates only benchmark results. The failed GitHub Actions checks are not related to this PR. Merging to master.
Thank you, @HyukjinKwon, and @wangyum and @dongjoon-hyun for your reviews.

LuciferYang · 2021-10-27T09:13:22Z

@HyukjinKwon Can we use this approach to generate the benchmark results with Java 17?

On the other hand, I found that some benchmarks, such as UpdateFieldsBenchmark and CharVarcharBenchmark, do not have corresponding Java 11 result files. Is this expected?

LuciferYang · 2021-10-27T09:14:22Z

@HyukjinKwon Can we use this approach to generate the benchmark results with Java 17?

Let me study #32015 first. Should all new benchmark results be generated this way?

HyukjinKwon · 2021-10-27T10:30:23Z

Yes, they all should generate the files for JDK 11. If they don't, it's a bug.

Yes, we should have a separate set of these benchmark result files for JDK 17.

LuciferYang · 2021-10-27T11:15:27Z

Thank you for your explanation

Update benchmark results to the ones created by GitHub Actions machines

33f2ebe

github-actions bot added AVRO CORE SQL labels Apr 3, 2021

HyukjinKwon requested review from MaxGekk, dongjoon-hyun and wangyum April 3, 2021 12:16

wangyum approved these changes Apr 3, 2021

MaxGekk approved these changes Apr 3, 2021

dongjoon-hyun approved these changes Apr 3, 2021

MaxGekk closed this in ebf01ec Apr 3, 2021

HyukjinKwon mentioned this pull request May 3, 2021

[SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory #32394

Closed

HyukjinKwon deleted the SPARK-34950 branch January 4, 2022 00:54


6 participants
