[SPARK-35266][TESTS] Fix error in BenchmarkBase.scala that occurs when creating benchmark files in non-existent directory #32394
…files in non-existent directory
if (!dir.exists()) {
  dir.mkdirs()
}
val file = new File(s"${dir}$resultFileName")
Ah, okay. The new benchmarks were added in SPARK-33882 and SPARK-35150, which was after #32015 and #32044.
I got the point. Thank you!
val file = new File(s"${prefix}benchmarks/$resultFileName")
val dir = new File(s"${prefix}benchmarks/")
if (!dir.exists()) {
  dir.mkdirs()
Can you add a println saying that the directory is going to be created? e.g.:
// scalastyle:off println
println(s"Creating ${dir.getAbsolutePath} for benchmark results.")
// scalastyle:on println
My concern is that the benchmark directory is derived from jar paths, which are flaky. It might be better to show it explicitly.
Thanks for the comment :) I added println as suggested.
ok to test
HyukjinKwon
left a comment
Looks good otherwise.
@srowen and @zhengruifeng FYI from 9244066 and 5b77ebb. I think it was perfectly fine not to include benchmark results (only the code), because it was a bit weird to upload results produced on machines with different specs. Since the recent changes in #32015 and #32044, PR authors can very easily run the benchmarks on machines with similar specs (https://spark.apache.org/developer-tools.html#github-workflow-benchmarks), so it makes more sense to include benchmark results in a PR :).
ok to test
@HyukjinKwon
Merged to master.
Thanks for your first contribution and congrats for being a contributor!
What changes were proposed in this pull request?
This PR fixes an error in BenchmarkBase.scala that occurs when creating a benchmark file in a non-existent directory.
Why are the changes needed?
When submitting a benchmark job using the org.apache.spark.benchmark.Benchmarks class with the SPARK_GENERATE_BENCHMARK_FILES=1 option, an exception is raised if the directory where the benchmark file will be generated does not exist.
For more information, please refer to SPARK-35266.
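The failure mode and the fix can be illustrated with a small standalone sketch. This is not the code in BenchmarkBase.scala: the object name BenchmarkDirSketch, the helper openResultFile, and the file name Example-results.txt are hypothetical and exist only for illustration. It assumes the result file is written through java.io.FileOutputStream, which throws when the parent directory is missing.

```scala
import java.io.{File, FileOutputStream, IOException}
import java.nio.file.Files

object BenchmarkDirSketch {
  // Hypothetical helper (not Spark's API): creates the target directory
  // when it is missing, then opens an output stream for the result file.
  def openResultFile(dir: File, resultFileName: String): FileOutputStream = {
    if (!dir.exists()) {
      // scalastyle:off println
      println(s"Creating ${dir.getAbsolutePath} for benchmark results.")
      // scalastyle:on println
      dir.mkdirs()
    }
    new FileOutputStream(new File(dir, resultFileName))
  }

  def main(args: Array[String]): Unit = {
    // Use a fresh temporary root so the "benchmarks" directory does not exist yet.
    val root = Files.createTempDirectory("benchmark-sketch").toFile
    val dir = new File(root, "benchmarks")

    // Opening the stream without creating the directory first fails.
    val failedWithoutDir =
      try {
        new FileOutputStream(new File(dir, "Example-results.txt")).close()
        false
      } catch {
        case _: IOException => true
      }
    println(s"Failed without directory: $failedWithoutDir")

    // With the directory check in place, the file can be created.
    openResultFile(dir, "Example-results.txt").close()
    println(s"Result file exists: ${new File(dir, "Example-results.txt").exists()}")
  }
}
```

Creating the directory lazily, right before the file is opened, keeps the behaviour unchanged for benchmarks whose result directory already exists.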
Does this PR introduce any user-facing change?
No
How was this patch tested?
After building Spark, manually tested with the following command:
It successfully generated the benchmark result files.
Why it is sufficient:
As illustrated in the comments in Benchmarks.scala, the command below runs all benchmarks and generates the results:
Of all the benchmarks (55 in total), only BLASBenchmark fails because of the reported issue with the current code in the master branch. Thus, it is currently sufficient to test BLASBenchmark to validate this change.