-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-45830][CORE] Refactor StorageUtils#bufferCleaner
#43675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
StorageUtils#bufferCleaner
to avoid directly using classes under the sun
packageStorageUtils#bufferCleaner
to avoid directly using classes under the sun
package
Test first |
buffer: DirectBuffer => cleanerMethod.invoke(unsafe, buffer) | ||
private val bufferCleaner: ByteBuffer => Unit = { | ||
val cleanerClass = Utils.classForName("jdk.internal.ref.Cleaner") | ||
val directBufferClass = Utils.classForName("sun.nio.ch.DirectBuffer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class can also be java.nio.DirectByteBuffer
due to MappedByteBuffer
has only two subclasses: java.nio.DirectByteBuffer
and java.nio.DirectByteBufferR
, and java.nio.DirectByteBufferR
also inherits from java.nio.DirectByteBuffer
. which one is better?
StorageUtils#bufferCleaner
to avoid directly using classes under the sun
packageStorageUtils#bufferCleaner
StorageUtils#bufferCleaner
StorageUtils#bufferCleaner
to avoid directly using classes under the sun
package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think there is a chance of performance difference, @LuciferYang ?
Although the official recommendation is for the former rather than reflection, I have previously done some mircabenchmarks on Java 8 and there is no significant performance difference between the two. If necessary, I can write some more cases to compare them in Java 17. |
convert to draft first, let me check the performance |
MethodHandles.privateLookupIn(cleanerClass, MethodHandles.lookup()) | ||
val cleanMethod: MethodHandle = | ||
cleanerLookup.findVirtual(cleanerClass, "clean", MethodType.methodType(classOf[Unit])) | ||
buffer: ByteBuffer => cleanMethod.invoke(cleanerMethod.invoke(buffer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new method is indeed slower than the old method because there are two calls to the method handle. Let me think about what to do. @dongjoon-hyun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me test several cpu models more and feedback later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote a micro benchmark to test the initialization and invocation of bufferCleaner, observing their performance under different CPU models, including AMD EPYC 7763
, E5-2673
, 8171M
, and E5-2673
. The test data reflects the following facts:
-
Using the methodhandle method to initialize bufferCleaner is 60%~70% slower than base implementation
-
The performance of using the methodhandle to call is basically the same as that of base implementation, with a single call delay difference of ~1ns
-
The current pr implementation has a performance advantage of more than 10 times over base implementation in initializing bufferCleaner
-
The performance of the current pr implementation call is at least 30% faster than that of base implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, sun.misc.Unsafe
is exported, so it can still be used directly now
StorageUtils#bufferCleaner
to avoid directly using classes under the sun
packageStorageUtils#bufferCleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @LuciferYang .
Merged to master for Apache Spark 4.0.0.
BTW, cc @rednaxelafx , too. |
Thanks @dongjoon-hyun ~ |
What changes were proposed in this pull request?
This pr refactor
StorageUtils#bufferCleaner
as follows:bufferCleaner
fromDirectBuffer => Unit
toByteBuffer => Unit
unsafe.invokeCleaner
instead of reflecting callsWhy are the changes needed?
-release
instead of the-target
for compilation. However, due tosun.nio.ch
module was not exported, this can lead to the issue of class invisibility during Java cross compilation, such as building or testing using Java 21 with-release:17
After this pr, the following compilation errors will not occur again when build core module using Java 21 with-release:17
:unsafe.invokeCleaner
provides better performance, compared to reflection calls, it is at least 30% fasterDoes this PR introduce any user-facing change?
No
How was this patch tested?
-release:17
, no longer compilation failure logs aboveNote: There is still an issue with other classes being invisible, which needs to be fixed in follow up
Was this patch authored or co-authored using generative AI tooling?
No