-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Speed up .nn
#15418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up .nn
#15418
Conversation
Some tests by Matt had cases where `.nn` figured prominently in the flamegraph. I optimized it so that the fast path is streamlined and inlined. In the following test code: ``` def x: String | Null = "abc" val y = x.nn ``` the new implementation is ``` 10: getstatic #20 // Field MODULE$:LTest$; 13: invokevirtual #24 // Method x:()Ljava/lang/String; 16: astore_0 17: getstatic #29 // Field scala/runtime/Scala3RunTime$.MODULE$:Lscala/runtime/Scala3RunTime$; 20: aload_0 21: invokevirtual #33 // Method scala/runtime/Scala3RunTime$.nn:(Ljava/lang/Object;)Ljava/lang/Object; 24: checkcast #35 // class java/lang/String 27: putstatic #37 // Field y:Ljava/lang/String; ``` The previous implementation was two bytes shorter, but contained in the critical path - A call to a ScalaRuntime method, which was not inlineable by the JIT compiler due to its size, - A conversion of a cmparison to a Boolean value (not sure this matters) - A cast from the return type `Object` to the actual type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if x.asInstanceOf[Any] == null then scala.runtime.Scala3RunTime.nnFail() | ||
x.asInstanceOf[x.type & T] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you try
if x.asInstanceOf[Any] == null then scala.runtime.Scala3RunTime.nnFail() | |
x.asInstanceOf[x.type & T] | |
x.getClass() // throws NPE if x is null, no-op otherwise | |
x.asInstanceOf[x.type & T] |
I've seen javac emit dummy calls to getClass()
like this for the purpose of throwing NPEs. It might be even more efficient if the JVM recognizes the pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might lead to somewhat confusing error messages on recent JVMs where the NPE message tells you what operation caused the exception: https://openjdk.java.net/jeps/358 (would be interesting to see if modern javac still emits the dummy calls you saw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's probably a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we find out
- what error message gets emitted for a
x.getClass
call ifx
is null? - whether javac in later versions still emits the
getClass
or whether it does something else?
It would be good to do the same thing as javac since that's probably the smallest and fastest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the JVM does some better optimization for java.util.Objects.requireNonNull
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe requireNonNull
is not the best we can do since it requires a useless cast after the call. We want to do the fastest possible non-null check (which should return Unit), followed by the cast to the target type.
... until we figured out what the best code sequence is.
Going to merge now. We can think about improving it later. |
Some tests by Matt had cases where
.nn
figured prominently in the flamegraph.I optimized it so that the fast path is streamlined and inlined.
In the following test code:
the old implementation was
and the new implementation is:
The old implementation was two bytes shorter, but contained in the critical path
Object
to the actual type.