
not(isNull x) leads to very odd and partially unreachable IL code that performs 5x slower than redefining not yourself #9433

Closed
@abelbraaksma

Description

I noticed this while doing timings for #9390, where sometimes using not gave an unexpected performance degradation. This boiled down to not sometimes leading to very unexpected IL.

Repro steps

Take the following code snippet:

let useNotIsNull (str:string) =
    if not(isNull str) then str.Length
    else 0

Because not is coded to return a single IL instruction with ceq, and isNull, while coded with match, also leads to basically a single ceq instruction, I expected that we'd end up with two instructions or, after optimization, a single one. However, it blows up:

    IL_0000: ldarg.0
    IL_0001: brfalse.s IL_0006

    IL_0003: ldc.i4.0
    IL_0004: br.s IL_0007

    IL_0006: ldc.i4.1

    IL_0007: brtrue.s IL_0010

    IL_0009: ldarg.0
    IL_000a: callvirt instance int32 [System.Private.CoreLib]System.String::get_Length()
    IL_000f: ret

    IL_0010: ldc.i4.0
    IL_0011: ret

This decompiles to the following C#:

public static int useNotIsNull(string str)
{
    if (str != null || 1 == 0)
    {
        return str.Length;
    }
    return 0;
}

If you were to recreate the not function as follows:

let not x = match x with true -> false | _ -> true

The same code above would now be encoded in IL as:

    IL_0000: ldarg.0
    IL_0001: brfalse.s IL_000a

    IL_0003: ldarg.0
    IL_0004: callvirt instance int32 [System.Private.CoreLib]System.String::get_Length()
    IL_0009: ret

    IL_000a: ldc.i4.0
    IL_000b: ret

And here is the real killer: if we wrap not in a function that simply calls it, the problem also disappears, regardless of whether the wrapper is marked inline (as the original not is) or not:

let justLikeNot x = not x

let useJustLikeNot (str:string) =
    if justLikeNot(isNull str) then str.Length
    else 0

Resulting IL:

    IL_0000: ldarg.0
    IL_0001: brfalse.s IL_000a

    IL_0003: ldarg.0
    IL_0004: callvirt instance int32 [System.Private.CoreLib]System.String::get_Length()
    IL_0009: ret

    IL_000a: ldc.i4.0
    IL_000b: ret

Strangely, the not function itself looks exactly the same as the justLikeNot function above:

    IL_0000: ldarg.0
    IL_0001: ldc.i4.0
    IL_0002: ceq
    IL_0004: ret

Though in one case (with isNull) it leads to strange opcodes. In most other cases, it leads to the expected folding of the ceq into a brfalse or brtrue respectively.
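To make clear that only the emitted IL differs and not the behavior, here is a minimal sketch that runs the three variants discussed above side by side. The names matchNot, useMatchNot and agreeOn are illustrative and not part of FSharp.Core; the sketch assumes the default compiler settings, where a string may be null.

```fsharp
// `matchNot`, `useMatchNot` and `agreeOn` are illustrative names, not FSharp.Core APIs.
let matchNot x = match x with true -> false | _ -> true
let justLikeNot x = not x

let useNotIsNull (str: string) =
    if not (isNull str) then str.Length else 0

let useMatchNot (str: string) =
    if matchNot (isNull str) then str.Length else 0

let useJustLikeNot (str: string) =
    if justLikeNot (isNull str) then str.Length else 0

// True when all three variants return the same value for `str`.
let agreeOn (str: string) =
    let expected = useNotIsNull str
    useMatchNot str = expected && useJustLikeNot str = expected

let inputs: string list = [ null; ""; "abc" ]
printfn "all variants agree: %b" (List.forall agreeOn inputs)
```

Since the results are identical, any difference between the variants has to come from how the compiler lowers each one to IL.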

More examples of coding this and their surprising translations can be found in this SharpLab.io snippet.

Expected behavior

Actual behavior

See above for the actual behavior. In terms of performance, the different not versions in the code all perform as expected, since they are ultimately folded into optimized x64 assembly, except for the not(isNull x) version. The notIsNull benchmark below uses not(isNull x); the others each use a different way of coding not than the default:

[Image: benchmark timings]

(These timings were made by ensuring the function returns a value and is not optimized away (hence the str.Length call), and by repeating the call 10_000x in a tight for-loop to smooth out timing noise for micro-benchmarks with BenchmarkDotNet (BDN).)
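For anyone who wants a rough check without setting up BenchmarkDotNet, the following is a minimal sketch of the same idea using System.Diagnostics.Stopwatch. It is not the harness used for the timings above: timeIt and useMatchNot are hypothetical names, and the numbers it prints are only indicative.

```fsharp
open System.Diagnostics

let useNotIsNull (str: string) =
    if not (isNull str) then str.Length else 0

// Hypothetical redefinition of `not`, as in the repro steps.
let matchNot x = match x with true -> false | _ -> true

let useMatchNot (str: string) =
    if matchNot (isNull str) then str.Length else 0

// Calls `f` 10_000 times in a tight loop and sums the results, so the
// calls cannot be discarded as dead code. Returns the sum as a checksum.
let timeIt (name: string) (f: string -> int) (input: string) =
    let mutable total = 0
    let sw = Stopwatch.StartNew()
    for _ in 1 .. 10_000 do
        total <- total + f input
    sw.Stop()
    printfn "%s: %d ticks (checksum %d)" name sw.ElapsedTicks total
    total

timeIt "not(isNull x)" useNotIsNull "hello" |> ignore
timeIt "matchNot(isNull x)" useMatchNot "hello" |> ignore
```

Passing the function as a value adds an indirect call and there is no warm-up, so this cannot show the inlining difference as cleanly as BenchmarkDotNet does; it is only meant to illustrate the loop-and-accumulate approach.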

This is ultimately caused by the final assembly, which looks as follows (note the push/pop and the extra call):

; FSharp.Perf.BenchLength.notIsNull()
       push      rdi
       push      rsi
       sub       rsp,28
       mov       ecx,[rcx+8]
       call      FSharp.Perf.Data.get(Int32)
       mov       rsi,rax
       xor       edi,edi
M00_L00:                      ; start of for-loop body
       mov       rcx,rsi
       call      FSharp.Perf.StringLength.notIsNull(System.String)
       inc       edi
       cmp       edi,2711       ; loop 10_000 times
       jl        short M00_L00
       add       rsp,28
       pop       rsi
       pop       rdi
       ret
; Total bytes of code 44
; FSharp.Perf.StringLength.notIsNull(System.String)
       test      rcx,rcx
       je        short M02_L00
       mov       eax,[rcx+8]
       ret
M02_L00:
       xor       eax,eax
       ret
; Total bytes of code 12

Compare that to using one of the not redefinitions, which, with the same code, gives:

; FSharp.Perf.BenchLength.newNot()
       sub       rsp,28
       mov       ecx,[rcx+8]
       call      FSharp.Perf.Data.get(Int32)
       xor       edx,edx
M00_L00:                      ; start of for-loop body
       test      rax,rax
       je        short M00_L01
       mov       ecx,[rax+8]
M00_L01:
       inc       edx
       cmp       edx,2711       ; loop 10_000 times
       jl        short M00_L00
       add       rsp,28
       ret
; Total bytes of code 37

That is: no push/pop of rdi and rsi and no call to a separate method inside the loop, so no new stack frame.

Known workarounds

Redefine not yourself and the problem seems to disappear.
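A minimal sketch of that workaround, assuming it is acceptable to shadow not in the affected file or module. The definition is the one from the repro steps; lengthOrZero is a hypothetical caller. Whether this actually changes the emitted IL in a given build should be verified, for example with SharpLab.

```fsharp
// Shadows FSharp.Core's `not` for everything declared below this line.
let not x = match x with true -> false | _ -> true

// Hypothetical caller: the call site is unchanged, but `not` now resolves
// to the local definition above instead of the FSharp.Core one.
let lengthOrZero (str: string) =
    if not (isNull str) then str.Length else 0

printfn "%d %d" (lengthOrZero "abc") (lengthOrZero null)
```

Shadowing keeps existing call sites untouched, which makes the workaround easy to remove again later.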

Related information

I've only tested this on the latest VS + FSC (with optimizations on, of course), but the SharpLab decompilation showed the same results.

I discussed this with @baronfel yesterday and neither of us could come up with a reasonable explanation, even more so since redefining not as itself leads to optimized code. I'm not sure why the combination not(isNull x) leads to such IL. The SharpLab.io link shows that using something other than isNull in the brackets does not lead to the same weird IL opcodes.

Metadata

Labels: Area-Library (Issues for FSharp.Core not covered elsewhere)
