-
Notifications
You must be signed in to change notification settings - Fork 18k
cmd/compile: turn nilcheckelim into branchelim #12628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @brtzsnr |
Current ( func f_ssa(e interface{}) int {
if e.(int) < 0 {
return -1
}
return e.(int) * 2
} -> TEXT ·f_ssa(SB), $32-24 // s2.go:3
// MOVQ (TLS), CX (stack growth prologue)
// CMPQ SP, 16(CX)
// JLS 124
// SUBQ $32, SP
// MOVQ BP, 24(SP) (BP save)
// LEAQ 24(SP), BP (BP init)
// FUNCDATA $0, gclocals·522734ad228da40e2256ba19cf2bc72c(SB) (args)
FUNCDATA $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) (locals)
LEAQ type.int(SB), AX // s2.go:4
MOVQ e+0(FP), CX
CMPQ AX, CX
JNE pc96
MOVQ e+8(FP), AX
MOVQ (AX), AX
TESTQ AX, AX
JGE pc78
MOVQ $-1, _r1+16(FP) // s2.go:5
// MOVQ 24(SP), BP (BP restore)
// ADDQ $32, SP (SP restore)
RET
pc78:
SHLQ $1, AX // s2.go:7 <-- NOTE no second check
MOVQ AX, _r1+16(FP)
// MOVQ 24(SP), BP (BP restore)
// ADDQ $32, SP (SP restore)
RET
pc96:
// MOVQ CX, (SP) (stack growth) // s2.go:4
// MOVQ AX, 8(SP)
// LEAQ type.interface {}(SB), AX
// MOVQ AX, 16(SP)
// PCDATA $0, $1
// CALL runtime.panicdottypeE(SB)
// UNDEF
// NOP
// PCDATA $0, $-1 // s2.go:3
// CALL runtime.morestack_noctxt(SB)
// JMP 0 i.e. the typecheck for func f_ssa(i int) int {
var x int
if i < 3 {
x += 1
}
if i < 3 {
x += 2
}
return x
} -> TEXT ·f_ssa(SB), $0-16 // s3.go:3
NO_LOCAL_POINTERS
// FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB) (args)
// FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
MOVQ i+0(FP), AX // s3.go:5
CMPQ AX, $3
SETLT AL
MOVBLZX AL, AX // s3.go:11
JGE pc21 // s3.go:8
ADDQ $2, AX // s3.go:9
pc21:
MOVQ AX, _r1+8(FP) // s3.go:11
RET only 1 branch but another branch sits implicitly in TEXT ·f_ssa(SB), $0-16 // s3.go:3
NO_LOCAL_POINTERS
// FUNCDATA $0, gclocals·f207267fbf96a0178e8758c6e3e0ce28(SB) (args)
// FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
MOVQ i+0(FP), AX // s3.go:5
CMPQ AX, $3
JGE pc30
MOVQ $5, AX // s3.go:6
pc18:
JGE pc24 // s3.go:8
ADDQ $2, AX // s3.go:9
pc24:
MOVQ AX, _r1+8(FP) // s3.go:11
RET
pc30:
MOVQ $0, AX // s3.go:3
JMP pc18 // s3.go:5 i.e. still 2 branches. |
Let me add that the second case is actually not silly and shows up in practice when the first check is in another function that gets inlined. For example consider BTree lookup code - it needs to compare keys while doing the lookup and take decisions where to go next depending on comparision result. E.g. github.com/cznic/b allows to define such comparision function dynamically: https://github.com/cznic/b/blob/aaaa43c9/btree.go#L52 but some of us generate code from it which calls comparision function directly: https://lab.nexedi.com/kirr/neoppod/blob/7c87537a/t/neo/storage/fs1/fsb/gen-fsbtree#L23 func oidCmp(a, b zodb.Oid) int { // zodb.Oid is uint64
if a < b {
return -1
} else if a > b {
return +1
} else {
return 0
}
} (https://lab.nexedi.com/kirr/neoppod/blob/7c87537a/t/neo/storage/fs1/fsb/cmp.go) The comparision is then used e.g. like this: switch cmp := oidCmp(k, mk); {
case cmp > 0:
l = m + 1
case cmp == 0:
return m, true
default:
h = m - 1
} (https://lab.nexedi.com/kirr/neoppod/blob/7c87537a/t/neo/storage/fs1/fsb/fsbtree.go#L360) so inside oidCmp This BTree lookup code shows at top of profiles for some workloads... A simplified check for this issue is below: package xxx
func zzz2(x, y int) int {
var cmp int
if x < y {
cmp = -1
} else {
cmp = +1
}
if cmp < 0 {
return x - y
} else {
return x + y
}
} which gets compiled to TEXT ·zzz2(SB), $0-24 // switch2.go:3
NO_LOCAL_POINTERS
// FUNCDATA $0, gclocals·54241e171da8af6ae173d69da0236748(SB) (args)
// FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
MOVQ x+0(FP), AX // switch2.go:5
MOVQ y+8(FP), CX
CMPQ AX, CX
JGE pc45
MOVQ $-1, DX // switch2.go:6 <-- NOTE cmp = -1
pc22:
TESTQ DX, DX // switch2.go:11 <--
JGE pc36 // <-- NOTE if cmp < 0
SUBQ CX, AX // switch2.go:12
MOVQ AX, _r2+16(FP)
RET
pc36:
ADDQ CX, AX // switch2.go:14
MOVQ AX, _r2+16(FP)
RET
pc45:
MOVQ $1, DX // switch2.go:8 <-- NOTE cmp = +1
JMP pc22 // switch2.go:5 This seems to be very related to #17862 Thanks beforehand, |
Too late for 1.9. |
nilcheckelim is effective and fast, but only eliminates nil checks. There are other kinds of duplicate branches that it would be nice to also elide.
For example, consider code like:
Each of the type assertions will generate a branch to check e's dynamic type, but the subsequent checks are unnecessary, and could be eliminated with very similar logic to nilcheckelim. Code like this, in which an interface is type asserted repeatedly, shows up in the compiler, so it is not theoretical.
I'd expect the souped-up optimization to also handle (silly) code like:
(The compiled code should only have one branch.)
/cc @randall77 @tzneal @dr2chase
The text was updated successfully, but these errors were encountered: