Defining 0 for clz does not optimize #10564

@llvmbot

Bugzilla Link 10192
Version trunk
OS Windows NT
Reporter LLVM Bugzilla Contributor
CC @asl,@chandlerc,@lattner,@64

Extended Description

Howdy... GCC leaves __builtin_clz undefined for an input of zero, so in Clang I end up writing code like:

return 31 - (value == 0 ? 32 : __builtin_clz(value));

You would think the 'value == 0 ? 32 : ' check would get optimized out, since the CLZ instruction on ARM is defined to return 32 for an input of 0. However, that does not appear to be the case. On Thumb-2 I get the following output for the above snippet:

00000000 <uint32_log2_floor>:
   0: 2120        movs   r1, #32
   2: 2800        cmp    r0, #0
   4: bf18        it     ne
   6: fab0 f180   clzne  r1, r0
   a: f1c1 001f   rsb    r0, r1, #31   ; 0x1f
   e: 4770        bx     lr

It would be nice if LLVM could recognize this situation, where the explicit zero check produces exactly the value the CPU instruction already defines for that input, and drop the check.
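For anyone trying to reproduce this, here is a minimal self-contained sketch of the function. The name is taken from the symbol in the disassembly; the signature (a uint32_t argument and an int return) is an assumption, since only the return statement is quoted in the report.

```c
#include <stdint.h>

/* Floor of log2(value); yields -1 for value == 0.
 * Name taken from the disassembly symbol; the signature is assumed. */
int uint32_log2_floor(uint32_t value)
{
    /* __builtin_clz(0) is undefined, so zero is handled explicitly.
     * The guard returns 32, the same value ARM's CLZ instruction
     * produces for a zero input. */
    return 31 - (value == 0 ? 32 : __builtin_clz(value));
}
```

If the zero check were folded away on ARM, the body would presumably reduce to an unconditional clz followed by the existing rsb, without the movs/cmp/it sequence.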

Thanks

James
