Skip to content

Miscompile of 1 << ZEXT(CTTZ(X)) #94824

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
meheff opened this issue Jun 8, 2024 · 3 comments · Fixed by #94850
Closed

Miscompile of 1 << ZEXT(CTTZ(X)) #94824

meheff opened this issue Jun 8, 2024 · 3 comments · Fixed by #94850
Assignees
Labels
llvm:SelectionDAG SelectionDAGISel as well miscompilation

Comments

@meheff
Copy link

meheff commented Jun 8, 2024

The IR below miscompiles on x86 (at least). I believe the root cause is #85066 which performs the following optimization:

1 << CTTZ(X)   =>  -X & X

I think the ZEXT confuses things such that the value X=0 is incorrectly computed.

IR:

; ModuleID = '__module'
source_filename = "__module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

define private i16 @src(i8 %_x1) {
entry:
  %_0 = call i8 @llvm.cttz.i8(i8 %_x1, i1 false)
  %_3 = zext i8 %_0 to i16
  %_4 = shl i16 1, %_3
  ret i16 %_4
}

Good llc built at 765206e

./llc.good min.ll -o -
	.text
	.file	"__module"
	.p2align	4, 0x90                         # -- Begin function src
	.type	.Lsrc,@function
.Lsrc:                                  # @src
	.cfi_startproc
# %bb.0:                                # %entry
	orl	$256, %edi                      # imm = 0x100
	rep		bsfl	%edi, %ecx
	movl	$1, %eax
                                        # kill: def $cl killed $cl killed $ecx
	shll	%cl, %eax
                                        # kill: def $ax killed $ax killed $eax
	retq
.Lfunc_end0:
	.size	.Lsrc, .Lfunc_end0-.Lsrc
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits

Bad llc built at 6f2c610

This produces 0 for input 0, result should be 0x100.

./llc.bad min.ll -o -
	.text
	.file	"__module"
	.p2align	4, 0x90                         # -- Begin function src
	.type	.Lsrc,@function
.Lsrc:                                  # @src
	.cfi_startproc
# %bb.0:                                # %entry
	movl	%edi, %eax
	negb	%al
	andb	%dil, %al
	movzbl	%al, %eax
                                        # kill: def $ax killed $ax killed $eax
	retq
.Lfunc_end0:
	.size	.Lsrc, .Lfunc_end0-.Lsrc
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@progbits
@llvmbot
Copy link
Member

llvmbot commented Jun 8, 2024

@llvm/issue-subscribers-backend-x86

Author: None (meheff)

The IR below miscompiles on x86 (at least). I believe the root cause is https://github.com//pull/85066 which performs the following optimization:
1 &lt;&lt; CTTZ(X)   =&gt;  -X &amp; X

I think the ZEXT confuses things such that the value X=0 is incorrectly computed.

IR:

; ModuleID = '__module'
source_filename = "__module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

define private i16 @<!-- -->src(i8 %_x1) {
entry:
  %_0 = call i8 @<!-- -->llvm.cttz.i8(i8 %_x1, i1 false)
  %_3 = zext i8 %_0 to i16
  %_4 = shl i16 1, %_3
  ret i16 %_4
}

Good llc built at 765206e

./llc.good min.ll -o -
	.text
	.file	"__module"
	.p2align	4, 0x90                         # -- Begin function src
	.type	.Lsrc,@<!-- -->function
.Lsrc:                                  # @<!-- -->src
	.cfi_startproc
# %bb.0:                                # %entry
	orl	$256, %edi                      # imm = 0x100
	rep		bsfl	%edi, %ecx
	movl	$1, %eax
                                        # kill: def $cl killed $cl killed $ecx
	shll	%cl, %eax
                                        # kill: def $ax killed $ax killed $eax
	retq
.Lfunc_end0:
	.size	.Lsrc, .Lfunc_end0-.Lsrc
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@<!-- -->progbits

Bad llc built at 6f2c610

This produces 0 for input 0, result should be 0x100.

./llc.bad min.ll -o -
	.text
	.file	"__module"
	.p2align	4, 0x90                         # -- Begin function src
	.type	.Lsrc,@<!-- -->function
.Lsrc:                                  # @<!-- -->src
	.cfi_startproc
# %bb.0:                                # %entry
	movl	%edi, %eax
	negb	%al
	andb	%dil, %al
	movzbl	%al, %eax
                                        # kill: def $ax killed $ax killed $eax
	retq
.Lfunc_end0:
	.size	.Lsrc, .Lfunc_end0-.Lsrc
	.cfi_endproc
                                        # -- End function
	.section	".note.GNU-stack","",@<!-- -->progbits

@topperc
Copy link
Collaborator

topperc commented Jun 8, 2024

CC @dtcxzyw

@topperc
Copy link
Collaborator

topperc commented Jun 8, 2024

The patch description for a previous bug fix says

    + When VT is smaller than ShiftVT, it is safe to use trunc.
    + When VT is larger than ShiftVT, it is safe to use zext iff
    `is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
    also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

But the code seems to check that VT is larger than ShiftVT.

  if (((N1.getOpcode() == ISD::CTTZ &&                                           
        VT.getScalarSizeInBits() >= ShiftVT.getScalarSizeInBits()) ||            
       N1.getOpcode() == ISD::CTTZ_ZERO_UNDEF) && 

@dtcxzyw dtcxzyw self-assigned this Jun 8, 2024
dtcxzyw added a commit that referenced this issue Jun 8, 2024
The pr description in #94008 mismatches with the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

Closes #94824.
@EugeneZelenko EugeneZelenko added llvm:SelectionDAG SelectionDAGISel as well and removed backend:X86 labels Jun 8, 2024
nekoshirro pushed a commit to nekoshirro/Alchemist-LLVM that referenced this issue Jun 9, 2024
The pr description in llvm#94008 mismatches with the code.
> + When VT is smaller than ShiftVT, it is safe to use trunc.
> + When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
    proofs.

Closes llvm#94824.

Signed-off-by: Hafidz Muzakky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:SelectionDAG SelectionDAGISel as well miscompilation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants