[X86] Suboptimal lowering of short vectors equality check: could use scalar types instead

Motivating case: https://godbolt.org/z/rbE3TzqdP

The original test
```
define i1 @vector_version(i8* align 1 %arg, i8* align 1 %arg1, i32 %arg2) {
bb:
  %ptr1 = bitcast i8* %arg1 to <4 x i8>*
  %ptr2 = bitcast i8* %arg to <4 x i8>*
  %lhs = load <4 x i8>, <4 x i8>* %ptr1, align 1
  %rhs = load <4 x i8>, <4 x i8>* %ptr2, align 1
  %any_ne = icmp ne <4 x i8> %lhs, %rhs
  %any_ne_scalar = bitcast <4 x i1> %any_ne to i4
  %all_eq = icmp eq i4 %any_ne_scalar, 0
  ret i1 %all_eq
}
```
reads two short vector values and effectively checks that they are equal. Codegen generates vector code from it:
```
vector_version:                         # @vector_version
        vpmovzxbd       (%rsi), %xmm0           # xmm0 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
        vpmovzxbd       (%rdi), %xmm1           # xmm1 = mem[0],zero,zero,zero,mem[1],zero,zero,zero,mem[2],zero,zero,zero,mem[3],zero,zero,zero
        vpsubd  %xmm1, %xmm0, %xmm0
        vptest  %xmm0, %xmm0
        sete    %al
        retq
```
This code is semantically equivalent to its scalar counterpart
```
define i1 @scalar_version(i8* align 1 %arg, i8* align 1 %arg1, i32 %arg2) {
bb:
  %ptr1 = bitcast i8* %arg1 to i32*
  %ptr2 = bitcast i8* %arg to i32*
  %lhs = load i32, i32* %ptr1, align 1
  %rhs = load i32, i32* %ptr2, align 1
  %all_eq = icmp eq i32 %lhs, %rhs
  ret i1 %all_eq
}
```
which produces neater asm:
```
scalar_version:                         # @scalar_version
        movl    (%rsi), %eax
        cmpl    (%rdi), %eax
        sete    %al
        retq
```
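To illustrate why the two functions are equivalent, here is a small C sketch (not part of the original report). The first function mirrors the lane-wise compare and 4-bit mask test of `@vector_version`; the second mirrors the single 32-bit compare of `@scalar_version`. The names `vector_style_eq`, `scalar_style_eq`, and `check_agree` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Mirrors @vector_version: compare lane by lane, pack the four
   "not equal" bits into a 4-bit mask, then test the mask against zero. */
static bool vector_style_eq(const uint8_t *p, const uint8_t *q) {
    unsigned any_ne = 0;
    for (int i = 0; i < 4; ++i)
        any_ne |= (unsigned)(p[i] != q[i]) << i;
    return any_ne == 0;
}

/* Mirrors @scalar_version: load both operands as one 32-bit value.
   memcpy keeps the access valid for pointers with alignment 1. */
static bool scalar_style_eq(const uint8_t *p, const uint8_t *q) {
    uint32_t lhs, rhs;
    memcpy(&lhs, p, sizeof lhs);
    memcpy(&rhs, q, sizeof rhs);
    return lhs == rhs;
}

/* Hypothetical self-check: both predicates agree on a pair of inputs. */
static void check_agree(const uint8_t *p, const uint8_t *q) {
    assert(vector_style_eq(p, q) == scalar_style_eq(p, q));
}
```

All four lanes are equal exactly when the 32 bits they occupy are equal, so the mask is zero if and only if the two 32-bit words match, regardless of byte order.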
Unfortunately, we cannot use the RM form of the vector sub here, as stated in https://github.com/llvm/llvm-project/issues/53416, but it looks like we could avoid using vector registers altogether.

Not sure what the proper place for this is: codegen or instcombine.

[X86] Suboptimal lowering of short vectors equality check: could use scalar types instead #53419

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[X86] Suboptimal lowering of short vectors equality check: could use scalar types instead #53419

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions