Collaborator

@RiverDave RiverDave commented Jul 30, 2025

While working on `_mm_movepi8_mask`, I noticed that the intrinsic (and similar sign-bit-checking intrinsics that operate on 8-bit integers) was being optimized away when using -fno-signed-char, effectively replacing the cmp expression with a constant 0.

Since unsigned values can never be less than zero, the CIR lowering was directly generating a constant 0 instead of the intended comparison, eliminating the icmp instruction entirely. (I suppose we fold the comparison into a vector filled with zeros, which is then converted to our target type, a scalar mask, which in turn is 0.)
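To make the difference concrete, here is a minimal scalar sketch of the two interpretations. It is only a model, not the intrinsic or the CIR lowering: `Bytes16`, `maskSignedLt0`, and `maskUnsignedLt0` are hypothetical names introduced for illustration, and it assumes the usual two's-complement conversion from `uint8_t` to `int8_t`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model of a 16-lane byte vector (not a real intrinsic type).
using Bytes16 = std::array<std::uint8_t, 16>;

// Signed view: a lane is "< 0" exactly when its top bit is set.
// This mirrors the `icmp slt <16 x i8> ..., zeroinitializer` that OG emits.
inline std::uint16_t maskSignedLt0(const Bytes16 &v) {
  std::uint16_t mask = 0;
  for (unsigned i = 0; i < v.size(); ++i)
    if (static_cast<std::int8_t>(v[i]) < 0)
      mask |= static_cast<std::uint16_t>(1u << i);
  return mask;
}

// Unsigned view: "lane < 0" can never be true, so every mask bit stays clear
// and an optimizer is free to fold the whole result to the constant 0.
inline std::uint16_t maskUnsignedLt0(const Bytes16 &v) {
  std::uint16_t mask = 0;
  for (unsigned i = 0; i < v.size(); ++i)
    if (v[i] < 0u) // always false for an unsigned comparison
      mask |= static_cast<std::uint16_t>(1u << i);
  return mask;
}
```

With lanes 0, 3, and 15 holding bytes whose top bit is set, the signed version reports those three lanes in the mask, while the unsigned version always returns 0, which is the folded result seen in the CIR output.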

Compare the output of both pipelines when passing -fno-signed-char (CIR generates no cmp):

OG:

define dso_local i16 @test_mm_movepi8_mask(<2 x i64> %0) #0 {
...
  %8 = bitcast <2 x i64> %7 to <16 x i8>
  %9 = icmp slt <16 x i8> %8, zeroinitializer
  store i16 %10, ptr %3, align 2
  ...
  ret i16 %12
}

CIR:

define dso_local i16 @test_mm_movepi8_mask(<2 x i64> %0) #0 {
...
  %8 = bitcast <2 x i64> %7 to <16 x i8>
  store i16 0, ptr %3, align 2 ; I believe our vector cmp is folded here?
...
  ret i16 %10
}

Since integer signedness is something we can track, the behaviour CIR is enforcing makes sense; however, if we want to preserve parity with OG, I believe this patch will match that. I can close this PR if that's not applicable to this case.

Added special case detection for sign-bit extraction patterns (lt comparison with cir::ZeroAttr) to force signed comparison regardless of the element type's signedness. This preserves the semantic intent of checking sign bits rather than performing mathematical unsigned comparisons.
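As a rough illustration of that special case, the predicate choice can be sketched as below. This is a simplified model under stated assumptions, not the actual ClangIR lowering code: `CmpKind`, `VecCmp`, and `pickICmpPredicate` are hypothetical names, and `rhsIsZero` stands in for the check against `cir::ZeroAttr`.

```cpp
#include <string>

// Hypothetical, simplified stand-ins; these are not ClangIR types.
enum class CmpKind { lt, le, gt, ge, eq, ne };

struct VecCmp {
  CmpKind kind;
  bool elemIsSigned; // signedness of the vector element type
  bool rhsIsZero;    // true when the RHS is an all-zero constant
};

// Returns the LLVM icmp predicate mnemonic this sketch would select.
inline std::string pickICmpPredicate(const VecCmp &cmp) {
  // Sign-bit extraction pattern: `x < 0` is forced to a signed comparison
  // regardless of element signedness, so it is not folded away.
  if (cmp.kind == CmpKind::lt && cmp.rhsIsZero)
    return "slt";

  const bool s = cmp.elemIsSigned;
  switch (cmp.kind) {
  case CmpKind::lt: return s ? "slt" : "ult";
  case CmpKind::le: return s ? "sle" : "ule";
  case CmpKind::gt: return s ? "sgt" : "ugt";
  case CmpKind::ge: return s ? "sge" : "uge";
  case CmpKind::eq: return "eq";
  case CmpKind::ne: return "ne";
  }
  return "";
}
```

Only the `lt`-against-zero case is overridden in this sketch; every other comparison still follows the element type's signedness, so the change stays narrow.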

@RiverDave RiverDave added the "IR difference" label (a difference in ClangIR-generated LLVM IR that could complicate reusing original CodeGen tests) Jul 31, 2025
@bcardosolopes
Member

We just went through a rebase, this PR needs to be updated.

@RiverDave
Collaborator Author

> We just went through a rebase, this PR needs to be updated.

Updated this along with my other currently open PRs.

@bcardosolopes
Member

clang/test/CIR/Lowering/vec-cmp.cir seems to be failing!

@RiverDave
Collaborator Author

> clang/test/CIR/Lowering/vec-cmp.cir seems to be failing!

My apologies. The tests seem to be fixed now. I'm still waiting for your follow-up on the discussion above so I can push this PR forward if required. Thanks 😃

RiverDave added a commit that referenced this pull request Sep 8, 2025
Three things:

- Corrected comments to `getZeroInitAttr` as [we return more than only
integrals in that
function](https://github.com/llvm/clangir/blob/2ea4005fa0aa291295b19c200860b5edf9b864b3/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h#L133).
- Given that the `emitX86MaskedCompare` and `emitX86MaskedCompareResult`
helpers are pretty large, added NYI statements on paths not related to
the current set of intrinsics so that review is specific to the ones encoded.
- Added test comments related to the behavior observed coming from the
canonicalizer on: #1770
tommymcm pushed a commit to tommymcm/clangir that referenced this pull request Sep 10, 2025