Ineffectual bitwise or with constant emitted for mask operand of vperm(b|w|d|q|ps|pd)

Same issue as https://github.com/llvm/llvm-project/issues/106256, but it also happens for the (AVX2/AVX512) permute[x]var intrinsics, while the PR https://github.com/llvm/llvm-project/pull/106377 seems to fix it only for (v)pshufb specifically.

Godbolt examples: https://godbolt.org/z/MsTcx7qYc

The vector permute intrinsics ignore all bits except the ones that match the required index size, e.g.:
- vpermb only uses the low 4, 5, or 6 bits of each mask byte element for 128-, 256-, and 512-bit vectors respectively
- vpermw only uses the low 3, 4, or 5 bits of each 16-bit element in the mask
- etc.
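
To make the bit counts above concrete, here is a minimal scalar sketch in plain C (no intrinsics, so it runs anywhere). The helper names `index_bits`, `model_vpermd256`, and `or_is_ineffectual` are made up for illustration and are not part of LLVM or any intrinsics header; the model assumes the documented behaviour of 256-bit vpermd (`_mm256_permutevar8x32_epi32`), which reads only the low 3 bits of each index.

```c
#include <assert.h>
#include <stdint.h>

/* Number of low index bits a single-source variable permute reads:
   log2(number of elements in the vector). Hypothetical helper. */
static unsigned index_bits(unsigned vector_bits, unsigned element_bits) {
    assert(element_bits != 0 && vector_bits % element_bits == 0);
    unsigned lanes = vector_bits / element_bits;
    unsigned bits = 0;
    while ((1u << bits) < lanes)
        ++bits;
    return bits;
}

/* Scalar model of 256-bit vpermd: dst[i] = src[idx[i] & 7]. */
static void model_vpermd256(uint32_t dst[8], const uint32_t src[8],
                            const uint32_t idx[8]) {
    for (int i = 0; i < 8; ++i)
        dst[i] = src[idx[i] & 7u];
}

/* Returns 1 if OR-ing `junk` into every index leaves the permute
   result unchanged for a fixed sample input, 0 otherwise. */
static int or_is_ineffectual(uint32_t junk) {
    const uint32_t src[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    const uint32_t idx[8] = {7, 6, 5, 4, 3, 2, 1, 0};
    uint32_t ored[8], plain[8], with_or[8];

    for (int i = 0; i < 8; ++i)
        ored[i] = idx[i] | junk;

    model_vpermd256(plain, src, idx);
    model_vpermd256(with_or, src, ored);

    for (int i = 0; i < 8; ++i)
        if (plain[i] != with_or[i])
            return 0;
    return 1;
}
```

In this model, `or_is_ineffectual(0xFFFFFFF8u)` reports no change because the constant only touches bits the permute ignores, whereas a constant that sets any of the low 3 bits (e.g. `0x4`) does change the result and must be kept.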

The OR operations that only touch these ignored bits should be optimized out.

This probably applies to vpermt2 (e.g. _mm512_permutex2var_epi16) as well, with 1 more bit used since those instructions select from two concatenated vectors.
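
As a rough illustration of the two-source case, the following scalar sketch models `_mm512_permutex2var_epi16` under the assumption (taken from the intrinsic's documented pseudocode) that bits 4:0 of each index pick the element and bit 5 picks the source vector, so 6 bits matter in total. The names `model_vpermt2w512` and `or_is_ineffectual_2src` are hypothetical helpers, not existing APIs.

```c
#include <stdint.h>

enum { LANES = 32 }; /* 512 bits / 16-bit elements */

/* Scalar model: dst[i] = (idx[i] bit 5 set) ? b[idx[i] & 31] : a[idx[i] & 31]. */
static void model_vpermt2w512(uint16_t dst[LANES], const uint16_t a[LANES],
                              const uint16_t idx[LANES],
                              const uint16_t b[LANES]) {
    for (int i = 0; i < LANES; ++i) {
        unsigned off = idx[i] & 31u;
        dst[i] = (idx[i] & 32u) ? b[off] : a[off];
    }
}

/* Returns 1 if OR-ing `junk` into every index leaves the result
   unchanged for a fixed sample input, 0 otherwise. */
static int or_is_ineffectual_2src(uint16_t junk) {
    uint16_t a[LANES], b[LANES], idx[LANES], ored[LANES];
    uint16_t plain[LANES], with_or[LANES];

    for (int i = 0; i < LANES; ++i) {
        a[i] = (uint16_t)i;            /* elements of the first source  */
        b[i] = (uint16_t)(1000 + i);   /* elements of the second source */
        idx[i] = (uint16_t)((i * 3) & 63); /* mixes both sources */
        ored[i] = (uint16_t)(idx[i] | junk);
    }

    model_vpermt2w512(plain, a, idx, b);
    model_vpermt2w512(with_or, a, ored, b);

    for (int i = 0; i < LANES; ++i)
        if (plain[i] != with_or[i])
            return 0;
    return 1;
}
```

Here a constant such as `0xFFC0` (bits 6 and up) is ineffectual, while `0x0020` flips the source-select bit and changes the result, so any fold would need to treat bit 5 as demanded for the 512-bit 16-bit-element form.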

cc @RKSimon 
