Combine shuffle(fneg(x),fneg(y)) -> fneg(shuffle(x,y)) #45631

@RKSimon

Bugzilla Link: 46286
Version: trunk
OS: Windows NT
CC: @rotateright

Extended Description

https://godbolt.org/z/cHgY_S

For cases such as:

define <4 x float> @fneg_concat_v2f32(<2 x float> %a0, <2 x float> %a1) {
  %1 = fneg <2 x float> %a0
  %2 = fneg <2 x float> %a1
  %3 = shufflevector <2 x float> %1, <2 x float> %2, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  ret <4 x float> %3
}
define <4 x float> @fneg_concat_v4f32(<4 x float> %a0, <4 x float> %a1) {
  %1 = fneg <4 x float> %a0
  %2 = fneg <4 x float> %a1
  %3 = shufflevector <4 x float> %1, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  ret <4 x float> %3
}

we are almost certainly better off moving the fneg after the shuffle, so that a single fneg operates on the shuffled result instead of one fneg on each input:

define <4 x float> @concat_fneg_v2f32(<2 x float> %a0, <2 x float> %a1) {
  %1 = shufflevector <2 x float> %a0, <2 x float> %a1, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %2 = fneg <4 x float> %1
  ret <4 x float> %2
}
define <4 x float> @concat_fneg_v4f32(<4 x float> %a0, <4 x float> %a1) {
  %1 = shufflevector <4 x float> %a0, <4 x float> %a1, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
  %2 = fneg <4 x float> %1
  ret <4 x float> %2
}
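
As a sanity check on the fold, here is a minimal Python model of the two forms. It is only a sketch: `shufflevector` and `fneg` below are hypothetical helpers that mimic the IR semantics on plain lists (fneg as a per-lane sign flip), not any LLVM API, and lanes are compared by bit pattern so that -0.0 and 0.0 are distinguished.

```python
import struct


def shufflevector(a, b, mask):
    """Model shufflevector: each mask index selects from the concatenation a ++ b."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def fneg(v):
    """Model a vector fneg as a per-lane sign flip."""
    return [-x for x in v]


def bits(v):
    """Bit pattern of each lane, so that -0.0 and 0.0 compare as different."""
    return [struct.pack("<d", x) for x in v]


def fneg_then_shuffle(a, b, mask):
    # Original form: shuffle(fneg(x), fneg(y))
    return shufflevector(fneg(a), fneg(b), mask)


def shuffle_then_fneg(a, b, mask):
    # Proposed form: fneg(shuffle(x, y))
    return fneg(shufflevector(a, b, mask))


# Mirrors fneg_concat_v2f32: concatenate two 2-lane vectors.
a2, b2 = [1.0, -2.0], [0.0, 4.5]
concat_mask = [0, 1, 2, 3]
assert bits(fneg_then_shuffle(a2, b2, concat_mask)) == bits(shuffle_then_fneg(a2, b2, concat_mask))

# Mirrors fneg_concat_v4f32: take the low halves of two 4-lane vectors.
a4, b4 = [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]
low_halves_mask = [0, 1, 4, 5]
assert bits(fneg_then_shuffle(a4, b4, low_halves_mask)) == bits(shuffle_then_fneg(a4, b4, low_halves_mask))
```

Because fneg acts on each lane independently, every output lane sees the same negated input lane in both forms; only the number and width of the fneg operations differ.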

Binops would probably benefit in some cases as well (perhaps when one operand is a constant?).
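
One possible reading of the constant-operand case, sketched in Python under assumptions: the rewrite `shuffle(x op C0, y op C1) -> shuffle(x, y) op shuffle(C0, C1)` is my interpretation of the remark, not something the report spells out, and `shufflevector` and `binop` are hypothetical list-based helpers rather than LLVM APIs. The idea is that the shuffle of the two constants could be folded at compile time, leaving one shuffle and one wider binop.

```python
import operator


def shufflevector(a, b, mask):
    """Model shufflevector: each mask index selects from the concatenation a ++ b."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def binop(op, lhs, rhs):
    """Model a vector binary operator applied lane by lane."""
    return [op(l, r) for l, r in zip(lhs, rhs)]


def binop_then_shuffle(op, x, c0, y, c1, mask):
    # Original form: shuffle(x op C0, y op C1)
    return shufflevector(binop(op, x, c0), binop(op, y, c1), mask)


def shuffle_then_binop(op, x, c0, y, c1, mask):
    # Rewritten form: shuffle(x, y) op shuffle(C0, C1).
    # With constant C0 and C1, this second shuffle would be a compile-time constant.
    folded_constant = shufflevector(c0, c1, mask)
    return binop(op, shufflevector(x, y, mask), folded_constant)


x, y = [1.5, 2.0], [3.0, -1.0]
c0, c1 = [2.0, 4.0], [0.5, 8.0]
concat_mask = [0, 1, 2, 3]

before = binop_then_shuffle(operator.mul, x, c0, y, c1, concat_mask)
after = shuffle_then_binop(operator.mul, x, c0, y, c1, concat_mask)
assert before == after
```

Each output lane pairs the same variable lane with the same constant lane in both forms, so the results agree lane for lane; whether the rewrite is profitable is a separate question.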

One issue that VectorCombine might encounter, though, is that we fail to get costs for most length-changing shuffles, so the 'concat_vectors' shuffle pattern returns an 'Unknown' cost.

Labels: bugzilla, llvm:instcombine
