[GlobalISel] [AArch64] Passthrough i1 stack arguments are unnecessarily loaded and stored in tail calls #142847

@guy-david

As a continuation of #126735, i1 stack arguments are truncated twice during lowering, which makes it difficult to analyze whether they remain unchanged across tail calls.

With the above patch and for the following IR with two stack arguments %i and %j:

declare void @func_i1(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f, i32 %g, i32 %h, i32 %i, i1 %j)

define void @wrapper_func_i1(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f, i32 %g, i32 %h, i32 %i, i1 %j) {
  tail call void @func_i1(i32 %a, i32 %b, i32 %c, i32 %d, i32 %e, i32 %f, i32 %g, i32 %h, i32 %i, i1 %j)
  ret void
}

The resulting MIR is:

Frame Objects:
  fi#-4: size=1, align=8, fixed, at location [SP+8]
  fi#-3: size=4, align=16, fixed, at location [SP]
  fi#-2: size=1, align=8, fixed, at location [SP+8]
  fi#-1: size=4, align=16, fixed, at location [SP]
Function Live Ins: $w0, $w1, $w2, $w3, $w4, $w5, $w6, $w7

bb.0:
  successors: %bb.1(0x80000000); %bb.1(100.00%)
  liveins: $w0, $w1, $w2, $w3, $w4, $w5, $w6, $w7
  %0:_(s32) = COPY $w0
  %1:_(s32) = COPY $w1
  %2:_(s32) = COPY $w2
  %3:_(s32) = COPY $w3
  %4:_(s32) = COPY $w4
  %5:_(s32) = COPY $w5
  %6:_(s32) = COPY $w6
  %7:_(s32) = COPY $w7
  %11:_(p0) = G_FRAME_INDEX %fixed-stack.3
  %8:_(s32) = G_LOAD %11:_(p0) :: (invariant load (s32) from %fixed-stack.3, align 16)
  %13:_(p0) = G_FRAME_INDEX %fixed-stack.2
  %12:_(s32) = G_LOAD %13:_(p0) :: (invariant load (s8) from %fixed-stack.2, align 8)
  %10:_(s8) = G_TRUNC %12:_(s32)
  %14:_(s8) = G_ASSERT_ZEXT %10:_, 1
  %9:_(s1) = G_TRUNC %14:_(s8)

bb.1 (%ir-block.0):
; predecessors: %bb.0

  %15:_(s8) = G_ZEXT %9:_(s1)
  %16:_(p0) = G_FRAME_INDEX %fixed-stack.1
  %17:_(p0) = G_FRAME_INDEX %fixed-stack.0
  G_STORE %15:_(s8), %17:_(p0) :: (store (s8) into %fixed-stack.0, align 8)
  $w0 = COPY %0:_(s32)
  $w1 = COPY %1:_(s32)
  $w2 = COPY %2:_(s32)
  $w3 = COPY %3:_(s32)
  $w4 = COPY %4:_(s32)
  $w5 = COPY %5:_(s32)
  $w6 = COPY %6:_(s32)
  $w7 = COPY %7:_(s32)
  TCRETURNdi @func_i1, 0, <regmask $fp $lr $wzr $wzr_hi $xzr $b8 $b9 $b10 $b11 $b12 $b13 $b14 $b15 $d8 $d9 $d10 $d11 $d12 $d13 $d14 $d15 $h8 $h9 $h10 $h11 $h12 $h13 $h14 $h15 $s8 $s9 $s10 $s11 and 92 more...>, implicit $sp, implicit $w0, implicit $w1, implicit $w2, implicit $w3, implicit $w4, implicit $w5, implicit $w6, implicit $w7

i1 values have special lowering, and the resulting chain of instructions (store <- zext <- truncate <- assert_zext <- truncate <- load) isn't trivial to analyze in assignValueToAddress. We can either simplify the handling of i1 so that it looks more like SDAG, or incorporate a KnownBits analysis, which is overkill but should do the trick. However, KnownBits would require us to distinguish between i1 and i8 arguments, and I don't see an easy way to do so.
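
To make the difficulty concrete, here is a rough Python toy model of the bit-level reasoning needed to prove that the store in bb.1 writes back exactly what the load in bb.0 read. It is only a sketch: the function name `store_is_passthrough`, the `(opcode, width)` tuples, and the symbolic bit encoding are all hypothetical and are not LLVM APIs or the KnownBits implementation. It assumes the load is any-extending (only the low 8 bits come from memory) and that the load and store slots have already been matched separately (both are the s8 slot at [SP+8] in the dump above).

```python
# Toy model only: none of these names or structures exist in LLVM.
# Each bit of a value is tracked symbolically as one of:
#   ("mem", i)  equal to bit i of the incoming stack slot
#   "0"         known to be zero
#   "?"         unknown

def store_is_passthrough(chain, mem_bits):
    """Return True if storing the chain's result rewrites the slot unchanged.

    chain lists (opcode, n) pairs in order from the load to the store's
    value operand; n is the result width, or the asserted width for
    G_ASSERT_ZEXT. mem_bits is the slot size in bits.
    """
    bits = []          # symbolic value, least significant bit first
    mem_zero = set()   # slot bits proven zero by a G_ASSERT_ZEXT
    for opcode, n in chain:
        if opcode == "G_LOAD":
            # Assumed any-extending: bits above mem_bits are unknown.
            bits = [("mem", i) if i < mem_bits else "?" for i in range(n)]
        elif opcode == "G_TRUNC":
            bits = bits[:n]
        elif opcode == "G_ZEXT":
            bits = bits + ["0"] * (n - len(bits))
        elif opcode == "G_ASSERT_ZEXT":
            # Bits at positions >= n are asserted zero; record which slot
            # bits that assertion covers before overwriting them.
            for b in bits[n:]:
                if isinstance(b, tuple):
                    mem_zero.add(b[1])
            bits = bits[:n] + ["0"] * (len(bits) - n)
        else:
            return False  # unknown opcode: give up conservatively
    if len(bits) != mem_bits:
        return False
    # Every stored bit must be the original slot bit, or a zero that the
    # slot bit is also known to be.
    return all(b == ("mem", i) or (b == "0" and i in mem_zero)
               for i, b in enumerate(bits))


# The chain from the MIR above: %12 -> %10 -> %14 -> %9 -> %15.
issue_chain = [("G_LOAD", 32), ("G_TRUNC", 8), ("G_ASSERT_ZEXT", 1),
               ("G_TRUNC", 1), ("G_ZEXT", 8)]
print(store_is_passthrough(issue_chain, 8))  # True
```

In this model the proof only goes through because of the G_ASSERT_ZEXT: without it, the truncate to s1 followed by the zext to s8 would clear bits 1 to 7 of the byte, and the store could not be dropped. A plain i8 argument would take a different chain, which is where the i1 versus i8 distinction mentioned above comes in.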
