Question: What is the correct interpretation of LaneBitmask? #109797
@llvm/pr-subscribers-llvm-regalloc @llvm/pr-subscribers-backend-aarch64

Author: Sander de Smalen (sdesmalen-arm)

Changes

While enabling sub-reg liveness for AArch64, we ran into a register allocator issue, which leads us to ask the question: what is the correct interpretation of the bits in LaneBitmask?

(1) The documentation of LaneBitmask suggests that if A == B then A overlaps with B, but suggests nothing that guarantees that A fully overlaps with B.
(2) The code pointed out in this PR assumes that if A == B, then A fully overlaps with B.

AArch64 defines 16-bit H0 as being the larger register of 8-bit B0; it does not define the top bits of H0 as an artificial register (e.g. B0_HI).

In the reproducer the register allocator has to insert a spill. Because 'bsub' (8-bit), 'hsub' (16-bit), 'ssub' (32-bit), etc. all have the same bitmask, it picks a register class for the COPY instruction using the first subreg index that matches. This results in a copy of a 32-bit value, rather than the (desired) full 128-bit value.

If H0 were defined as B0 and B0_HI (and similarly for all other registers), then the following condition would evaluate to true only for the largest subreg index:

  // Early exit if we found a perfect match.
  if (SubRegMask == LaneMask) {
    BestIdx = Idx;
    break;
  }

There are several ways to fix this issue;
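To make the ambiguity concrete, here is a minimal standalone C++ sketch; the lane-mask values and variable names in it are hypothetical (they are not TableGen output) and only illustrate that, when bsub, hsub and ssub carry identical masks, the "perfect match" equality test cannot tell them apart:

  // Hypothetical lane-mask values (not generated by TableGen), just to show
  // why an equality test cannot distinguish an 8-bit sub-register from a
  // 32-bit one when no artificial B0_HI register is defined.
  #include <cassert>
  #include <cstdint>

  int main() {
    const uint64_t BsubMask = 0x1; // lanes of B0 within Q0 (hypothetical)
    const uint64_t HsubMask = 0x1; // H0 adds no lanes of its own: no B0_HI
    const uint64_t SsubMask = 0x1; // likewise for S0
    const uint64_t LaneMask = 0x1; // lanes that actually need to be preserved
    // All three sub-register indices look like a "perfect match", so the
    // first (narrowest) one wins, even if the spilled value is wider.
    assert(BsubMask == LaneMask && HsubMask == LaneMask && SsubMask == LaneMask);
    return 0;
  }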
Can someone with more knowledge of the register allocator provide guidance? Thank you!

Full diff: https://github.com/llvm/llvm-project/pull/109797.diff

3 Files Affected:
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index ac9a3d6f0d1a60..e62cdf7765c4f1 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -521,26 +521,62 @@ bool TargetRegisterInfo::getCoveringSubRegIndexes(
SmallVector<unsigned, 8> PossibleIndexes;
unsigned BestIdx = 0;
unsigned BestCover = 0;
-
+ unsigned BestSize = 0;
for (unsigned Idx = 1, E = getNumSubRegIndices(); Idx < E; ++Idx) {
// Is this index even compatible with the given class?
if (getSubClassWithSubReg(RC, Idx) != RC)
continue;
LaneBitmask SubRegMask = getSubRegIndexLaneMask(Idx);
+
+ // The code below (now disabled) is not correct when the lane mask does not
+ // cover the full sub-register. For example, when 16-bit H0 has only one
+ // subreg B0 of 8-bits, and the high lanes are not defined in TableGen (e.g.
+ // by defining B0 and some artificial B0_HI register), then this function
+ // may return either `bsub` or `hsub`, whereas in this case we'd want it to
+ // return the *largest* (in bitwidth) sub-register index. It is better to
+ // keep looking for the biggest one, but this is rather expensive.
+ //
+ // Before thinking of how to solve this, I first want to ask the
+ // question: What is the meaning of 'LaneBitmask'?
+ //
+ // According to LaneBitmask.h:
+ //
+ // "The individual bits in a lane mask can't be assigned
+ // any specific meaning. They can be used to check if two
+ // sub-register indices overlap."
+ //
+ // However, in the code here the assumption is made that if
+ // 'SubRegMask == LaneMask', then the subregister must overlap *fully*.
+ // This is not the case for the way AArch64 registers are defined
+ // at the moment.
+ //
+ // Which interpretation is correct?
+ //
+ // A) If the meaning is 'if the bits are equal, the sub-registers overlap but
+ // not necessarily fully', then we should fix the code in this function (in
+ // a better way than just disabling it).
+ //
+ // B) If the meaning is 'if the bits are equal, the sub-registers overlap
+ // fully', then we can define the high bits with an artificial register.
+
+#if 0
// Early exit if we found a perfect match.
if (SubRegMask == LaneMask) {
BestIdx = Idx;
break;
}
+#endif
// The index must not cover any lanes outside \p LaneMask.
if ((SubRegMask & ~LaneMask).any())
continue;
unsigned PopCount = SubRegMask.getNumLanes();
+ unsigned Size = getSubRegIdxSize(Idx);
PossibleIndexes.push_back(Idx);
- if (PopCount > BestCover) {
+ if ((!BestCover || PopCount >= BestCover) && Size >= BestSize) {
BestCover = PopCount;
+ BestSize = Size;
BestIdx = Idx;
}
}
@@ -559,6 +595,8 @@ bool TargetRegisterInfo::getCoveringSubRegIndexes(
int BestCover = std::numeric_limits<int>::min();
for (unsigned Idx : PossibleIndexes) {
LaneBitmask SubRegMask = getSubRegIndexLaneMask(Idx);
+ // FIXME: similar issue as above.
+
// Early exit if we found a perfect match.
if (SubRegMask == LanesLeft) {
BestIdx = Idx;
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.h b/llvm/lib/Target/AArch64/AArch64Subtarget.h
index accfb49c6fbe3a..338afb0cf1962a 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -152,6 +152,7 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
const Triple &getTargetTriple() const { return TargetTriple; }
bool enableMachineScheduler() const override { return true; }
bool enablePostRAScheduler() const override { return usePostRAScheduler(); }
+ bool enableSubRegLiveness() const override { return true; }
bool enableMachinePipeliner() const override;
bool useDFAforSMS() const override { return false; }
diff --git a/llvm/test/CodeGen/AArch64/regalloc-subregidx-reproducer.ll b/llvm/test/CodeGen/AArch64/regalloc-subregidx-reproducer.ll
new file mode 100644
index 00000000000000..c7e96a68779562
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/regalloc-subregidx-reproducer.ll
@@ -0,0 +1,75 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s | FileCheck %s
+
+target triple = "aarch64"
+
+define void @reproducer(ptr %ptr, ptr %ptr2, <8 x i32> %vec.arg) {
+; CHECK-LABEL: reproducer:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: sub sp, sp, #112
+; CHECK-NEXT: stp x29, x30, [sp, #80] // 16-byte Folded Spill
+; CHECK-NEXT: stp x20, x19, [sp, #96] // 16-byte Folded Spill
+; CHECK-NEXT: .cfi_def_cfa_offset 112
+; CHECK-NEXT: .cfi_offset w19, -8
+; CHECK-NEXT: .cfi_offset w20, -16
+; CHECK-NEXT: .cfi_offset w30, -24
+; CHECK-NEXT: .cfi_offset w29, -32
+; CHECK-NEXT: mov x8, xzr
+; CHECK-NEXT: stp q0, q1, [sp, #16] // 32-byte Folded Spill
+; CHECK-NEXT: mov x19, x1
+; CHECK-NEXT: ld2 { v0.4s, v1.4s }, [x8]
+; CHECK-NEXT: add x8, sp, #48
+; CHECK-NEXT: mov x20, x0
+; CHECK-NEXT: st1 { v0.2d, v1.2d }, [x8] // 32-byte Folded Spill
+; CHECK-NEXT: bl bar
+; CHECK-NEXT: ldp q1, q0, [sp, #16] // 32-byte Folded Reload
+; CHECK-NEXT: mov w8, #1 // =0x1
+; CHECK-NEXT: uzp2 v0.4s, v1.4s, v0.4s
+; CHECK-NEXT: dup v1.2d, x8
+; CHECK-NEXT: add x8, sp, #48
+; CHECK-NEXT: ld1 { v5.2d, v6.2d }, [x8] // 32-byte Folded Reload
+; CHECK-NEXT: ushll2 v2.2d, v6.4s, #1
+; CHECK-NEXT: ushll v3.2d, v6.2s, #1
+; CHECK-NEXT: ushll v4.2d, v5.2s, #0
+; CHECK-NEXT: mov v6.16b, v5.16b
+; CHECK-NEXT: ushll2 v5.2d, v0.4s, #1
+; CHECK-NEXT: ushll v0.2d, v0.2s, #1
+; CHECK-NEXT: orr v3.16b, v3.16b, v4.16b
+; CHECK-NEXT: ushll2 v6.2d, v6.4s, #0
+; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
+; CHECK-NEXT: orr v2.16b, v2.16b, v6.16b
+; CHECK-NEXT: stp q0, q3, [sp, #32] // 32-byte Folded Spill
+; CHECK-NEXT: orr v0.16b, v5.16b, v1.16b
+; CHECK-NEXT: stp q0, q2, [sp] // 32-byte Folded Spill
+; CHECK-NEXT: bl bar
+; CHECK-NEXT: ldr q1, [sp, #16] // 16-byte Folded Reload
+; CHECK-NEXT: ldr q0, [sp, #48] // 16-byte Folded Reload
+; CHECK-NEXT: ldp x29, x30, [sp, #80] // 16-byte Folded Reload
+; CHECK-NEXT: stp q0, q1, [x20]
+; CHECK-NEXT: ldr q1, [sp] // 16-byte Folded Reload
+; CHECK-NEXT: ldr q0, [sp, #32] // 16-byte Folded Reload
+; CHECK-NEXT: stp q0, q1, [x19]
+; CHECK-NEXT: ldp x20, x19, [sp, #96] // 16-byte Folded Reload
+; CHECK-NEXT: add sp, sp, #112
+; CHECK-NEXT: ret
+entry:
+ %wide.vec = load <8 x i32>, ptr null, align 4
+ call void @bar()
+ %strided.vec = shufflevector <8 x i32> %wide.vec, <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
+ %strided.vec2 = shufflevector <8 x i32> %wide.vec, <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
+ %strided.vec3 = shufflevector <8 x i32> %vec.arg, <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
+ %1 = zext <4 x i32> %strided.vec2 to <4 x i64>
+ %2 = zext <4 x i32> %strided.vec3 to <4 x i64>
+ %3 = shl <4 x i64> %1, <i64 1, i64 1, i64 1, i64 1>
+ %4 = shl <4 x i64> %2, <i64 1, i64 1, i64 1, i64 1>
+ %5 = zext <4 x i32> %strided.vec to <4 x i64>
+ %6 = or <4 x i64> %3, %5
+ %7 = or <4 x i64> %4, <i64 1, i64 1, i64 1, i64 1>
+ call void @bar()
+ store <4 x i64> %6, ptr %ptr, align 8
+ store <4 x i64> %7, ptr %ptr2, align 8
+ ret void
+}
+
+declare void @bar()
+
You can test this locally with the following command:

  git-clang-format --diff 3c83102f0615c7d66f6df698ca472ddbf0e9483d 0f1ed85616670fe321918f199a9e9baedc50d578 --extensions cpp,h -- llvm/lib/CodeGen/TargetRegisterInfo.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h

View the diff from clang-format here:

diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index e62cdf7765..f4a2feac7c 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -552,9 +552,10 @@ bool TargetRegisterInfo::getCoveringSubRegIndexes(
//
// Which interpretation is correct?
//
- // A) If the meaning is 'if the bits are equal, the sub-registers overlap but
- // not necessarily fully', then we should fix the code in this function (in
- // a better way than just disabling it).
+ // A) If the meaning is 'if the bits are equal, the sub-registers overlap
+ // but
+ // not necessarily fully', then we should fix the code in this function
+ // (in a better way than just disabling it).
//
// B) If the meaning is 'if the bits are equal, the sub-registers overlap
// fully', then we can define the high bits with an artificial register.
As it is today, I believe the intent is statement 1. However, I think we would be in a better place if we did require 2. We do have a number of issues whenever there are pieces of registers that aren't fully covered by a defined register, and it is harder to reason about. We probably need to require fully covering artificial register definitions to fully move the infrastructure to using register units (e.g. replacing live-in tracking with regunits instead of registers, and using regunit masks instead of regmasks).
Agreed.
This was discussed in some detail on #95926 (also see #96146). Also see #79831 (comment).
Thanks for the quick feedback!
Does the approach of using register units (instead of registers) have any impact on the existence or future use of the LaneBitmask?
I have some local patches that define registers/subreg-indices for the top bits, but then ran into an issue that the LaneBitmask ran out of bits.
No. In an ideal world you could directly map bits in the lane mask to register units.

How big are these registers? AMDGPU is sitting right at the 64-bit limit now.
Is this for the regular AArch64 integer registers? My understanding is:
So I would expect you would need subregisters for:
I.e., roughly log2(bitwidth) of them.
That's right; however, it is the register tuples that make things take up space, as the bits are replicated for each subvector in the tuple. It ends up with something like this:

and similar for other tuples, where each sub-register must be separately addressable and cannot reuse the bits (otherwise qqsub_0:bsub and qqsub_1:bsub would alias, for example). Then there are also bits to represent the indices in GPR registers, and a set of bits needed to express the (sub)tiles in a matrix register. With all of these together in the same bitmask, we run out of the 64 bits.
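As a toy illustration of the aliasing constraint (the mask values and names below are made up; only the disjointness matters), the lanes belonging to the first and second element of a QQ tuple cannot share bits:

  // Toy check with made-up lane masks: each element of a register tuple needs
  // its own copy of the sub-register lanes, which is what makes wide tuples
  // so expensive in lane-mask bits.
  #include <cassert>
  #include <cstdint>

  int main() {
    const uint64_t QQsub0_Bsub = 0x0000000000000001; // hypothetical
    const uint64_t QQsub1_Bsub = 0x0000000000010000; // hypothetical, disjoint
    assert((QQsub0_Bsub & QQsub1_Bsub) == 0 &&
           "qqsub_0:bsub and qqsub_1:bsub must not alias");
    return 0;
  }

The actual AArch64 masks are generated by TableGen; the point here is only that the bits cannot be reused across tuple elements.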
This follows on from the conversation on llvm#109797.

FWIW, I've considered several alternatives to this patch:

(1) Using APInt as the storage type rather than 'uint64_t Mask[2]'. This makes the code a bit simpler to read, but APInt by default only allocates space for a 64-bit value and otherwise dynamically allocates a larger buffer to represent the larger value. Because we know the value is always 128-bit, this extra dynamic allocation is undesirable. We also rarely need the full power of APInt since most of the tests are bitwise operations, so I made the choice to represent it as a uint64_t array instead, only moving to/from APInt when this is necessary.

Because it is inconvenient that increasing the BitWidth applies to all targets, I tried to see if there is a way to make the bitwidth dynamic, by:

(2) Making the BitWidth dynamic per target by passing it to constructors.
(3) A modification of (2) that describes 'none', 'all' and 'lane' with an enum which doesn't require a BitWidth, until doing arithmetic with an explicit bit mask (which does have a BitWidth).

Unfortunately, neither of these approaches seems feasible. For (2) that is because it would require passing the TargetRegisterInfo/MCRegisterInfo to many places where this info is not available but where a LaneBitmask value needs to be instantiated. Approach (3) leads to other issues, such as: what is the meaning of 'operator==' when one value is a mask and the other is the 'all' enum? If we let 'operator==' discard the bitwidth such that a 64-bit all-true bitmask == LaneBitmask::all() (using the 'all' enum), then we could end up in a situation where X == LaneBitmask::all() && Y == LaneBitmask::all() but `X != Y`. I considered replacing the equality operators by methods that take a RegisterInfo pointer, but the LaneBitmask struct is used in STL containers which require a plain 'operator==' or 'operator<'. We could work around that by providing custom lambdas (that call the method with the TargetInfo pointer), but this just gets increasingly hacky. Perhaps just using more bits isn't actually that bad in practice.
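To picture the storage scheme that message describes, here is a rough standalone sketch (my own approximation, using a made-up name 'WideLaneBitmask'; it is not the actual patch): a fixed 128-bit mask kept as uint64_t Mask[2], with the cheap bitwise operations that cover most lane-mask uses.

  // Sketch only: a fixed-width 128-bit lane mask stored as two 64-bit words,
  // so no dynamic allocation is needed (unlike a >64-bit APInt).
  #include <cstdint>

  struct WideLaneBitmask {
    uint64_t Mask[2]; // Mask[0] holds lanes 0-63, Mask[1] holds lanes 64-127

    bool any() const { return (Mask[0] | Mask[1]) != 0; }
    bool none() const { return !any(); }
    unsigned getNumLanes() const {
      return __builtin_popcountll(Mask[0]) + __builtin_popcountll(Mask[1]);
    }
  };

  inline WideLaneBitmask operator&(WideLaneBitmask A, WideLaneBitmask B) {
    return {{A.Mask[0] & B.Mask[0], A.Mask[1] & B.Mask[1]}};
  }
  inline WideLaneBitmask operator|(WideLaneBitmask A, WideLaneBitmask B) {
    return {{A.Mask[0] | B.Mask[0], A.Mask[1] | B.Mask[1]}};
  }
  inline bool operator==(WideLaneBitmask A, WideLaneBitmask B) {
    return A.Mask[0] == B.Mask[0] && A.Mask[1] == B.Mask[1];
  }

Widening the mask this way keeps the comparisons and and/or operations cheap and allocation-free, at the cost of doubling the storage for every target, which is the trade-off the message weighs against a dynamic per-target bitwidth.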
    //
    // A) If the meaning is 'if the bits are equal, the sub-registers overlap but
    // not necessarily fully', then we should fix the code in this function (in
    // a better way than just disabling it).
Like @arsenm said this is the correct interpretation.
Thanks for confirming! Now that this is cleared up, I'll close this PR.
This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here llvm#109797. There are several changes in this patch:

* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class, which meant that we had to update 'magic numbers' in several tests. The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by explicit unit tests.
* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fully covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples.
* TableGen: The handling of the isArtificial flag was entirely broken. Skipping out too early from some of the loops led to an incorrect internal representation of the (sub)register(index) hierarchy, and thus resulted in incorrect TableGen info.