Skip to content

[AArch64][PAC] Lower auth/resign into checked sequence. #79024

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ahmedbougacha
Copy link
Member

@ahmedbougacha ahmedbougacha commented Jan 22, 2024

This introduces 3 hardening modes in the authentication step of
auth/resign lowering:

  • unchecked, which uses the AUT instructions as-is
  • poison, which detects authentication failure (using an XPAC+CMP
    sequence), explicitly yielding the XPAC result rather than the
    AUT result, to avoid leaking
  • trap, which additionally traps on authentication failure,
    using BRK #0xC470 + key (IA C470, IB C471, DA C472, DB C473.)

Not all modes are necessarily useful in all contexts, and there
are more performant alternative lowerings in specific contexts
(e.g., when I/D TBI enablement is a target ABI guarantee.)

This is controlled by the ptrauth-auth-traps function attributes,
and can be overridden using -aarch64-ptrauth-auth-checks=.

This also explicitly describes the FPAC extension, to then use it
to improve the above isel to rely on HW checking.

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@ahmedbougacha ahmedbougacha force-pushed the eng/abougacha/ptrauth-auth-resign-isel branch from 20527de to fc0edb3 Compare January 22, 2024 23:43
@atrosinenko
Copy link
Contributor

Some time ago I uploaded a re-implemented version of @llvm.ptrauth.auth intrinsic lowering: #72502 (currently marked as draft due to lesser patches being factored out from it that are on review now). Is the original implementation planned for upstreaming?

@ahmedbougacha
Copy link
Member Author

Some time ago I uploaded a re-implemented version of @llvm.ptrauth.auth intrinsic lowering: #72502 (currently marked as draft due to lesser patches being factored out from it that are on review now). Is the original implementation planned for upstreaming?

Why yes indeed, all the changes in the upstreaming branch are intended to be upstreamed.

Can you describe the differences that necessitated a reimplementation? (of auth as well as blend–I see that's already been merged.)

@ahmedbougacha ahmedbougacha force-pushed the eng/abougacha/ptrauth-auth-resign-isel branch from fc0edb3 to de7afda Compare March 19, 2024 05:22
@ahmedbougacha ahmedbougacha marked this pull request as ready for review March 19, 2024 05:31
@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Ahmed Bougacha (ahmedbougacha)

Changes

This introduces 3 hardening modes in the authentication step of
auth/resign lowering:

  • unchecked, which uses the AUT instructions as-is
  • poison, which detects authentication failure (using an XPAC+CMP
    sequence), explicitly yielding the XPAC result rather than the
    AUT result, to avoid leaking
  • trap, which additionally traps on authentication failure,
    using BRK #0xC470 + key (IA C470, IB C471, DA C472, DB C473.)

Not all modes are necessarily useful in all contexts, and there
are more performant alternative lowerings in specific contexts
(e.g., when I/D TBI enablement is a target ABI guarantee.)

This is controlled by the ptrauth-auth-traps function attributes,
and can be overridden using -aarch64-ptrauth-auth-checks=.

This also explicitly describes the FPAC extension, to then use it
to improve the above isel to rely on HW checking.


Patch is 67.56 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79024.diff

10 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64.td (+3)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+270)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+102)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+32)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp (+29)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.h (+6)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+58)
  • (added) llvm/test/CodeGen/AArch64/ptrauth-fpac.ll (+374)
  • (added) llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll (+261)
  • (added) llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll (+764)
diff --git a/llvm/lib/Target/AArch64/AArch64.td b/llvm/lib/Target/AArch64/AArch64.td
index 402c7292d7f81c..f719d63e1986b8 100644
--- a/llvm/lib/Target/AArch64/AArch64.td
+++ b/llvm/lib/Target/AArch64/AArch64.td
@@ -460,6 +460,9 @@ def FeatureMatMulFP32 : SubtargetFeature<"f32mm", "HasMatMulFP32",
 def FeatureMatMulFP64 : SubtargetFeature<"f64mm", "HasMatMulFP64",
     "true", "Enable Matrix Multiply FP64 Extension (FEAT_F64MM)", [FeatureSVE]>;
 
+def FeatureFPAC : SubtargetFeature<"fpac", "HasFPAC",
+    "true", "Enable Armv8.6-A Pointer Authentication Faulting enhancement">;
+
 def FeatureXS : SubtargetFeature<"xs", "HasXS",
     "true", "Enable Armv8.7-A limited-TLB-maintenance instruction (FEAT_XS)">;
 
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 4fa719ad67cf33..6d34e16fc43401 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -67,6 +67,15 @@
 
 using namespace llvm;
 
+enum PtrauthCheckMode { Default, Unchecked, Poison, Trap };
+static cl::opt<PtrauthCheckMode> PtrauthAuthChecks(
+    "aarch64-ptrauth-auth-checks", cl::Hidden,
+    cl::values(clEnumValN(Unchecked, "none", "don't test for failure"),
+               clEnumValN(Poison, "poison", "poison on failure"),
+               clEnumValN(Trap, "trap", "trap on failure")),
+    cl::desc("Check pointer authentication auth/resign failures"),
+    cl::init(Default));
+
 #define DEBUG_TYPE "asm-printer"
 
 namespace {
@@ -124,6 +133,12 @@ class AArch64AsmPrinter : public AsmPrinter {
 
   void emitSled(const MachineInstr &MI, SledKind Kind);
 
+  // Emit the sequence for AUT or AUTPAC.
+  void emitPtrauthAuthResign(const MachineInstr *MI);
+  // Emit the sequence to compute a discriminator into x17, or reuse AddrDisc.
+  unsigned emitPtrauthDiscriminator(uint16_t Disc, unsigned AddrDisc,
+                                    unsigned &InstsEmitted);
+
   /// tblgen'erated driver function for lowering simple MI->MC
   /// pseudo instructions.
   bool emitPseudoExpansionLowering(MCStreamer &OutStreamer,
@@ -1464,6 +1479,256 @@ void AArch64AsmPrinter::emitFMov0(const MachineInstr &MI) {
   }
 }
 
+unsigned AArch64AsmPrinter::emitPtrauthDiscriminator(uint16_t Disc,
+                                                     unsigned AddrDisc,
+                                                     unsigned &InstsEmitted) {
+  // If there is no constant discriminator, there's no blend involved:
+  // just use the address discriminator register as-is (XZR or not).
+  if (!Disc)
+    return AddrDisc;
+
+  // If there's only a constant discriminator, MOV it into x17.
+  if (AddrDisc == AArch64::XZR) {
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::MOVZXi)
+                                     .addReg(AArch64::X17)
+                                     .addImm(Disc)
+                                     .addImm(/*shift=*/0));
+    ++InstsEmitted;
+    return AArch64::X17;
+  }
+
+  // If there are both, emit a blend into x17.
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::ORRXrs)
+                                   .addReg(AArch64::X17)
+                                   .addReg(AArch64::XZR)
+                                   .addReg(AddrDisc)
+                                   .addImm(0));
+  ++InstsEmitted;
+  EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::MOVKXi)
+                                   .addReg(AArch64::X17)
+                                   .addReg(AArch64::X17)
+                                   .addImm(Disc)
+                                   .addImm(/*shift=*/48));
+  ++InstsEmitted;
+  return AArch64::X17;
+}
+
+void AArch64AsmPrinter::emitPtrauthAuthResign(const MachineInstr *MI) {
+  unsigned InstsEmitted = 0;
+  const bool IsAUTPAC = MI->getOpcode() == AArch64::AUTPAC;
+
+  // We can expand AUT/AUTPAC into 3 possible sequences:
+  // - unchecked:
+  //      autia x16, x0
+  //      pacib x16, x1 ; if AUTPAC
+  //
+  // - checked and clearing:
+  //      mov x17, x0
+  //      movk x17, #disc, lsl #48
+  //      autia x16, x17
+  //      mov x17, x16
+  //      xpaci x17
+  //      cmp x16, x17
+  //      b.eq Lsuccess
+  //      mov x16, x17
+  //      b Lend
+  //     Lsuccess:
+  //      mov x17, x1
+  //      movk x17, #disc, lsl #48
+  //      pacib x16, x17
+  //     Lend:
+  //   Where we only emit the AUT if we started with an AUT.
+  //
+  // - checked and trapping:
+  //      mov x17, x0
+  //      movk x17, #disc, lsl #48
+  //      autia x16, x0
+  //      mov x17, x16
+  //      xpaci x17
+  //      cmp x16, x17
+  //      b.eq Lsuccess
+  //      brk #<0xc470 + aut key>
+  //     Lsuccess:
+  //      mov x17, x1
+  //      movk x17, #disc, lsl #48
+  //      pacib x16, x17 ; if AUTPAC
+  //   Where the b.eq skips over the trap if the PAC is valid.
+  //
+  // This sequence is expensive, but we need more information to be able to
+  // do better.
+  //
+  // We can't TBZ the poison bit because EnhancedPAC2 XORs the PAC bits
+  // on failure.
+  // We can't TST the PAC bits because we don't always know how the address
+  // space is setup for the target environment (and the bottom PAC bit is
+  // based on that).
+  // Either way, we also don't always know whether TBI is enabled or not for
+  // the specific target environment.
+
+  // By default, auth/resign sequences check for auth failures.
+  bool ShouldCheck = true;
+  // In the checked sequence, we only trap if explicitly requested.
+  bool ShouldTrap = MF->getFunction().hasFnAttribute("ptrauth-auth-traps");
+
+  // On an FPAC CPU, you get traps whether you want them or not: there's
+  // no point in emitting checks or traps.
+  if (STI->hasFPAC())
+    ShouldCheck = ShouldTrap = false;
+
+  // However, command-line flags can override this, for experimentation.
+  switch (PtrauthAuthChecks) {
+  case PtrauthCheckMode::Default:
+    break;
+  case PtrauthCheckMode::Unchecked:
+    ShouldCheck = ShouldTrap = false;
+    break;
+  case PtrauthCheckMode::Poison:
+    ShouldCheck = true;
+    ShouldTrap = false;
+    break;
+  case PtrauthCheckMode::Trap:
+    ShouldCheck = ShouldTrap = true;
+    break;
+  }
+
+  auto AUTKey = (AArch64PACKey::ID)MI->getOperand(0).getImm();
+  uint64_t AUTDisc = MI->getOperand(1).getImm();
+  unsigned AUTAddrDisc = MI->getOperand(2).getReg();
+
+  unsigned XPACOpc = getXPACOpcodeForKey(AUTKey);
+
+  // Compute aut discriminator into x17
+  assert(isUInt<16>(AUTDisc));
+  unsigned AUTDiscReg =
+      emitPtrauthDiscriminator(AUTDisc, AUTAddrDisc, InstsEmitted);
+  bool AUTZero = AUTDiscReg == AArch64::XZR;
+  unsigned AUTOpc = getAUTOpcodeForKey(AUTKey, AUTZero);
+
+  //  autiza x16      ; if  AUTZero
+  //  autia x16, x17  ; if !AUTZero
+  MCInst AUTInst;
+  AUTInst.setOpcode(AUTOpc);
+  AUTInst.addOperand(MCOperand::createReg(AArch64::X16));
+  AUTInst.addOperand(MCOperand::createReg(AArch64::X16));
+  if (!AUTZero)
+    AUTInst.addOperand(MCOperand::createReg(AUTDiscReg));
+  EmitToStreamer(*OutStreamer, AUTInst);
+  ++InstsEmitted;
+
+  // Unchecked or checked-but-non-trapping AUT is just an "AUT": we're done.
+  if (!IsAUTPAC && (!ShouldCheck || !ShouldTrap)) {
+    assert(STI->getInstrInfo()->getInstSizeInBytes(*MI) >= InstsEmitted * 4);
+    return;
+  }
+
+  MCSymbol *EndSym = nullptr;
+
+  // Checked sequences do an additional strip-and-compare.
+  if (ShouldCheck) {
+    MCSymbol *SuccessSym = createTempSymbol("auth_success_");
+
+    // XPAC has tied src/dst: use x17 as a temporary copy.
+    //  mov x17, x16
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::ORRXrs)
+                                     .addReg(AArch64::X17)
+                                     .addReg(AArch64::XZR)
+                                     .addReg(AArch64::X16)
+                                     .addImm(0));
+    ++InstsEmitted;
+
+    //  xpaci x17
+    EmitToStreamer(
+        *OutStreamer,
+        MCInstBuilder(XPACOpc).addReg(AArch64::X17).addReg(AArch64::X17));
+    ++InstsEmitted;
+
+    //  cmp x16, x17
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::SUBSXrs)
+                                     .addReg(AArch64::XZR)
+                                     .addReg(AArch64::X16)
+                                     .addReg(AArch64::X17)
+                                     .addImm(0));
+    ++InstsEmitted;
+
+    //  b.eq Lsuccess
+    EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::Bcc)
+                                     .addImm(AArch64CC::EQ)
+                                     .addExpr(MCSymbolRefExpr::create(
+                                         SuccessSym, OutContext)));
+    ++InstsEmitted;
+
+    if (ShouldTrap) {
+      // Trapping sequences do a 'brk'.
+      //  brk #<0xc470 + aut key>
+      EmitToStreamer(*OutStreamer,
+                     MCInstBuilder(AArch64::BRK).addImm(0xc470 | AUTKey));
+      ++InstsEmitted;
+    } else {
+      // Non-trapping checked sequences return the stripped result in x16,
+      // skipping over the PAC if there is one.
+
+      // FIXME: can we simply return the AUT result, already in x16? without..
+      //        ..traps this is usable as an oracle anyway, based on high bits
+      //  mov x17, x16
+      EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::ORRXrs)
+                                       .addReg(AArch64::X16)
+                                       .addReg(AArch64::XZR)
+                                       .addReg(AArch64::X17)
+                                       .addImm(0));
+      ++InstsEmitted;
+
+      if (IsAUTPAC) {
+        EndSym = createTempSymbol("resign_end_");
+
+        //  b Lend
+        EmitToStreamer(*OutStreamer, MCInstBuilder(AArch64::B)
+                                         .addExpr(MCSymbolRefExpr::create(
+                                             EndSym, OutContext)));
+        ++InstsEmitted;
+      }
+    }
+
+    // If the auth check succeeds, we can continue.
+    // Lsuccess:
+    OutStreamer->emitLabel(SuccessSym);
+  }
+
+  // We already emitted unchecked and checked-but-non-trapping AUTs.
+  // That left us with trapping AUTs, and AUTPACs.
+  // Trapping AUTs don't need PAC: we're done.
+  if (!IsAUTPAC) {
+    assert(STI->getInstrInfo()->getInstSizeInBytes(*MI) >= InstsEmitted * 4);
+    return;
+  }
+
+  auto PACKey = (AArch64PACKey::ID)MI->getOperand(3).getImm();
+  uint64_t PACDisc = MI->getOperand(4).getImm();
+  unsigned PACAddrDisc = MI->getOperand(5).getReg();
+
+  // Compute pac discriminator into x17
+  assert(isUInt<16>(PACDisc));
+  unsigned PACDiscReg =
+      emitPtrauthDiscriminator(PACDisc, PACAddrDisc, InstsEmitted);
+  bool PACZero = PACDiscReg == AArch64::XZR;
+  unsigned PACOpc = getPACOpcodeForKey(PACKey, PACZero);
+
+  //  pacizb x16      ; if  PACZero
+  //  pacib x16, x17  ; if !PACZero
+  MCInst PACInst;
+  PACInst.setOpcode(PACOpc);
+  PACInst.addOperand(MCOperand::createReg(AArch64::X16));
+  PACInst.addOperand(MCOperand::createReg(AArch64::X16));
+  if (!PACZero)
+    PACInst.addOperand(MCOperand::createReg(PACDiscReg));
+  EmitToStreamer(*OutStreamer, PACInst);
+  ++InstsEmitted;
+
+  assert(STI->getInstrInfo()->getInstSizeInBytes(*MI) >= InstsEmitted * 4);
+  //  Lend:
+  if (EndSym)
+    OutStreamer->emitLabel(EndSym);
+}
+
 // Simple pseudo-instructions have their lowering (with expansion to real
 // instructions) auto-generated.
 #include "AArch64GenMCPseudoLowering.inc"
@@ -1599,6 +1864,11 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
     return;
   }
 
+  case AArch64::AUT:
+  case AArch64::AUTPAC:
+    emitPtrauthAuthResign(MI);
+    return;
+
   // Tail calls use pseudo instructions so they have the proper code-gen
   // attributes (isCall, isReturn, etc.). We lower them to the real
   // instruction here.
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 163ed520a8a677..42870a594a5cc5 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -363,6 +363,9 @@ class AArch64DAGToDAGISel : public SelectionDAGISel {
 
   bool tryIndexedLoad(SDNode *N);
 
+  void SelectPtrauthAuth(SDNode *N);
+  void SelectPtrauthResign(SDNode *N);
+
   bool trySelectStackSlotTagP(SDNode *N);
   void SelectTagP(SDNode *N);
 
@@ -1460,6 +1463,96 @@ void AArch64DAGToDAGISel::SelectTable(SDNode *N, unsigned NumVecs, unsigned Opc,
   ReplaceNode(N, CurDAG->getMachineNode(Opc, dl, VT, Ops));
 }
 
+static std::tuple<SDValue, SDValue>
+extractPtrauthBlendDiscriminators(SDValue Disc, SelectionDAG *DAG) {
+  SDLoc DL(Disc);
+  SDValue AddrDisc;
+  SDValue ConstDisc;
+
+  // If this is a blend, remember the constant and address discriminators.
+  // Otherwise, it's either a constant discriminator, or a non-blended
+  // address discriminator.
+  if (Disc->getOpcode() == ISD::INTRINSIC_WO_CHAIN &&
+      Disc->getConstantOperandVal(0) == Intrinsic::ptrauth_blend) {
+    AddrDisc = Disc->getOperand(1);
+    ConstDisc = Disc->getOperand(2);
+  } else {
+    ConstDisc = Disc;
+  }
+
+  // If the constant discriminator (either the blend RHS, or the entire
+  // discriminator value) isn't a 16-bit constant, bail out, and let the
+  // discriminator be computed separately.
+  auto *ConstDiscN = dyn_cast<ConstantSDNode>(ConstDisc);
+  if (!ConstDiscN || !isUInt<16>(ConstDiscN->getZExtValue()))
+    return std::make_tuple(DAG->getTargetConstant(0, DL, MVT::i64), Disc);
+
+  // If there's no address discriminator, use XZR directly.
+  if (!AddrDisc)
+    AddrDisc = DAG->getRegister(AArch64::XZR, MVT::i64);
+
+  return std::make_tuple(
+      DAG->getTargetConstant(ConstDiscN->getZExtValue(), DL, MVT::i64),
+      AddrDisc);
+}
+
+void AArch64DAGToDAGISel::SelectPtrauthAuth(SDNode *N) {
+  SDLoc DL(N);
+  // IntrinsicID is operand #0
+  SDValue Val = N->getOperand(1);
+  SDValue AUTKey = N->getOperand(2);
+  SDValue AUTDisc = N->getOperand(3);
+
+  unsigned AUTKeyC = cast<ConstantSDNode>(AUTKey)->getZExtValue();
+  AUTKey = CurDAG->getTargetConstant(AUTKeyC, DL, MVT::i64);
+
+  SDValue AUTAddrDisc, AUTConstDisc;
+  std::tie(AUTConstDisc, AUTAddrDisc) =
+      extractPtrauthBlendDiscriminators(AUTDisc, CurDAG);
+
+  SDValue X16Copy = CurDAG->getCopyToReg(CurDAG->getEntryNode(), DL,
+                                         AArch64::X16, Val, SDValue());
+  SDValue Ops[] = {AUTKey, AUTConstDisc, AUTAddrDisc, X16Copy.getValue(1)};
+
+  SDNode *AUT = CurDAG->getMachineNode(AArch64::AUT, DL, MVT::i64, Ops);
+  ReplaceNode(N, AUT);
+  return;
+}
+
+void AArch64DAGToDAGISel::SelectPtrauthResign(SDNode *N) {
+  SDLoc DL(N);
+  // IntrinsicID is operand #0
+  SDValue Val = N->getOperand(1);
+  SDValue AUTKey = N->getOperand(2);
+  SDValue AUTDisc = N->getOperand(3);
+  SDValue PACKey = N->getOperand(4);
+  SDValue PACDisc = N->getOperand(5);
+
+  unsigned AUTKeyC = cast<ConstantSDNode>(AUTKey)->getZExtValue();
+  unsigned PACKeyC = cast<ConstantSDNode>(PACKey)->getZExtValue();
+
+  AUTKey = CurDAG->getTargetConstant(AUTKeyC, DL, MVT::i64);
+  PACKey = CurDAG->getTargetConstant(PACKeyC, DL, MVT::i64);
+
+  SDValue AUTAddrDisc, AUTConstDisc;
+  std::tie(AUTConstDisc, AUTAddrDisc) =
+      extractPtrauthBlendDiscriminators(AUTDisc, CurDAG);
+
+  SDValue PACAddrDisc, PACConstDisc;
+  std::tie(PACConstDisc, PACAddrDisc) =
+      extractPtrauthBlendDiscriminators(PACDisc, CurDAG);
+
+  SDValue X16Copy = CurDAG->getCopyToReg(CurDAG->getEntryNode(), DL,
+                                         AArch64::X16, Val, SDValue());
+
+  SDValue Ops[] = {AUTKey,       AUTConstDisc, AUTAddrDisc,        PACKey,
+                   PACConstDisc, PACAddrDisc,  X16Copy.getValue(1)};
+
+  SDNode *AUTPAC = CurDAG->getMachineNode(AArch64::AUTPAC, DL, MVT::i64, Ops);
+  ReplaceNode(N, AUTPAC);
+  return;
+}
+
 bool AArch64DAGToDAGISel::tryIndexedLoad(SDNode *N) {
   LoadSDNode *LD = cast<LoadSDNode>(N);
   if (LD->isUnindexed())
@@ -5233,6 +5326,15 @@ void AArch64DAGToDAGISel::Select(SDNode *Node) {
     case Intrinsic::aarch64_tagp:
       SelectTagP(Node);
       return;
+
+    case Intrinsic::ptrauth_auth:
+      SelectPtrauthAuth(Node);
+      return;
+
+    case Intrinsic::ptrauth_resign:
+      SelectPtrauthResign(Node);
+      return;
+
     case Intrinsic::aarch64_neon_tbl2:
       SelectTable(Node, 2,
                   VT == MVT::v8i8 ? AArch64::TBLv8i8Two : AArch64::TBLv16i8Two,
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index b4b975cce007ac..cd8f5a982fd4dc 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -262,6 +262,8 @@ def HasMatMulFP32    : Predicate<"Subtarget->hasMatMulFP32()">,
                        AssemblerPredicateWithAll<(all_of FeatureMatMulFP32), "f32mm">;
 def HasMatMulFP64    : Predicate<"Subtarget->hasMatMulFP64()">,
                        AssemblerPredicateWithAll<(all_of FeatureMatMulFP64), "f64mm">;
+def HasFPAC          : Predicate<"Subtarget->hasFPAC())">,
+                       AssemblerPredicateWithAll<(all_of FeatureFPAC), "fpac">;
 def HasXS            : Predicate<"Subtarget->hasXS()">,
                        AssemblerPredicateWithAll<(all_of FeatureXS), "xs">;
 def HasWFxT          : Predicate<"Subtarget->hasWFxT()">,
@@ -1689,6 +1691,36 @@ let Predicates = [HasPAuth] in {
   defm LDRAA  : AuthLoad<0, "ldraa", simm10Scaled>;
   defm LDRAB  : AuthLoad<1, "ldrab", simm10Scaled>;
 
+  // AUT pseudo.
+  // This directly manipulates x16/x17, which are the only registers the OS
+  // guarantees are safe to use for sensitive operations.
+  def AUT : Pseudo<(outs), (ins i32imm:$Key, i64imm:$Disc, GPR64noip:$AddrDisc),
+                   []>, Sched<[WriteI, ReadI]> {
+    let isCodeGenOnly = 1;
+    let hasSideEffects = 1;
+    let mayStore = 0;
+    let mayLoad = 0;
+    let Size = 32;
+    let Defs = [X16,X17,NZCV];
+    let Uses = [X16];
+  }
+
+  // AUT and re-PAC a value, using different keys/data.
+  // This directly manipulates x16/x17, which are the only registers the OS
+  // guarantees are safe to use for sensitive operations.
+  def AUTPAC
+      : Pseudo<(outs),
+               (ins i32imm:$AUTKey, i64imm:$AUTDisc, GPR64noip:$AUTAddrDisc,
+                    i32imm:$PACKey, i64imm:$PACDisc, GPR64noip:$PACAddrDisc),
+               []>, Sched<[WriteI, ReadI]> {
+    let isCodeGenOnly = 1;
+    let hasSideEffects = 1;
+    let mayStore = 0;
+    let mayLoad = 0;
+    let Size = 48;
+    let Defs = [X16,X17,NZCV];
+    let Uses = [X16];
+  }
 }
 
 // v9.5-A pointer authentication extensions
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
index 92db89cc0915b8..d24d2c42634b3a 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp
@@ -96,6 +96,35 @@ bool AArch64GISelUtils::tryEmitBZero(MachineInstr &MI,
   return true;
 }
 
+std::tuple<uint16_t, Register>
+AArch64GISelUtils::extractPtrauthBlendDiscriminators(Register Disc,
+                                                     MachineRegisterInfo &MRI) {
+  Register AddrDisc = Disc;
+  uint16_t ConstDisc = 0;
+
+  if (auto ConstDiscVal = getIConstantVRegVal(Disc, MRI)) {
+    if (isUInt<16>(ConstDiscVal->getZExtValue())) {
+      ConstDisc = ConstDiscVal->getZExtValue();
+      AddrDisc = AArch64::XZR;
+    }
+    return std::make_tuple(ConstDisc, AddrDisc);
+  }
+
+  auto *DiscMI = MRI.getVRegDef(Disc);
+  if (!DiscMI || DiscMI->getOpcode() != TargetOpcode::G_INTRINSIC ||
+      DiscMI->getOperand(1).getIntrinsicID() != Intrinsic::ptrauth_blend)
+    return std::make_tuple(ConstDisc, AddrDisc);
+
+  if (auto ConstDiscVal =
+          getIConstantVRegVal(DiscMI->getOperand(3).getReg(), MRI)) {
+    if (isUInt<16>(ConstDiscVal->getZExtValue())) {
+      ConstDisc = ConstDiscVal->getZExtValue();
+      AddrDisc = DiscMI->getOperand(2).getReg();
+    }
+  }
+  return std::make_tuple(ConstDisc, AddrDisc);
+}
+
 void AArch64GISelUtils::changeFCMPPredToAArch64CC(
     const CmpInst::Predicate P, AArch64CC::CondCode &CondCode,
     AArch64CC::CondCode &CondCode2) {
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.h b/llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.h
index 791db7efaf0bee..9ef833f0fc0ca1 100644
--- a/llvm/lib/Target/AArch64/GISel...
[truncated]

@asl
Copy link
Collaborator

asl commented Jul 19, 2024

@ahmedbougacha Can this be rebased? Looks like the conflicts are trivial to resolve. As without these changes we cannot have backend support for auth and resign intrinsics at all ;) And we all agreed during one of syncs ~month ago, that we can merge this as-is and then follow up and the different possible implementations

This introduces 3 hardening modes in the authentication step of
auth/resign lowering:
- unchecked, which uses the AUT instructions as-is
- poison, which detects authentication failure (using an XPAC+CMP
  sequence), explicitly yielding the XPAC result rather than the
  AUT result, to avoid leaking
- trap, which additionally traps on authentication failure,
  using BRK #0xC470 + key (IA C470, IB C471, DA C472, DB C473.)

Not all modes are necessarily useful in all contexts, and there
are more performant alternative lowerings in specific contexts
(e.g., when I/D TBI enablement is a target ABI guarantee.)

This is controlled by the `ptrauth-auth-traps` function attributes,
and can be overridden using `-aarch64-ptrauth-auth-checks=`.
When the FPAC feature is present, we can rely on its faulting
behavior to avoid emitting the expensive authentication failure
check sequence ourvelves.  In which case we emit the same
sequence as a plain unchecked auth/resign.
@ahmedbougacha ahmedbougacha force-pushed the eng/abougacha/ptrauth-auth-resign-isel branch from 96fa240 to 557cd75 Compare July 23, 2024 02:08
@ahmedbougacha ahmedbougacha merged commit d7e8a74 into llvm:main Jul 23, 2024
4 of 7 checks passed
@ahmedbougacha ahmedbougacha deleted the eng/abougacha/ptrauth-auth-resign-isel branch July 23, 2024 04:28
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
This introduces 3 hardening modes in the authentication step of
auth/resign lowering:
- unchecked, which uses the AUT instructions as-is
- poison, which detects authentication failure (using an XPAC+CMP
  sequence), explicitly yielding the XPAC result rather than the
  AUT result, to avoid leaking
- trap, which additionally traps on authentication failure,
  using BRK #0xC470 + key (IA C470, IB C471, DA C472, DB C473.)

Not all modes are necessarily useful in all contexts, and there
are more performant alternative lowerings in specific contexts
(e.g., when I/D TBI enablement is a target ABI guarantee.)
These will be implemented separately.

This is controlled by the `ptrauth-auth-traps` function attributes,
and can be overridden using `-aarch64-ptrauth-auth-checks=`.

This also adds the FPAC extension, which we haven't needed
before, to improve isel when we can rely on HW checking.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251202
kovdan01 added a commit to kovdan01/llvm-project that referenced this pull request Jul 26, 2024
The lowering implementation and tests against arm64e-apple-darwin triple
were added previously in llvm#79024.
kovdan01 added a commit that referenced this pull request Jul 26, 2024
…100744)

The lowering implementation and tests against arm64e-apple-darwin triple
were added previously in #79024.
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jul 29, 2024
…lvm#100744)

The lowering implementation and tests against arm64e-apple-darwin triple
were added previously in llvm#79024.

(cherry picked from commit 53283dc)
tru pushed a commit to llvmbot/llvm-project that referenced this pull request Jul 30, 2024
…lvm#100744)

The lowering implementation and tests against arm64e-apple-darwin triple
were added previously in llvm#79024.

(cherry picked from commit 53283dc)
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 23, 2024
add back -fallow-half-arguments-and-returns for hipRuntime builds.

----------------------------------------------------------------------

Revert "[PAC][AArch64] Lower ptrauth constants in code (llvm#96879)"

This reverts commit 88dd10c.

----------------------------------------------------------------------

[PAC][AArch64] Lower ptrauth constants in code (llvm#96879)

This re-applies llvm#94241 after fixing buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570

According to standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas w/o
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238

Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.

Fix both cases by using `0xffff` value directly instead of giving a name
to it.

Original PR description below.

Depends on llvm#94240.

Define the following pseudos for lowering ptrauth constants in code:

- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added
PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but use a
special stub slot named `sym$auth_ptr$key$disc` filled by dynamic linker
during relocation resolving instead of a GOT slot.

---------

Co-authored-by: Ahmed Bougacha <[email protected]>

(cherry picked from commit 1488fb4)

----------------------------------------------------------------------

[AArch64][PAC] Lower ptrauth constants in code for MachO. (llvm#97665)

This also adds support for auth stubs on MachO using __DATA,__auth_ptr.

Some of the machinery for auth stubs is already implemented;  this
generalizes that a bit to support MachO, and moves some of the shared
logic into MMIImpls.

In particular, this originally had an AuthStubInfo struct, but we no
longer need it beyond a single MCExpr.  So this provides variants of
the symbol stub helper type declarations and functions for "expr
stubs", where a stub points at an arbitrary MCExpr, rather than
a simple MCSymbol (and a bit).

(cherry picked from commit 5f1bb62)

----------------------------------------------------------------------

[AArch64][PAC] Sign block addresses used in indirectbr. (llvm#97647)

Enabled in clang using:
    -fptrauth-indirect-gotos

and at the IR level using function attribute:
    "ptrauth-indirect-gotos"

Signing uses IA and a per-function integer discriminator. The
discriminator isn't ABI-visible, and is currently:
    ptrauth_string_discriminator("<function_name> blockaddress")

A sufficiently sophisticated frontend could benefit from per-indirectbr
discrimination, which would need additional machinery, such as allowing
"ptrauth" bundles on indirectbr. For our purposes, the simple scheme
above is sufficient.

This approach doesn't support subtracting label addresses and using
the result as offsets, because each label address is signed.
Pointer arithmetic on signed pointers corrupts the signature bits,
and because label address expressions aren't typed beyond void*,
we can't do anything reliably intelligent on the arithmetic exprs.
Not signing addresses when used to form offsets would allow
easily hijacking control flow by overwriting the offset.

This diagnoses the basic cases (`&&lbl2 - &&lbl1`) in the frontend,
while we evaluate either alternative implementations (e.g., lowering
blockaddress to a bb number, and indirectbr to a checked jump-table),
or better diagnostics (both at the frontend level and on unencodable
IR constants).

(cherry picked from commit b8721fa)

----------------------------------------------------------------------

[AArch64][PAC] Lower auth/resign into checked sequence. (llvm#79024)

This introduces 3 hardening modes in the authentication step of
auth/resign lowering:
- unchecked, which uses the AUT instructions as-is
- poison, which detects authentication failure (using an XPAC+CMP
  sequence), explicitly yielding the XPAC result rather than the
  AUT result, to avoid leaking
- trap, which additionally traps on authentication failure,
  using BRK #0xC470 + key (IA C470, IB C471, DA C472, DB C473.)

Not all modes are necessarily useful in all contexts, and there
are more performant alternative lowerings in specific contexts
(e.g., when I/D TBI enablement is a target ABI guarantee.)
These will be implemented separately.

This is controlled by the `ptrauth-auth-traps` function attributes,
and can be overridden using `-aarch64-ptrauth-auth-checks=`.

This also adds the FPAC extension, which we haven't needed
before, to improve isel when we can rely on HW checking.

(cherry picked from commit d7e8a74)

----------------------------------------------------------------------

[Clang][Arm] Convert -fallow-half-arguments-and-returns to a target option. NFC

This cc1 option -fallow-half-arguments-and-returns allows __fp16 to be
passed by argument and returned, without giving an error. It is
currently always enabled for Arm and AArch64, by forcing the option in
the driver. This means any cc1 tests (especially those needing
arm_neon.h) need to specify the option too, to prevent the error from
being emitted.

This changes it to a target option instead, set to true for Arm and
AArch64. This allows the option to be removed. Previously it was implied
by -fnative_half_arguments_and_returns, which is set for certain
languages like open_cl, renderscript and hlsl, so that option now too
controls the errors. There were are few other non-arm uses of
-fallow-half-arguments-and-returns but I believe they were unnecessary.
The strictfp_builtins.c tests were converted from __fp16 to _Float16 to
avoid the issues.

Differential Revision: https://reviews.llvm.org/D133885

(cherry picked from commit 9ef11036505c0ae6cdb56ff49f39ab7abcded3cf)

----------------------------------------------------------------------

[clang] XFAIL a few tests due to 'noundef' etc

Not all, but most of these are failing due to the presence of a 'noundef' call
return attribute on some intrinsics.

This is not present on upstream 'main' due to the AlwaysInliner pass being run.
See commit 1a2e77c.

----------------------------------------------------------------------

[DebugInfo] Restore missing disabled ptrauth support

See "[DebugInfo] Teach LLVM and LLDB about ptrauth in DWARF":

    commit a8c3d98
    Author: Jonas Devlieghere <[email protected]>
    Date:   Wed Jul 27 10:44:15 2022 -0700

----------------------------------------------------------------------

Apply simple-do.ll test change from b46c085

----------------------------------------------------------------------

Adjust ptrauth.s test for ptrauth_authentication_mode encoding

----------------------------------------------------------------------

Fix dwarf-eh-prepare-dbg.ll test: dwarfAddressSpace=>addressSpace

----------------------------------------------------------------------

Update some SLPVectorizer/AArch64 tests from upstream

----------------------------------------------------------------------

Regenerate assertions in arm_mult_q15.ll

----------------------------------------------------------------------

[AsmPrinter] Handle null extracted addr class

----------------------------------------------------------------------

[PowerPC] Account for custom LLVM moniker in aix tests

----------------------------------------------------------------------

[LoongArch] Add "Verify Heterogeneous Debug Preconditions" to pipeline test

----------------------------------------------------------------------

[JITLink][RISCV] Un-XFAIL ELF_pc_indirect.s

----------------------------------------------------------------------

Change-Id: Ie6ab500b2451b3ed070dfad0bc16d003e5e2fe10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants