Skip to content

[PAC][lld][ELF] Use PAC instructions in PLT header with -z pac-plt #116334

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kovdan01
Copy link
Contributor

  1. Sign return address before storing into stack.
  2. Treat lazy resolver address as signed.

@kovdan01
Copy link
Contributor Author

Both changes from the PR description require dynamic loader support, which seems missing for glibc (see discussion in https://discord.com/channels/636084430946959380/1133112394701348895/1298121731428585564). However, I don't think that it should prevent us from supporting this in lld.

@kovdan01 kovdan01 marked this pull request as ready for review November 15, 2024 08:32
@kovdan01 kovdan01 changed the title [PAC][lld] Use PAC instructions in PLT header with -z pac-plt [PAC][lld][ELF] Use PAC instructions in PLT header with -z pac-plt Nov 15, 2024
};
const uint8_t pacBr[] = {
0x30, 0x0a, 0x1f, 0xd7, // braa x17, x16
0x1f, 0x20, 0x03, 0xd5 // nop
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need extra nop here? And below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asl The plt header size was aligned to 16, and this patch preserves this invariant. We can delete nops from pacBr and stdBr, but in this case we'll need to add them manually later. So, current variant should probably be OK

@llvmbot
Copy link
Member

llvmbot commented Nov 16, 2024

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Daniil Kovalev (kovdan01)

Changes
  1. Sign return address before storing into stack.
  2. Treat lazy resolver address as signed.

Full diff: https://github.com/llvm/llvm-project/pull/116334.diff

4 Files Affected:

  • (modified) lld/ELF/Arch/AArch64.cpp (+28-3)
  • (modified) lld/test/ELF/aarch64-feature-btipac.s (+2-2)
  • (modified) lld/test/ELF/aarch64-feature-pac.s (+3-3)
  • (modified) lld/test/ELF/aarch64-feature-pauth.s (+5-3)
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index c64a0dbe8bd62b..f2855576ac2ca9 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -1040,13 +1040,23 @@ AArch64BtiPac::AArch64BtiPac(Ctx &ctx) : AArch64(ctx) {
 
 void AArch64BtiPac::writePltHeader(uint8_t *buf) const {
   const uint8_t btiData[] = { 0x5f, 0x24, 0x03, 0xd5 }; // bti c
+  const uint8_t signLR[] = {0x7f, 0x23, 0x03, 0xd5};    // pacibsp
   const uint8_t pltData[] = {
       0xf0, 0x7b, 0xbf, 0xa9, // stp    x16, x30, [sp,#-16]!
       0x10, 0x00, 0x00, 0x90, // adrp   x16, Page(&(.got.plt[2]))
       0x11, 0x02, 0x40, 0xf9, // ldr    x17, [x16, Offset(&(.got.plt[2]))]
       0x10, 0x02, 0x00, 0x91, // add    x16, x16, Offset(&(.got.plt[2]))
-      0x20, 0x02, 0x1f, 0xd6, // br     x17
-      0x1f, 0x20, 0x03, 0xd5, // nop
+  };
+  const uint8_t pacHintBr[] = {
+      0x9f, 0x21, 0x03, 0xd5, // autia1716
+      0x20, 0x02, 0x1f, 0xd6  // br   x17
+  };
+  const uint8_t pacBr[] = {
+      0x30, 0x0a, 0x1f, 0xd7, // braa x17, x16
+      0x1f, 0x20, 0x03, 0xd5  // nop
+  };
+  const uint8_t stdBr[] = {
+      0x20, 0x02, 0x1f, 0xd6, // br   x17
       0x1f, 0x20, 0x03, 0xd5  // nop
   };
   const uint8_t nopData[] = { 0x1f, 0x20, 0x03, 0xd5 }; // nop
@@ -1061,15 +1071,30 @@ void AArch64BtiPac::writePltHeader(uint8_t *buf) const {
     buf += sizeof(btiData);
     plt += sizeof(btiData);
   }
+  if (pacEntryKind != PEK_NoAuth) {
+    memcpy(buf, signLR, sizeof(signLR));
+    buf += sizeof(signLR);
+    plt += sizeof(signLR);
+  }
   memcpy(buf, pltData, sizeof(pltData));
 
   relocateNoSym(buf + 4, R_AARCH64_ADR_PREL_PG_HI21,
                 getAArch64Page(got + 16) - getAArch64Page(plt + 4));
   relocateNoSym(buf + 8, R_AARCH64_LDST64_ABS_LO12_NC, got + 16);
   relocateNoSym(buf + 12, R_AARCH64_ADD_ABS_LO12_NC, got + 16);
+
+  if (pacEntryKind != PEK_NoAuth)
+    memcpy(buf + sizeof(pltData),
+           (pacEntryKind == PEK_AuthHint ? pacHintBr : pacBr),
+           sizeof(pacEntryKind == PEK_AuthHint ? pacHintBr : pacBr));
+  else
+    memcpy(buf + sizeof(pltData), stdBr, sizeof(stdBr));
   if (!btiHeader)
     // We didn't add the BTI c instruction so round out size with NOP.
-    memcpy(buf + sizeof(pltData), nopData, sizeof(nopData));
+    memcpy(buf + sizeof(pltData) + sizeof(stdBr), nopData, sizeof(nopData));
+  if (pacEntryKind == PEK_NoAuth)
+    // We didn't add the PACIBSP instruction so round out size with NOP.
+    memcpy(buf + sizeof(pltData) + sizeof(stdBr), nopData, sizeof(nopData));
 }
 
 void AArch64BtiPac::writePlt(uint8_t *buf, const Symbol &sym,
diff --git a/lld/test/ELF/aarch64-feature-btipac.s b/lld/test/ELF/aarch64-feature-btipac.s
index 3b3c1744370362..caba4862059e8b 100644
--- a/lld/test/ELF/aarch64-feature-btipac.s
+++ b/lld/test/ELF/aarch64-feature-btipac.s
@@ -155,13 +155,13 @@ func1:
 # BTIPACEX2: Disassembly of section .plt:
 # BTIPACEX2: 0000000000210380 <.plt>:
 # BTIPACEX2-NEXT:   210380:              bti     c
+# BTIPACEX2-NEXT:                        pacibsp
 # BTIPACEX2-NEXT:                        stp     x16, x30, [sp, #-16]!
 # BTIPACEX2-NEXT:                        adrp    x16, 0x230000
 # BTIPACEX2-NEXT:                        ldr     x17, [x16, #1208]
 # BTIPACEX2-NEXT:                        add     x16, x16, #1208
+# BTIPACEX2-NEXT:                        autia1716
 # BTIPACEX2-NEXT:                        br      x17
-# BTIPACEX2-NEXT:                        nop
-# BTIPACEX2-NEXT:                        nop
 # BTIPACEX2: 00000000002103a0 <func2@plt>:
 # BTIPACEX2-NEXT:   2103a0:              adrp    x16, 0x230000
 # BTIPACEX2-NEXT:                        ldr     x17, [x16, #1216]
diff --git a/lld/test/ELF/aarch64-feature-pac.s b/lld/test/ELF/aarch64-feature-pac.s
index 4fd1fd2acea737..e2b385a61ef448 100644
--- a/lld/test/ELF/aarch64-feature-pac.s
+++ b/lld/test/ELF/aarch64-feature-pac.s
@@ -96,14 +96,14 @@
 # PACPLT-NEXT:   210378:        ret
 # PACPLT: Disassembly of section .plt:
 # PACPLT: 0000000000210380 <.plt>:
-# PACPLT-NEXT:   210380:        stp     x16, x30, [sp, #-16]!
+# PACPLT-NEXT:   210380:        pacibsp
+# PACPLT-NEXT:                  stp     x16, x30, [sp, #-16]!
 # PACPLT-NEXT:                  adrp    x16, 0x230000
 # PACPLT-NEXT:                  ldr     x17, [x16, #1192]
 # PACPLT-NEXT:                  add     x16, x16, #1192
+# PACPLT-NEXT:                  autia1716
 # PACPLT-NEXT:                  br      x17
 # PACPLT-NEXT:                  nop
-# PACPLT-NEXT:                  nop
-# PACPLT-NEXT:                  nop
 # PACPLT: 00000000002103a0 <func2@plt>:
 # PACPLT-NEXT:   2103a0:        adrp    x16, 0x230000
 # PACPLT-NEXT:                  ldr     x17, [x16, #1200]
diff --git a/lld/test/ELF/aarch64-feature-pauth.s b/lld/test/ELF/aarch64-feature-pauth.s
index c50f38f4a7c975..b8b0baa50051b7 100644
--- a/lld/test/ELF/aarch64-feature-pauth.s
+++ b/lld/test/ELF/aarch64-feature-pauth.s
@@ -65,13 +65,15 @@
 # PACPLT-NEXT:     ret
 # PACPLT: Disassembly of section .plt:
 # PACPLT:      <.plt>:
+# PACPLT-NEXT:     pacibsp
 # PACPLT-NEXT:     stp     x16, x30, [sp, #-0x10]!
 # PACPLT-NEXT:     adrp    x16, 0x30000 <func3+0x30000>
 # PACPLT-NEXT:     ldr     x17, [x16, #0x[[B]]]
 # PACPLT-NEXT:     add     x16, x16, #0x[[B]]
-# PACPLT-NEXT:     br      x17
-# PACPLT-NEXT:     nop
-# PACPLT-NEXT:     nop
+# NOHINT-NEXT:     braa    x17, x16
+# NOHINT-NEXT:     nop
+# HINT-NEXT:       autia1716
+# HINT-NEXT:       br      x17
 # PACPLT-NEXT:     nop
 # PACPLT:      <func3@plt>:
 # PACPLT-NEXT:     adrp    x16, 0x30000 <func3+0x30000>

@MaskRay
Copy link
Member

MaskRay commented Nov 17, 2024

What is the compatibility story about pacibsp and does GNU ld (which supports the option as well) decides to do something similar?

Base automatically changed from users/kovdan01/lld-pac-plt-braa to main November 18, 2024 05:07
1. Sign return address before storing into stack.
2. Treat lazy resolver address as signed.
@kovdan01 kovdan01 force-pushed the users/kovdan01/pac-instr-in-plt-header branch from f3584a0 to 10caccb Compare November 18, 2024 05:09
@kovdan01 kovdan01 self-assigned this Nov 18, 2024
@asl
Copy link
Collaborator

asl commented Mar 25, 2025

What is the compatibility story about pacibsp and does GNU ld (which supports the option as well) decides to do something similar?

Tagging @smithp35

My understanding is that pac-plt changes never made it to glibc, right?

@smithp35
Copy link
Collaborator

smithp35 commented Mar 25, 2025

What is the compatibility story about pacibsp and does GNU ld (which supports the option as well) decides to do something similar?

Tagging @smithp35

My understanding is that pac-plt changes never made it to glibc, right?

What I know:
The glibc PAC support never landed (https://bugzilla.redhat.com/show_bug.cgi?id=1764581) with original discussion https://public-inbox.org/libc-alpha/[email protected]/

My summary is that there were some concerns about backwards compatibility with previous glibc versions, and as RELRO was available on glibc platforms, pac-plt was deemed uneccessary.

My understanding is that GNU ld chose not to protect the lazy loader due to how RELRO is implemented in GNU ld

  . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
  .got.plt        : { *(.got.plt) *(.igot.plt) }

The offset in DATA_SEGMENT_RELRO_END (ignored in LLD) places the first two .got.plt entries into RELRO so they don't need to be protected by pac-plt. The start and placement of the RELRO segment is calculated "in reverse" by pinning the end of the RELRO segment at DATA_SEGMENT_RELRO_END(offset) then working out where its start address should be from there.

My understanding is that lld is unlikely to place its RELRO in such a way that we can guarantee an overlap the first two entries of the .got.plt.

So we're in the awkward situation that:

  • -z pac-plt with lazy loading has a major gap on lld (lazy loading slot unprotected).
  • Fixing lazy loading on LLD makes us incompatible with -z pac-plt on GNU ld. It isn't implemented in glibc, and I don't expect it will be.

I think that if we make the decision to change -z pac-plt we should at least raise a bugzilla ticket to let the GNU folk know that should they ever implement -z pac-plt in glibc then they should change their implementation to protect the lazy loading slot as the existing reason not to is fragile (relies on GNU ld's RELRO implementation). The original patch did protect it, https://sourceware.org/legacy-ml/binutils/2019-03/msg00025.html, however there was a follow up patch that removed it https://sourceware.org/legacy-ml/binutils/2019-04/msg00208.html

The alternative is to have -z pac-plt=option with a default to the GNU ld, but with an option to change.

@jrtc27
Copy link
Collaborator

jrtc27 commented Mar 25, 2025

In FreeBSD we support multiple RELRO segments within a single object. Perhaps it's time for other implementations to do the same, then LLD can cover whatever set of non-contiguous regions it likes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

6 participants