Skip to content

[BOLT][AArch64] Keep relocations for linker-relaxed instructions. NFCI #129980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2025

Conversation

maksfb
Copy link
Contributor

@maksfb maksfb commented Mar 6, 2025

We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.

We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2025

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

We used to filter out relocations corresponding to NOP+ADR instruction pairs that were a result of linker "relaxation" optimization. However, these relocations will be useful for reversing the linker optimization. Keep the relocations and ignore them while symbolizing ADR instruction operands.


Full diff: https://github.com/llvm/llvm-project/pull/129980.diff

2 Files Affected:

  • (modified) bolt/lib/Core/Relocation.cpp (-23)
  • (modified) bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp (+15-4)
diff --git a/bolt/lib/Core/Relocation.cpp b/bolt/lib/Core/Relocation.cpp
index e9a9741bc3716..523ab8480cc90 100644
--- a/bolt/lib/Core/Relocation.cpp
+++ b/bolt/lib/Core/Relocation.cpp
@@ -271,22 +271,11 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
     return (Contents & 0xfc000000) == 0x14000000;
   };
 
-  auto IsAdr = [](uint64_t Contents) -> bool {
-    // The bits 31-24 are 0b0xx10000
-    return (Contents & 0x9f000000) == 0x10000000;
-  };
-
   auto IsAddImm = [](uint64_t Contents) -> bool {
     // The bits 30-23 are 0b00100010
     return (Contents & 0x7F800000) == 0x11000000;
   };
 
-  auto IsNop = [](uint64_t Contents) -> bool { return Contents == 0xd503201f; };
-
-  // The linker might eliminate the instruction and replace it with NOP, ignore
-  if (IsNop(Contents))
-    return true;
-
   // The linker might relax ADRP+LDR instruction sequence for loading symbol
   // address from GOT table to ADRP+ADD sequence that would point to the
   // binary-local symbol. Change relocation type in order to process it right.
@@ -332,18 +321,6 @@ static bool skipRelocationProcessAArch64(uint64_t &Type, uint64_t Contents) {
   }
   }
 
-  // The linker might relax ADRP+ADD or ADRP+LDR sequences to the ADR+NOP
-  switch (Type) {
-  default:
-    break;
-  case ELF::R_AARCH64_ADR_PREL_PG_HI21:
-  case ELF::R_AARCH64_ADD_ABS_LO12_NC:
-  case ELF::R_AARCH64_ADR_GOT_PAGE:
-  case ELF::R_AARCH64_LD64_GOT_LO12_NC:
-    if (IsAdr(Contents))
-      return true;
-  }
-
   return false;
 }
 
diff --git a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
index 7145e77a1edbb..d08bca6e0fc3e 100644
--- a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
@@ -31,10 +31,6 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
   if (BC.MIB->isBranch(Inst) || BC.MIB->isCall(Inst))
     return false;
 
-  // TODO: add handling for linker "relaxation". At the moment, relocations
-  // corresponding to "relaxed" instructions are excluded from BinaryFunction
-  // relocation list.
-
   const uint64_t InstOffset = InstAddress - Function.getAddress();
   const Relocation *Relocation = Function.getRelocationAt(InstOffset);
 
@@ -49,6 +45,21 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
         BC.MIB->getTargetExprFor(Inst, Expr, *Ctx, RelType)));
   };
 
+  // The linker can convert ADRP+ADD and ADRP+LDR instruction sequences into
+  // NOP+ADR. After the conversion, the linker might keep the relocations and
+  // if we try to symbolize ADR's operand using outdated relocations, we might
+  // get unexpected results. Hence, we check for the conversion/relaxation, and
+  // ignore the relocation. The symbolization is done based on the PC-relative
+  // value of the operand instead.
+  if (Relocation && BC.MIB->isADR(Inst)) {
+    if (Relocation->Type == ELF::R_AARCH64_ADD_ABS_LO12_NC ||
+        Relocation->Type == ELF::R_AARCH64_LD64_GOT_LO12_NC) {
+      LLVM_DEBUG(dbgs() << "BOLT-DEBUG: ignoring relocation at 0x"
+                        << Twine::utohexstr(InstAddress) << '\n');
+      Relocation = nullptr;
+    }
+  }
+
   if (Relocation) {
     addOperand(Relocation->Symbol, Relocation->Addend, Relocation->Type);
     return true;

@maksfb maksfb merged commit a28daa7 into llvm:main Mar 6, 2025
12 checks passed
@maksfb maksfb deleted the gh-arm-preserve-relocs branch March 17, 2025 02:41
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
llvm#129980)

We used to filter out relocations corresponding to NOP+ADR instruction
pairs that were a result of linker "relaxation" optimization. However,
these relocations will be useful for reversing the linker optimization.
Keep the relocations and ignore them while symbolizing ADR instruction
operands.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants