Skip to content

Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" #95136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

ilovepi
Copy link
Contributor

@ilovepi ilovepi commented Jun 11, 2024

Reverts #95060, and relands #86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.

@llvmbot llvmbot added clang Clang issues not falling into any other category vectorizers PGO Profile Guided Optimizations llvm:ir llvm:transforms labels Jun 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2024

@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-pgo
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-clang

Author: Paul Kirth (ilovepi)

Changes

Reverts llvm/llvm-project#95060, and relands #86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.


Patch is 38.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/95136.diff

30 Files Affected:

  • (modified) clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp (+2-2)
  • (modified) llvm/docs/BranchWeightMetadata.rst (+7)
  • (modified) llvm/include/llvm/IR/MDBuilder.h (+9-2)
  • (modified) llvm/include/llvm/IR/ProfDataUtils.h (+16-1)
  • (modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+2-1)
  • (modified) llvm/lib/IR/Instruction.cpp (+15-4)
  • (modified) llvm/lib/IR/Instructions.cpp (+5-1)
  • (modified) llvm/lib/IR/MDBuilder.cpp (+9-5)
  • (modified) llvm/lib/IR/Metadata.cpp (+4-4)
  • (modified) llvm/lib/IR/ProfDataUtils.cpp (+31-9)
  • (modified) llvm/lib/IR/Verifier.cpp (+6-3)
  • (modified) llvm/lib/Transforms/IPO/SampleProfile.cpp (+4-3)
  • (modified) llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp (+2-1)
  • (modified) llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (+3-2)
  • (modified) llvm/lib/Transforms/Scalar/JumpThreading.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Scalar/LowerExpectIntrinsic.cpp (+9-7)
  • (modified) llvm/lib/Transforms/Utils/Local.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Utils/LoopPeel.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Utils/LoopRotationUtils.cpp (+2-2)
  • (modified) llvm/lib/Transforms/Utils/SimplifyCFG.cpp (+13-10)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+6-6)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/basic.ll (+4-4)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/expect-with-probability.ll (+4-4)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/expect_nonboolean.ll (+2-3)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_merge.ll (+2-2)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_or.ll (+2-2)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_tern.ll (+1-1)
  • (modified) llvm/test/Transforms/LowerExpectIntrinsic/phi_unexpect.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoist-runtime-checks.ll (+1-1)
diff --git a/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp b/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
index fb236aeb982e0..81d9334356520 100644
--- a/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
+++ b/clang/test/CodeGenCXX/attr-likelihood-if-vs-builtin-expect.cpp
@@ -221,5 +221,5 @@ void tu2(int &i) {
   }
 }
 
-// CHECK: [[BW_LIKELY]] = !{!"branch_weights", i32 2000, i32 1}
-// CHECK: [[BW_UNLIKELY]] = !{!"branch_weights", i32 1, i32 2000}
+// CHECK: [[BW_LIKELY]] = !{!"branch_weights", !"expected", i32 2000, i32 1}
+// CHECK: [[BW_UNLIKELY]] = !{!"branch_weights", !"expected", i32 1, i32 2000}
diff --git a/llvm/docs/BranchWeightMetadata.rst b/llvm/docs/BranchWeightMetadata.rst
index 522f37cdad4fc..62204753e29b0 100644
--- a/llvm/docs/BranchWeightMetadata.rst
+++ b/llvm/docs/BranchWeightMetadata.rst
@@ -28,11 +28,14 @@ Supported Instructions
 
 Metadata is only assigned to the conditional branches. There are two extra
 operands for the true and the false branch.
+We optionally track if the metadata was added by ``__builtin_expect`` or
+``__builtin_expect_with_probability`` with an optional field ``!"expected"``.
 
 .. code-block:: none
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <TRUE_BRANCH_WEIGHT>,
     i32 <FALSE_BRANCH_WEIGHT>
   }
@@ -47,6 +50,7 @@ is always case #0).
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <DEFAULT_BRANCH_WEIGHT>
     [ , i32 <CASE_BRANCH_WEIGHT> ... ]
   }
@@ -60,6 +64,7 @@ Branch weights are assigned to every destination.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <LABEL_BRANCH_WEIGHT>
     [ , i32 <LABEL_BRANCH_WEIGHT> ... ]
   }
@@ -75,6 +80,7 @@ block and entry counts which may not be accurate with sampling.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <CALL_BRANCH_WEIGHT>
   }
 
@@ -95,6 +101,7 @@ is used.
 
   !0 = !{
     !"branch_weights",
+    [ !"expected", ]
     i32 <INVOKE_NORMAL_WEIGHT>
     [ , i32 <INVOKE_UNWIND_WEIGHT> ]
   }
diff --git a/llvm/include/llvm/IR/MDBuilder.h b/llvm/include/llvm/IR/MDBuilder.h
index 3265589b7c8df..e02ec8f5a3d8b 100644
--- a/llvm/include/llvm/IR/MDBuilder.h
+++ b/llvm/include/llvm/IR/MDBuilder.h
@@ -59,7 +59,11 @@ class MDBuilder {
   //===------------------------------------------------------------------===//
 
   /// Return metadata containing two branch weights.
-  MDNode *createBranchWeights(uint32_t TrueWeight, uint32_t FalseWeight);
+  /// @param TrueWeight the weight of the true branch
+  /// @param FalseWeight the weight of the false branch
+  /// @param Do these weights come from __builtin_expect*
+  MDNode *createBranchWeights(uint32_t TrueWeight, uint32_t FalseWeight,
+                              bool IsExpected = false);
 
   /// Return metadata containing two branch weights, with significant bias
   /// towards `true` destination.
@@ -70,7 +74,10 @@ class MDBuilder {
   MDNode *createUnlikelyBranchWeights();
 
   /// Return metadata containing a number of branch weights.
-  MDNode *createBranchWeights(ArrayRef<uint32_t> Weights);
+  /// @param Weights the weights of all the branches
+  /// @param Do these weights come from __builtin_expect*
+  MDNode *createBranchWeights(ArrayRef<uint32_t> Weights,
+                              bool IsExpected = false);
 
   /// Return metadata specifying that a branch or switch is unpredictable.
   MDNode *createUnpredictable();
diff --git a/llvm/include/llvm/IR/ProfDataUtils.h b/llvm/include/llvm/IR/ProfDataUtils.h
index 88fbad4d6b9d8..1d7c97d9be953 100644
--- a/llvm/include/llvm/IR/ProfDataUtils.h
+++ b/llvm/include/llvm/IR/ProfDataUtils.h
@@ -55,6 +55,17 @@ MDNode *getBranchWeightMDNode(const Instruction &I);
 /// Nullptr otherwise.
 MDNode *getValidBranchWeightMDNode(const Instruction &I);
 
+/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
+/// intrinsic
+bool hasBranchWeightOrigin(const Instruction &I);
+
+/// Check if Branch Weight Metadata has an "expected" field from an llvm.expect*
+/// intrinsic
+bool hasBranchWeightOrigin(const MDNode *ProfileData);
+
+/// Return the offset to the first branch weight data
+unsigned getBranchWeightOffset(const MDNode *ProfileData);
+
 /// Extract branch weights from MD_prof metadata
 ///
 /// \param ProfileData A pointer to an MDNode.
@@ -111,7 +122,11 @@ bool extractProfTotalWeight(const Instruction &I, uint64_t &TotalWeights);
 
 /// Create a new `branch_weights` metadata node and add or overwrite
 /// a `prof` metadata reference to instruction `I`.
-void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights);
+/// \param I the Instruction to set branch weights on.
+/// \param Weights an array of weights to set on instruction I.
+/// \param IsExpected were these weights added from an llvm.expect* intrinsic.
+void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights,
+                      bool IsExpected);
 
 /// Scaling the profile data attached to 'I' using the ratio of S/T.
 void scaleProfData(Instruction &I, uint64_t S, uint64_t T);
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 339a1f1f2f002..0e01080bd75cc 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -8866,7 +8866,8 @@ bool CodeGenPrepare::splitBranchCondition(Function &F, ModifyDT &ModifiedDT) {
         scaleWeights(NewTrueWeight, NewFalseWeight);
         Br1->setMetadata(LLVMContext::MD_prof,
                          MDBuilder(Br1->getContext())
-                             .createBranchWeights(TrueWeight, FalseWeight));
+                             .createBranchWeights(TrueWeight, FalseWeight,
+                                                  hasBranchWeightOrigin(*Br1)));
 
         NewTrueWeight = TrueWeight;
         NewFalseWeight = 2 * FalseWeight;
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 29272e627a1d1..aec927a8cf31d 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1268,12 +1268,23 @@ Instruction *Instruction::cloneImpl() const {
 
 void Instruction::swapProfMetadata() {
   MDNode *ProfileData = getBranchWeightMDNode(*this);
-  if (!ProfileData || ProfileData->getNumOperands() != 3)
+  if (!ProfileData)
+    return;
+  unsigned FirstIdx = getBranchWeightOffset(ProfileData);
+  if (ProfileData->getNumOperands() != 2 + FirstIdx)
     return;
 
-  // The first operand is the name. Fetch them backwards and build a new one.
-  Metadata *Ops[] = {ProfileData->getOperand(0), ProfileData->getOperand(2),
-                     ProfileData->getOperand(1)};
+  unsigned SecondIdx = FirstIdx + 1;
+  SmallVector<Metadata *, 4> Ops;
+  // If there are more weights past the second, we can't swap them
+  if (ProfileData->getNumOperands() > SecondIdx + 1)
+    return;
+  for (unsigned Idx = 0; Idx < FirstIdx; ++Idx) {
+    Ops.push_back(ProfileData->getOperand(Idx));
+  }
+  // Switch the order of the weights
+  Ops.push_back(ProfileData->getOperand(SecondIdx));
+  Ops.push_back(ProfileData->getOperand(FirstIdx));
   setMetadata(LLVMContext::MD_prof,
               MDNode::get(ProfileData->getContext(), Ops));
 }
diff --git a/llvm/lib/IR/Instructions.cpp b/llvm/lib/IR/Instructions.cpp
index 1213f078d05ec..de369bd62a617 100644
--- a/llvm/lib/IR/Instructions.cpp
+++ b/llvm/lib/IR/Instructions.cpp
@@ -5199,7 +5199,11 @@ void SwitchInstProfUpdateWrapper::init() {
   if (!ProfileData)
     return;
 
-  if (ProfileData->getNumOperands() != SI.getNumSuccessors() + 1) {
+  // FIXME: This check belongs in ProfDataUtils. Its almost equivalent to
+  // getValidBranchWeightMDNode(), but the need to use llvm_unreachable
+  // makes them slightly different.
+  if (ProfileData->getNumOperands() !=
+      SI.getNumSuccessors() + getBranchWeightOffset(ProfileData)) {
     llvm_unreachable("number of prof branch_weights metadata operands does "
                      "not correspond to number of succesors");
   }
diff --git a/llvm/lib/IR/MDBuilder.cpp b/llvm/lib/IR/MDBuilder.cpp
index bd68db3a6f961..000027754d13e 100644
--- a/llvm/lib/IR/MDBuilder.cpp
+++ b/llvm/lib/IR/MDBuilder.cpp
@@ -35,8 +35,8 @@ MDNode *MDBuilder::createFPMath(float Accuracy) {
 }
 
 MDNode *MDBuilder::createBranchWeights(uint32_t TrueWeight,
-                                       uint32_t FalseWeight) {
-  return createBranchWeights({TrueWeight, FalseWeight});
+                                       uint32_t FalseWeight, bool IsExpected) {
+  return createBranchWeights({TrueWeight, FalseWeight}, IsExpected);
 }
 
 MDNode *MDBuilder::createLikelyBranchWeights() {
@@ -49,15 +49,19 @@ MDNode *MDBuilder::createUnlikelyBranchWeights() {
   return createBranchWeights(1, (1U << 20) - 1);
 }
 
-MDNode *MDBuilder::createBranchWeights(ArrayRef<uint32_t> Weights) {
+MDNode *MDBuilder::createBranchWeights(ArrayRef<uint32_t> Weights,
+                                       bool IsExpected) {
   assert(Weights.size() >= 1 && "Need at least one branch weights!");
 
-  SmallVector<Metadata *, 4> Vals(Weights.size() + 1);
+  unsigned int Offset = IsExpected ? 2 : 1;
+  SmallVector<Metadata *, 4> Vals(Weights.size() + Offset);
   Vals[0] = createString("branch_weights");
+  if (IsExpected)
+    Vals[1] = createString("expected");
 
   Type *Int32Ty = Type::getInt32Ty(Context);
   for (unsigned i = 0, e = Weights.size(); i != e; ++i)
-    Vals[i + 1] = createConstant(ConstantInt::get(Int32Ty, Weights[i]));
+    Vals[i + Offset] = createConstant(ConstantInt::get(Int32Ty, Weights[i]));
 
   return MDNode::get(Context, Vals);
 }
diff --git a/llvm/lib/IR/Metadata.cpp b/llvm/lib/IR/Metadata.cpp
index b6c932495a145..5f42ce22f72fe 100644
--- a/llvm/lib/IR/Metadata.cpp
+++ b/llvm/lib/IR/Metadata.cpp
@@ -1196,10 +1196,10 @@ MDNode *MDNode::mergeDirectCallProfMetadata(MDNode *A, MDNode *B,
   StringRef AProfName = AMDS->getString();
   StringRef BProfName = BMDS->getString();
   if (AProfName == "branch_weights" && BProfName == "branch_weights") {
-    ConstantInt *AInstrWeight =
-        mdconst::dyn_extract<ConstantInt>(A->getOperand(1));
-    ConstantInt *BInstrWeight =
-        mdconst::dyn_extract<ConstantInt>(B->getOperand(1));
+    ConstantInt *AInstrWeight = mdconst::dyn_extract<ConstantInt>(
+        A->getOperand(getBranchWeightOffset(A)));
+    ConstantInt *BInstrWeight = mdconst::dyn_extract<ConstantInt>(
+        B->getOperand(getBranchWeightOffset(B)));
     assert(AInstrWeight && BInstrWeight && "verified by LLVM verifier");
     return MDNode::get(Ctx,
                        {MDHelper.createString("branch_weights"),
diff --git a/llvm/lib/IR/ProfDataUtils.cpp b/llvm/lib/IR/ProfDataUtils.cpp
index 51e78dc5e6c00..c4b1ed55de8a2 100644
--- a/llvm/lib/IR/ProfDataUtils.cpp
+++ b/llvm/lib/IR/ProfDataUtils.cpp
@@ -40,9 +40,6 @@ namespace {
 // We maintain some constants here to ensure that we access the branch weights
 // correctly, and can change the behavior in the future if the layout changes
 
-// The index at which the weights vector starts
-constexpr unsigned WeightsIdx = 1;
-
 // the minimum number of operands for MD_prof nodes with branch weights
 constexpr unsigned MinBWOps = 3;
 
@@ -75,6 +72,7 @@ static void extractFromBranchWeightMD(const MDNode *ProfileData,
   assert(isBranchWeightMD(ProfileData) && "wrong metadata");
 
   unsigned NOps = ProfileData->getNumOperands();
+  unsigned WeightsIdx = getBranchWeightOffset(ProfileData);
   assert(WeightsIdx < NOps && "Weights Index must be less than NOps.");
   Weights.resize(NOps - WeightsIdx);
 
@@ -82,8 +80,8 @@ static void extractFromBranchWeightMD(const MDNode *ProfileData,
     ConstantInt *Weight =
         mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(Idx));
     assert(Weight && "Malformed branch_weight in MD_prof node");
-    assert(Weight->getValue().getActiveBits() <= 32 &&
-           "Too many bits for uint32_t");
+    assert(Weight->getValue().getActiveBits() <= (sizeof(T) * 8) &&
+           "Too many bits for MD_prof branch_weight");
     Weights[Idx - WeightsIdx] = Weight->getZExtValue();
   }
 }
@@ -123,6 +121,26 @@ bool hasValidBranchWeightMD(const Instruction &I) {
   return getValidBranchWeightMDNode(I);
 }
 
+bool hasBranchWeightOrigin(const Instruction &I) {
+  auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
+  return hasBranchWeightOrigin(ProfileData);
+}
+
+bool hasBranchWeightOrigin(const MDNode *ProfileData) {
+  if (!isBranchWeightMD(ProfileData))
+    return false;
+  auto *ProfDataName = dyn_cast<MDString>(ProfileData->getOperand(1));
+  // NOTE: if we ever have more types of branch weight provenance,
+  // we need to check the string value is "expected". For now, we
+  // supply a more generic API, and avoid the spurious comparisons.
+  assert(ProfDataName == nullptr || ProfDataName->getString() == "expected");
+  return ProfDataName != nullptr;
+}
+
+unsigned getBranchWeightOffset(const MDNode *ProfileData) {
+  return hasBranchWeightOrigin(ProfileData) ? 2 : 1;
+}
+
 MDNode *getBranchWeightMDNode(const Instruction &I) {
   auto *ProfileData = I.getMetadata(LLVMContext::MD_prof);
   if (!isBranchWeightMD(ProfileData))
@@ -132,7 +150,9 @@ MDNode *getBranchWeightMDNode(const Instruction &I) {
 
 MDNode *getValidBranchWeightMDNode(const Instruction &I) {
   auto *ProfileData = getBranchWeightMDNode(I);
-  if (ProfileData && ProfileData->getNumOperands() == 1 + I.getNumSuccessors())
+  auto Offset = getBranchWeightOffset(ProfileData);
+  if (ProfileData &&
+      ProfileData->getNumOperands() == Offset + I.getNumSuccessors())
     return ProfileData;
   return nullptr;
 }
@@ -191,7 +211,8 @@ bool extractProfTotalWeight(const MDNode *ProfileData, uint64_t &TotalVal) {
     return false;
 
   if (ProfDataName->getString() == "branch_weights") {
-    for (unsigned Idx = 1; Idx < ProfileData->getNumOperands(); Idx++) {
+    unsigned Offset = getBranchWeightOffset(ProfileData);
+    for (unsigned Idx = Offset; Idx < ProfileData->getNumOperands(); ++Idx) {
       auto *V = mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(Idx));
       assert(V && "Malformed branch_weight in MD_prof node");
       TotalVal += V->getValue().getZExtValue();
@@ -212,9 +233,10 @@ bool extractProfTotalWeight(const Instruction &I, uint64_t &TotalVal) {
   return extractProfTotalWeight(I.getMetadata(LLVMContext::MD_prof), TotalVal);
 }
 
-void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights) {
+void setBranchWeights(Instruction &I, ArrayRef<uint32_t> Weights,
+                      bool IsExpected) {
   MDBuilder MDB(I.getContext());
-  MDNode *BranchWeights = MDB.createBranchWeights(Weights);
+  MDNode *BranchWeights = MDB.createBranchWeights(Weights, IsExpected);
   I.setMetadata(LLVMContext::MD_prof, BranchWeights);
 }
 
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index e5927203f33a2..fe2253dd04eb3 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -104,6 +104,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IR/ModuleSlotTracker.h"
 #include "llvm/IR/PassManager.h"
+#include "llvm/IR/ProfDataUtils.h"
 #include "llvm/IR/Statepoint.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Use.h"
@@ -4808,8 +4809,10 @@ void Verifier::visitProfMetadata(Instruction &I, MDNode *MD) {
 
   // Check consistency of !prof branch_weights metadata.
   if (ProfName == "branch_weights") {
+    unsigned int Offset = getBranchWeightOffset(MD);
     if (isa<InvokeInst>(&I)) {
-      Check(MD->getNumOperands() == 2 || MD->getNumOperands() == 3,
+      Check(MD->getNumOperands() == (1 + Offset) ||
+                MD->getNumOperands() == (2 + Offset),
             "Wrong number of InvokeInst branch_weights operands", MD);
     } else {
       unsigned ExpectedNumOperands = 0;
@@ -4829,10 +4832,10 @@ void Verifier::visitProfMetadata(Instruction &I, MDNode *MD) {
         CheckFailed("!prof branch_weights are not allowed for this instruction",
                     MD);
 
-      Check(MD->getNumOperands() == 1 + ExpectedNumOperands,
+      Check(MD->getNumOperands() == Offset + ExpectedNumOperands,
             "Wrong number of operands", MD);
     }
-    for (unsigned i = 1; i < MD->getNumOperands(); ++i) {
+    for (unsigned i = Offset; i < MD->getNumOperands(); ++i) {
       auto &MDO = MD->getOperand(i);
       Check(MDO, "second operand should not be null", MD);
       Check(mdconst::dyn_extract<ConstantInt>(MDO),
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 92ad4c34da6e7..7e6a8817b7a67 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -1662,7 +1662,8 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
           else if (OverwriteExistingWeights)
             I.setMetadata(LLVMContext::MD_prof, nullptr);
         } else if (!isa<IntrinsicInst>(&I)) {
-          setBranchWeights(I, {static_cast<uint32_t>(BlockWeights[BB])});
+          setBranchWeights(I, {static_cast<uint32_t>(BlockWeights[BB])},
+                           /*IsExpected=*/false);
         }
       }
     } else if (OverwriteExistingWeights || ProfileSampleBlockAccurate) {
@@ -1673,7 +1674,7 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
           if (cast<CallBase>(I).isIndirectCall()) {
             I.setMetadata(LLVMContext::MD_prof, nullptr);
           } else {
-            setBranchWeights(I, {uint32_t(0)});
+            setBranchWeights(I, {uint32_t(0)}, /*IsExpected=*/false);
           }
         }
       }
@@ -1756,7 +1757,7 @@ void SampleProfileLoader::generateMDProfMetadata(Function &F) {
     if (MaxWeight > 0 &&
         (!TI->extractProfTotalWeight(TempWeight) || OverwriteExistingWeights)) {
       LLVM_DEBUG(dbgs() << "SUCCESS. Found non-zero weights.\n");
-      setBranchWeights(*TI, Weights);
+      setBranchWeights(*TI, Weights, /*IsExpected=*/false);
       ORE->emit([&]() {
         return OptimizationRemark(DEBUG_TYPE, "PopularDest", MaxDestInst)
                << "most popular destination for conditional branches at "
diff --git a/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp b/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
index 0a3d8d6000cf4..731104d4fcef0 100644
--- a/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
+++ b/llvm/lib/Transforms/Instrumentation/ControlHeightReduction.cpp
@@ -1878,7 +1878,7 @@ void CHR::fixupBranchesAndSelects(CHRScope *Scope,
       static_cast<uint32_t>(CHRBranchBias.scale(1000)),
       static_cast<uint32_t>(CHRBranchBias.getCompl().scale(1000)),
   };
-  setBranchWeights(*MergedBR, Weights);
+  setBranchWeights(*MergedBR, Weights, /*IsExpected=*/false);
   CHR_DEBUG(dbgs() << "CHR branch bias " << Weights[0] << ":" << Weights[1]
             << "\n");
 }
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 23a7c6a20aecb..6db76ca78b218 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -259,7 +259,8 @@ CallBase &llvm::pgo::promoteIndirectCall(CallBase &CB, Function *DirectCallee,
       promoteCallWithIfThenElse(CB, DirectCallee, BranchWeights);
 
   if (AttachProfToDirectCall) {
-    setBranchWeights(NewInst, {static_cast<uint32_t>(Count)});
+    setBranchWeights(NewInst, {static_cast<uint32_t>(Count)},
+                     /*IsExpected=*/false);
   }
 
   using namespace ore;
diff --git a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
index 2269c2e0fffae..ac6d3348b3db9 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
@@ -1474,7 +1474,8 @@ void PGOUseFunc::populateCoverage(IndexedInstrProfReader *PGOReader) {
     for (auto *Succ : successors(&BB))
       Weights.push_back((Coverage[Succ] || !Coverage[&BB]) ? 1 : 0);
     if (Weights.size() >= 2)
-      llvm::setBranchWeights(*BB.getTerminator(), Weights);
+      llvm::setBranchWeights(*BB.getTerminator(), Weights,
+                             /*IsExpected=*/false);
   }
 
   unsigned NumCorruptCoverage = 0;
@@ -2260,7 +2261,7 @@ void llvm::setProfMetadata(Module *M, Instruction *TI,
 
   misexpect::checkExpectAnnotations(*TI, Weights, /*IsFrontend=*...
[truncated]

@ilovepi ilovepi marked this pull request as draft June 11, 2024 16:14
@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 11, 2024

Still looking into the codegen issue, but I'd expect there's probably just 1-2 cases where I failed to update a pass to use APIs from ProfDataUtils.

@david-xl
Copy link
Contributor

What is the summary of the new changes that address the compile time issue?

ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Jun 11, 2024
…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 11, 2024

What is the summary of the new changes that address the compile time issue?

I'm still looking at that, that's why I marked it as a draft, but I suspect I just missed updating

APInt Val(128, mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(1))
. I'm trying the test-suite locally now to verify, and will update the patch, once I confirm it no longer modifies .text

@ilovepi
Copy link
Contributor Author

ilovepi commented Jun 12, 2024

So, I think I made a mistake when I started the revert w/ the revert button. That created a very long branch name, and I'm a bit concerned I can't update it easily.

I'm about 80% sure git push [email protected]:llvm/llvm-project.git HEAD:revert-95060-revert-86609-users/ilovepi/spr/llvmir-extend-branchweightmetadata-to-track-provenance-of-weights will work, but I'm hesitant to push things to the main repo that aren't automated w/ tooling.

I'm much more confident in my ability to update using a PR started from my fork, so I'm closing in favor of #95281

@ilovepi ilovepi closed this Jun 12, 2024
nikic pushed a commit to nikic/llvm-project that referenced this pull request Jun 12, 2024
…f weights" llvm#95136

Reverts llvm#95060, and relands llvm#86609, with the unintended code generation changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
ilovepi added a commit that referenced this pull request Jun 12, 2024
#95281)

…f weights" #95136

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof meatdata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations, is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case if ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
@nunoplopes nunoplopes deleted the revert-95060-revert-86609-users/ilovepi/spr/llvmir-extend-branchweightmetadata-to-track-provenance-of-weights branch December 28, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category llvm:ir llvm:transforms PGO Profile Guided Optimizations vectorizers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants