Skip to content

[LLVM][TableGen] Decrease code size of Intrinsic::getAttributes #110573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 1, 2024

Conversation

jurahul
Copy link
Contributor

@jurahul jurahul commented Sep 30, 2024

Decrease code size of Intrinsic::getAttributes function by uniquing the function and argument attributes separately and using the IntrinsicsToAttributesMap to store argument attribute ID in low 8 bits and function attribute ID in upper 8 bits.

This reduces the number of cases to handle in the generated switch from 368 to 131, which is ~2.8x reduction in the number of switch cases.

Also eliminate the fixed size array AS and NumAttrs variable, and instead call AttributeList::get directly from each case, with an inline array of the <index, AttribueSet> pairs.

@jurahul jurahul force-pushed the get_attributes_code_size branch 2 times, most recently from d78ed68 to ec4c7ad Compare September 30, 2024 22:47
@jurahul jurahul marked this pull request as ready for review October 1, 2024 03:03
@llvmbot
Copy link
Member

llvmbot commented Oct 1, 2024

@llvm/pr-subscribers-tablegen

Author: Rahul Joshi (jurahul)

Changes

Decrease code size of Intrinsic::getAttributes function by uniquing the function and argument attributes separatey and using the IntrinsicsToAttributesMap to store argument attribute ID in low 8 bits and function attribute ID in upper 8 bits.

This reduces the number of cases to handle in the generated switch from 368 to 131, which is ~2.8x reduction in the number of switch cases.

Also eliminate the fixed size array AS and NumAttrs variable, and instead call AttributeList::get directly from each case, with an inline array of the <index, AttribueSet> pairs.


Full diff: https://github.com/llvm/llvm-project/pull/110573.diff

2 Files Affected:

  • (modified) llvm/test/TableGen/intrinsic-attrs.td (+11-10)
  • (modified) llvm/utils/TableGen/IntrinsicEmitter.cpp (+80-70)
diff --git a/llvm/test/TableGen/intrinsic-attrs.td b/llvm/test/TableGen/intrinsic-attrs.td
index 3228b32405103e..579b5e8a21b868 100644
--- a/llvm/test/TableGen/intrinsic-attrs.td
+++ b/llvm/test/TableGen/intrinsic-attrs.td
@@ -2,7 +2,6 @@
 
 include "llvm/IR/Intrinsics.td"
 
-// ... this intrinsic.
 def int_random_gen   : Intrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrHasSideEffects]>;
 
 def int_deref_ptr_ret : Intrinsic<[llvm_ptr_ty], [], [Dereferenceable<RetIndex, 16>]>;
@@ -24,14 +23,16 @@ def int_deref_ptr_ret : Intrinsic<[llvm_ptr_ty], [], [Dereferenceable<RetIndex,
 // CHECK-NEXT: });
 
 
-// CHECK: 1, // llvm.deref.ptr.ret
-// CHECK: 2, // llvm.random.gen
+// CHECK: getAttributes(LLVMContext &C, ID id)
+// CHECK: 0 << 8 | 0, // llvm.deref.ptr.ret
+// CHECK: 1 << 8 | 1, // llvm.random.gen
 
 // CHECK: case 1:
-// CHECK-NEXT: AS[0] = {0, getIntrinsicArgAttributeSet(C, 0)};
-// CHECK-NEXT: AS[1] = {AttributeList::FunctionIndex, getIntrinsicFnAttributeSet(C, 0)};
-// CHECK-NEXT: NumAttrs = 2;
-
-// CHECK: case 2:
-// CHECK-NEXT: AS[0] = {AttributeList::FunctionIndex, getIntrinsicFnAttributeSet(C, 1)};
-// CHECK-NEXT: NumAttrs = 1;
+// CHECK-NEXT: return AttributeList::get(C, {
+// CHECK-NEXT:   {AttributeList::FunctionIndex, getIntrinsicFnAttributeSet(C, FnAttrID)}
+// CHECK-NEXT: });
+// CHECK-NEXT: case 0:
+// CHECK-NEXT: return AttributeList::get(C, {
+// CHECK-NEXT:   {0, getIntrinsicArgAttributeSet(C, 0)},
+// CHECK-NEXT:   {AttributeList::FunctionIndex, getIntrinsicFnAttributeSet(C, FnAttrID)}
+// CHECK-NEXT: });
diff --git a/llvm/utils/TableGen/IntrinsicEmitter.cpp b/llvm/utils/TableGen/IntrinsicEmitter.cpp
index efa067e60de439..1899b5e913b72a 100644
--- a/llvm/utils/TableGen/IntrinsicEmitter.cpp
+++ b/llvm/utils/TableGen/IntrinsicEmitter.cpp
@@ -29,6 +29,7 @@
 #include <array>
 #include <cassert>
 #include <cctype>
+#include <limits>
 #include <map>
 #include <optional>
 #include <string>
@@ -379,8 +380,17 @@ static constexpr {} IIT_Table[] = {{
   OS << "#endif\n\n"; // End of GET_INTRINSIC_GENERATOR_GLOBAL
 }
 
+/// Returns the effective MemoryEffects for intrinsic \p Int.
+static MemoryEffects getEffectiveME(const CodeGenIntrinsic &Int) {
+  MemoryEffects ME = Int.ME;
+  // TODO: IntrHasSideEffects should affect not only readnone intrinsics.
+  if (ME.doesNotAccessMemory() && Int.hasSideEffects)
+    ME = MemoryEffects::unknown();
+  return ME;
+}
+
 static bool compareFnAttributes(const CodeGenIntrinsic *L,
-                                const CodeGenIntrinsic *R, bool Default) {
+                                const CodeGenIntrinsic *R) {
   auto TieBoolAttributes = [](const CodeGenIntrinsic *I) -> auto {
     // Sort throwing intrinsics after non-throwing intrinsics.
     return std::tie(I->canThrow, I->isNoDuplicate, I->isNoMerge, I->isNoReturn,
@@ -396,50 +406,46 @@ static bool compareFnAttributes(const CodeGenIntrinsic *L,
     return TieL < TieR;
 
   // Try to order by readonly/readnone attribute.
-  uint32_t LME = L->ME.toIntValue();
-  uint32_t RME = R->ME.toIntValue();
+  uint32_t LME = getEffectiveME(*L).toIntValue();
+  uint32_t RME = getEffectiveME(*R).toIntValue();
   if (LME != RME)
     return LME > RME;
 
-  return Default;
+  return false;
+}
+
+/// Returns true if \p Int has a non-empty set of function attributes. Note that
+/// NoUnwind = !canThrow, so we need to negate it's sense to test if the
+// intrinsic has NoUnwind attribute.
+static bool hasFnAttributes(const CodeGenIntrinsic &Int) {
+  return !Int.canThrow || Int.isNoReturn || Int.isNoCallback || Int.isNoSync ||
+         Int.isNoFree || Int.isWillReturn || Int.isCold || Int.isNoDuplicate ||
+         Int.isNoMerge || Int.isConvergent || Int.isSpeculatable ||
+         Int.isStrictFP || getEffectiveME(Int) != MemoryEffects::unknown();
 }
 
 namespace {
 struct FnAttributeComparator {
   bool operator()(const CodeGenIntrinsic *L, const CodeGenIntrinsic *R) const {
-    return compareFnAttributes(L, R, false);
+    return compareFnAttributes(L, R);
   }
 };
 
 struct AttributeComparator {
   bool operator()(const CodeGenIntrinsic *L, const CodeGenIntrinsic *R) const {
-    // Order by argument attributes if function attributes are equal.
+    // Order all intrinsics with no functiona attributes before all intrinsics
+    // with function attributes.
+    bool HasFnAttrLHS = hasFnAttributes(*L);
+    bool HasFnAttrRHS = hasFnAttributes(*R);
+
+    // Order by argument attributes if function `hasFnAttributes` is equal.
     // This is reliable because each side is already sorted internally.
-    return compareFnAttributes(L, R,
-                               L->ArgumentAttributes < R->ArgumentAttributes);
+    return std::tie(HasFnAttrLHS, L->ArgumentAttributes) <
+           std::tie(HasFnAttrRHS, R->ArgumentAttributes);
   }
 };
 } // End anonymous namespace
 
-/// Returns the effective MemoryEffects for intrinsic \p Int.
-static MemoryEffects getEffectiveME(const CodeGenIntrinsic &Int) {
-  MemoryEffects ME = Int.ME;
-  // TODO: IntrHasSideEffects should affect not only readnone intrinsics.
-  if (ME.doesNotAccessMemory() && Int.hasSideEffects)
-    ME = MemoryEffects::unknown();
-  return ME;
-}
-
-/// Returns true if \p Int has a non-empty set of function attributes. Note that
-/// NoUnwind = !canThrow, so we need to negate it's sense to test if the
-// intrinsic has NoUnwind attribute.
-static bool hasFnAttributes(const CodeGenIntrinsic &Int) {
-  return !Int.canThrow || Int.isNoReturn || Int.isNoCallback || Int.isNoSync ||
-         Int.isNoFree || Int.isWillReturn || Int.isCold || Int.isNoDuplicate ||
-         Int.isNoMerge || Int.isConvergent || Int.isSpeculatable ||
-         Int.isStrictFP || getEffectiveME(Int) != MemoryEffects::unknown();
-}
-
 /// Returns the name of the IR enum for argument attribute kind \p Kind.
 static StringRef getArgAttrEnumName(CodeGenIntrinsic::ArgAttrKind Kind) {
   switch (Kind) {
@@ -576,75 +582,79 @@ static AttributeSet getIntrinsicFnAttributeSet(LLVMContext &C, unsigned ID) {
 AttributeList Intrinsic::getAttributes(LLVMContext &C, ID id) {
 )";
 
-  // Compute the maximum number of attribute arguments and the map.
-  typedef std::map<const CodeGenIntrinsic *, unsigned, AttributeComparator>
-      UniqAttrMapTy;
-  UniqAttrMapTy UniqAttributes;
-  unsigned MaxArgAttrs = 0;
-  unsigned AttrNum = 0;
+  // Compute the maximum number of attribute arguments and the map. For function
+  // attributes, we only consider whether the intrinsics has any function
+  // arguments or not.
+  std::map<const CodeGenIntrinsic *, unsigned, AttributeComparator>
+      UniqAttributes;
   for (const CodeGenIntrinsic &Int : Ints) {
-    MaxArgAttrs =
-        std::max(MaxArgAttrs, unsigned(Int.ArgumentAttributes.size()));
-    unsigned &N = UniqAttributes[&Int];
-    if (N)
-      continue;
-    N = ++AttrNum;
-    assert(N < 65536 && "Too many unique attributes for table!");
+    unsigned ID = UniqAttributes.size();
+    UniqAttributes.try_emplace(&Int, ID);
   }
 
+  // Assign a 16-bit packed ID for each intrinsic. The lower 8-bits will be its
+  // "argument attribute ID" (index in UniqAttributes) and upper 8 bits will be
+  // its "function attribute ID" (index in UniqFnAttributes).
+  assert(UniqAttributes.size() < 256 &&
+         "Too many unique argument attributes for table!");
+  assert(UniqFnAttributes.size() < 256 &&
+         "Too many unique function attributes for table!");
+
   // Emit an array of AttributeList.  Most intrinsics will have at least one
   // entry, for the function itself (index ~1), which is usually nounwind.
   OS << "  static constexpr uint16_t IntrinsicsToAttributesMap[] = {";
-  for (const CodeGenIntrinsic &Int : Ints)
-    OS << formatv("\n    {}, // {}", UniqAttributes[&Int], Int.Name);
+  for (const CodeGenIntrinsic &Int : Ints) {
+    uint16_t FnAttrIndex = hasFnAttributes(Int) ? UniqFnAttributes[&Int] : 0;
+    OS << formatv("\n    {} << 8 | {}, // {}", FnAttrIndex,
+                  UniqAttributes[&Int], Int.Name);
+  }
 
   OS << formatv(R"(
   };
-  std::pair<unsigned, AttributeSet> AS[{}];
-  unsigned NumAttrs = 0;
-  if (id != 0) {{
-    switch(IntrinsicsToAttributesMap[id - 1]) {{
-      default: llvm_unreachable("Invalid attribute number");
-)",
-                MaxArgAttrs + 1);
+  if (id == 0)
+    return AttributeList();
+
+  uint16_t PackedID = IntrinsicsToAttributesMap[id - 1];
+  uint8_t FnAttrID = PackedID >> 8;
+  switch(PackedID & 0xFF) {{
+    default: llvm_unreachable("Invalid attribute number");
+)");
 
   for (const auto [IntPtr, UniqueID] : UniqAttributes) {
-    OS << formatv("    case {}:\n", UniqueID);
+    OS << formatv("  case {}:\n", UniqueID);
     const CodeGenIntrinsic &Int = *IntPtr;
 
     // Keep track of the number of attributes we're writing out.
-    unsigned NumAttrs = 0;
+    unsigned NumAttrs =
+        llvm::count_if(Int.ArgumentAttributes,
+                       [](const auto &Attrs) { return !Attrs.empty(); });
+    NumAttrs += hasFnAttributes(Int);
+    if (NumAttrs == 0) {
+      OS << "    return AttributeList();\n";
+      continue;
+    }
 
+    OS << "    return AttributeList::get(C, {\n";
+    ListSeparator LS(",\n");
     for (const auto &[AttrIdx, Attrs] : enumerate(Int.ArgumentAttributes)) {
       if (Attrs.empty())
         continue;
 
       unsigned ArgAttrID = UniqArgAttributes.find(Attrs)->second;
-      OS << formatv(
-          "      AS[{}] = {{{}, getIntrinsicArgAttributeSet(C, {})};\n",
-          NumAttrs++, AttrIdx, ArgAttrID);
+      OS << LS
+         << formatv("      {{{}, getIntrinsicArgAttributeSet(C, {})}", AttrIdx,
+                    ArgAttrID);
     }
 
     if (hasFnAttributes(Int)) {
-      unsigned FnAttrID = UniqFnAttributes.find(&Int)->second;
-      OS << formatv("      AS[{}] = {{AttributeList::FunctionIndex, "
-                    "getIntrinsicFnAttributeSet(C, {})};\n",
-                    NumAttrs++, FnAttrID);
-    }
-
-    if (NumAttrs) {
-      OS << formatv(R"(      NumAttrs = {};
-      break;
-)",
-                    NumAttrs);
-    } else {
-      OS << "      return AttributeList();\n";
+      OS << LS
+         << "      {AttributeList::FunctionIndex, "
+            "getIntrinsicFnAttributeSet(C, FnAttrID)}";
     }
+    OS << "\n    });\n";
   }
 
-  OS << R"(    }
-  }
-  return AttributeList::get(C, ArrayRef(AS, NumAttrs));
+  OS << R"(  }
 }
 #endif // GET_INTRINSIC_ATTRIBUTES
 

Decrease code size of `Intrinsic::getAttributes` function by
uniquing the function and argument attributes separatey and
using the `IntrinsicsToAttributesMap` to store argument attribute
ID in low 8 bits and function attribute ID in upper 8 bits.

This reduces the number of cases to handle in the generated switch
from 368 to 131, which is ~2.8x reduction in the number of switch
cases.

Also eliminate the fixed size array `AS` and `NumAttrs` variable,
and instead call `AttributeList::get` directly from each case, with
an inline array of the <index, AttribueSet> pairs.
@jurahul jurahul force-pushed the get_attributes_code_size branch from ec4c7ad to 1dd04f3 Compare October 1, 2024 14:19
@jurahul jurahul requested a review from arsenm October 1, 2024 14:19
@jurahul jurahul changed the title [LLVM][TableGen] Decrease code size of getAttributes [LLVM][TableGen] Decrease code size of Intrinsic::getAttributes Oct 1, 2024
@jurahul jurahul merged commit 0de0354 into llvm:main Oct 1, 2024
8 checks passed
@jurahul jurahul deleted the get_attributes_code_size branch October 1, 2024 16:08
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
…vm#110573)

Decrease code size of `Intrinsic::getAttributes` function by uniquing
the function and argument attributes separately and using the
`IntrinsicsToAttributesMap` to store argument attribute ID in low 8 bits
and function attribute ID in upper 8 bits.

This reduces the number of cases to handle in the generated switch from
368 to 131, which is ~2.8x reduction in the number of switch cases.

Also eliminate the fixed size array `AS` and `NumAttrs` variable, and
instead call `AttributeList::get` directly from each case, with an
inline array of the <index, AttribueSet> pairs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants