Skip to content

[KeyInstr] Add fields to DILocation behind compile time flag #133477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 1, 2025

Conversation

OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Mar 28, 2025

The definition EXPERIMENTAL_KEY_INSTRUCTIONS is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless
the CMake flag is enabled.

Copy link
Contributor Author

OCHyams commented Mar 28, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link

github-actions bot commented Mar 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Orlando Cazalet-Hyams (OCHyams)

Changes

The definition EXPERIMENTAL_KEY_INSTRUCTIONS is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless
the CMake flag is enabled.


Full diff: https://github.com/llvm/llvm-project/pull/133477.diff

9 Files Affected:

  • (modified) llvm/include/llvm/IR/DebugInfoMetadata.h (+35-11)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+6-4)
  • (modified) llvm/lib/IR/AsmWriter.cpp (+2)
  • (modified) llvm/lib/IR/DebugInfoMetadata.cpp (+17-7)
  • (modified) llvm/lib/IR/LLVMContextImpl.h (+12-5)
  • (modified) llvm/test/CMakeLists.txt (+1)
  • (added) llvm/test/DebugInfo/KeyInstructions/Generic/parse.ll (+19)
  • (added) llvm/test/DebugInfo/KeyInstructions/lit.local.cfg (+2)
  • (modified) llvm/test/lit.site.cfg.py.in (+1)
diff --git a/llvm/include/llvm/IR/DebugInfoMetadata.h b/llvm/include/llvm/IR/DebugInfoMetadata.h
index 62a59ddaee599..ede3bda5c68f4 100644
--- a/llvm/include/llvm/IR/DebugInfoMetadata.h
+++ b/llvm/include/llvm/IR/DebugInfoMetadata.h
@@ -2099,44 +2099,67 @@ class DISubprogram : public DILocalScope {
 class DILocation : public MDNode {
   friend class LLVMContextImpl;
   friend class MDNode;
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+  uint64_t AtomGroup : 61;
+  uint8_t AtomRank : 3;
+#endif
 
   DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
-             unsigned Column, ArrayRef<Metadata *> MDs, bool ImplicitCode);
+             unsigned Column, uint64_t AtomGroup, uint8_t AtomRank,
+             ArrayRef<Metadata *> MDs, bool ImplicitCode);
   ~DILocation() { dropAllReferences(); }
 
   static DILocation *getImpl(LLVMContext &Context, unsigned Line,
                              unsigned Column, Metadata *Scope,
                              Metadata *InlinedAt, bool ImplicitCode,
+                             uint64_t AtomGroup, uint8_t AtomRank,
                              StorageType Storage, bool ShouldCreate = true);
   static DILocation *getImpl(LLVMContext &Context, unsigned Line,
                              unsigned Column, DILocalScope *Scope,
                              DILocation *InlinedAt, bool ImplicitCode,
+                             uint64_t AtomGroup, uint8_t AtomRank,
                              StorageType Storage, bool ShouldCreate = true) {
     return getImpl(Context, Line, Column, static_cast<Metadata *>(Scope),
-                   static_cast<Metadata *>(InlinedAt), ImplicitCode, Storage,
-                   ShouldCreate);
+                   static_cast<Metadata *>(InlinedAt), ImplicitCode, AtomGroup,
+                   AtomRank, Storage, ShouldCreate);
   }
 
   TempDILocation cloneImpl() const {
     // Get the raw scope/inlinedAt since it is possible to invoke this on
     // a DILocation containing temporary metadata.
     return getTemporary(getContext(), getLine(), getColumn(), getRawScope(),
-                        getRawInlinedAt(), isImplicitCode());
+                        getRawInlinedAt(), isImplicitCode(), getAtomGroup(),
+                        getAtomRank());
   }
-
 public:
+  uint64_t getAtomGroup() const {
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+    return AtomGroup;
+#endif
+    return 0;
+  }
+  uint8_t getAtomRank() const {
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+    return AtomRank;
+#endif
+    return 0;
+  }
+
   // Disallow replacing operands.
   void replaceOperandWith(unsigned I, Metadata *New) = delete;
 
   DEFINE_MDNODE_GET(DILocation,
                     (unsigned Line, unsigned Column, Metadata *Scope,
-                     Metadata *InlinedAt = nullptr, bool ImplicitCode = false),
-                    (Line, Column, Scope, InlinedAt, ImplicitCode))
+                     Metadata *InlinedAt = nullptr, bool ImplicitCode = false,
+                     uint64_t AtomGroup = 0, uint8_t AtomRank = 0),
+                    (Line, Column, Scope, InlinedAt, ImplicitCode, AtomGroup,
+                     AtomRank))
   DEFINE_MDNODE_GET(DILocation,
                     (unsigned Line, unsigned Column, DILocalScope *Scope,
-                     DILocation *InlinedAt = nullptr,
-                     bool ImplicitCode = false),
-                    (Line, Column, Scope, InlinedAt, ImplicitCode))
+                     DILocation *InlinedAt = nullptr, bool ImplicitCode = false,
+                     uint64_t AtomGroup = 0, uint8_t AtomRank = 0),
+                    (Line, Column, Scope, InlinedAt, ImplicitCode, AtomGroup,
+                     AtomRank))
 
   /// Return a (temporary) clone of this.
   TempDILocation clone() const { return cloneImpl(); }
@@ -2511,7 +2534,8 @@ DILocation::cloneWithDiscriminator(unsigned Discriminator) const {
   DILexicalBlockFile *NewScope =
       DILexicalBlockFile::get(getContext(), Scope, getFile(), Discriminator);
   return DILocation::get(getContext(), getLine(), getColumn(), NewScope,
-                         getInlinedAt());
+                         getInlinedAt(), isImplicitCode(), getAtomGroup(),
+                         getAtomRank());
 }
 
 unsigned DILocation::getBaseDiscriminator() const {
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index 960119bab0933..5b6b057d34620 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -5312,13 +5312,15 @@ bool LLParser::parseDILocation(MDNode *&Result, bool IsDistinct) {
   OPTIONAL(column, ColumnField, );                                             \
   REQUIRED(scope, MDField, (/* AllowNull */ false));                           \
   OPTIONAL(inlinedAt, MDField, );                                              \
-  OPTIONAL(isImplicitCode, MDBoolField, (false));
+  OPTIONAL(isImplicitCode, MDBoolField, (false));                              \
+  OPTIONAL(atomGroup, MDUnsignedField, (0, UINT64_MAX));                       \
+  OPTIONAL(atomRank, MDUnsignedField, (0, UINT8_MAX));
   PARSE_MD_FIELDS();
 #undef VISIT_MD_FIELDS
 
-  Result =
-      GET_OR_DISTINCT(DILocation, (Context, line.Val, column.Val, scope.Val,
-                                   inlinedAt.Val, isImplicitCode.Val));
+  Result = GET_OR_DISTINCT(
+      DILocation, (Context, line.Val, column.Val, scope.Val, inlinedAt.Val,
+                   isImplicitCode.Val, atomGroup.Val, atomRank.Val));
   return false;
 }
 
diff --git a/llvm/lib/IR/AsmWriter.cpp b/llvm/lib/IR/AsmWriter.cpp
index 79547b299a903..c3f0ba230cee7 100644
--- a/llvm/lib/IR/AsmWriter.cpp
+++ b/llvm/lib/IR/AsmWriter.cpp
@@ -2073,6 +2073,8 @@ static void writeDILocation(raw_ostream &Out, const DILocation *DL,
   Printer.printMetadata("inlinedAt", DL->getRawInlinedAt());
   Printer.printBool("isImplicitCode", DL->isImplicitCode(),
                     /* Default */ false);
+  Printer.printInt("atomGroup", DL->getAtomGroup());
+  Printer.printInt<unsigned>("atomRank", DL->getAtomRank());
   Out << ")";
 }
 
diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index ae3d79fc17a59..f4f9fca38945c 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -56,12 +56,19 @@ DebugVariableAggregate::DebugVariableAggregate(const DbgVariableIntrinsic *DVI)
                     DVI->getDebugLoc()->getInlinedAt()) {}
 
 DILocation::DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
-                       unsigned Column, ArrayRef<Metadata *> MDs,
-                       bool ImplicitCode)
-    : MDNode(C, DILocationKind, Storage, MDs) {
+                       unsigned Column, uint64_t AtomGroup, uint8_t AtomRank,
+                       ArrayRef<Metadata *> MDs, bool ImplicitCode)
+    : MDNode(C, DILocationKind, Storage, MDs)
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+      ,
+      AtomGroup(AtomGroup), AtomRank(AtomRank)
+#endif
+{
+#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
+  assert(AtomRank <= 7 && "AtomRank number should fit in 3 bits");
+#endif
   assert((MDs.size() == 1 || MDs.size() == 2) &&
          "Expected a scope and optional inlined-at");
-
   // Set line and column.
   assert(Column < (1u << 16) && "Expected 16-bit column");
 
@@ -80,6 +87,7 @@ static void adjustColumn(unsigned &Column) {
 DILocation *DILocation::getImpl(LLVMContext &Context, unsigned Line,
                                 unsigned Column, Metadata *Scope,
                                 Metadata *InlinedAt, bool ImplicitCode,
+                                uint64_t AtomGroup, uint8_t AtomRank,
                                 StorageType Storage, bool ShouldCreate) {
   // Fixup column.
   adjustColumn(Column);
@@ -87,7 +95,8 @@ DILocation *DILocation::getImpl(LLVMContext &Context, unsigned Line,
   if (Storage == Uniqued) {
     if (auto *N = getUniqued(Context.pImpl->DILocations,
                              DILocationInfo::KeyTy(Line, Column, Scope,
-                                                   InlinedAt, ImplicitCode)))
+                                                   InlinedAt, ImplicitCode,
+                                                   AtomGroup, AtomRank)))
       return N;
     if (!ShouldCreate)
       return nullptr;
@@ -99,8 +108,9 @@ DILocation *DILocation::getImpl(LLVMContext &Context, unsigned Line,
   Ops.push_back(Scope);
   if (InlinedAt)
     Ops.push_back(InlinedAt);
-  return storeImpl(new (Ops.size(), Storage) DILocation(
-                       Context, Storage, Line, Column, Ops, ImplicitCode),
+  return storeImpl(new (Ops.size(), Storage)
+                       DILocation(Context, Storage, Line, Column, AtomGroup,
+                                  AtomRank, Ops, ImplicitCode),
                    Storage, Context.pImpl->DILocations);
 }
 
diff --git a/llvm/lib/IR/LLVMContextImpl.h b/llvm/lib/IR/LLVMContextImpl.h
index a18cf6f205623..99f0d8837de52 100644
--- a/llvm/lib/IR/LLVMContextImpl.h
+++ b/llvm/lib/IR/LLVMContextImpl.h
@@ -319,23 +319,30 @@ template <> struct MDNodeKeyImpl<DILocation> {
   Metadata *Scope;
   Metadata *InlinedAt;
   bool ImplicitCode;
+  uint64_t AtomGroup : 61;
+  uint8_t AtomRank : 3;
 
   MDNodeKeyImpl(unsigned Line, unsigned Column, Metadata *Scope,
-                Metadata *InlinedAt, bool ImplicitCode)
+                Metadata *InlinedAt, bool ImplicitCode, uint64_t AtomGroup,
+                uint8_t AtomRank)
       : Line(Line), Column(Column), Scope(Scope), InlinedAt(InlinedAt),
-        ImplicitCode(ImplicitCode) {}
+        ImplicitCode(ImplicitCode), AtomGroup(AtomGroup), AtomRank(AtomRank) {}
+
   MDNodeKeyImpl(const DILocation *L)
       : Line(L->getLine()), Column(L->getColumn()), Scope(L->getRawScope()),
-        InlinedAt(L->getRawInlinedAt()), ImplicitCode(L->isImplicitCode()) {}
+        InlinedAt(L->getRawInlinedAt()), ImplicitCode(L->isImplicitCode()),
+        AtomGroup(L->getAtomGroup()), AtomRank(L->getAtomRank()) {}
 
   bool isKeyOf(const DILocation *RHS) const {
     return Line == RHS->getLine() && Column == RHS->getColumn() &&
            Scope == RHS->getRawScope() && InlinedAt == RHS->getRawInlinedAt() &&
-           ImplicitCode == RHS->isImplicitCode();
+           ImplicitCode == RHS->isImplicitCode() &&
+           AtomGroup == RHS->getAtomGroup() && AtomRank == RHS->getAtomRank();
   }
 
   unsigned getHashValue() const {
-    return hash_combine(Line, Column, Scope, InlinedAt, ImplicitCode);
+    return hash_combine(Line, Column, Scope, InlinedAt, ImplicitCode, AtomGroup,
+                        AtomRank);
   }
 };
 
diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index d984193875fa2..abfbcebc746fa 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -29,6 +29,7 @@ llvm_canonicalize_cmake_booleans(
   LLVM_INCLUDE_SPIRV_TOOLS_TESTS
   LLVM_APPEND_VC_REV
   LLVM_HAS_LOGF128
+  LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS
   )
 
 configure_lit_site_cfg(
diff --git a/llvm/test/DebugInfo/KeyInstructions/Generic/parse.ll b/llvm/test/DebugInfo/KeyInstructions/Generic/parse.ll
new file mode 100644
index 0000000000000..33219a582fee8
--- /dev/null
+++ b/llvm/test/DebugInfo/KeyInstructions/Generic/parse.ll
@@ -0,0 +1,19 @@
+; RUN: opt %s -o - -S| FileCheck %s
+
+; CHECK: !DILocation(line: 1, column: 11, scope: ![[#]], atomGroup: 1, atomRank: 1)
+
+define dso_local void @f() !dbg !10 {
+entry:
+  ret void, !dbg !13
+}
+
+!llvm.module.flags = !{!3}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 21.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.cpp", directory: "/")
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!9 = !{!"clang version 21.0.0git"}
+!10 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 1, type: !11, scopeLine: 1, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null}
+!13 = !DILocation(line: 1, column: 11, scope: !10, atomGroup: 1, atomRank: 1)
diff --git a/llvm/test/DebugInfo/KeyInstructions/lit.local.cfg b/llvm/test/DebugInfo/KeyInstructions/lit.local.cfg
new file mode 100644
index 0000000000000..482bd5c8ac251
--- /dev/null
+++ b/llvm/test/DebugInfo/KeyInstructions/lit.local.cfg
@@ -0,0 +1,2 @@
+if not config.has_key_instructions:
+    config.unsupported = True
diff --git a/llvm/test/lit.site.cfg.py.in b/llvm/test/lit.site.cfg.py.in
index 0d02920323d2d..caee6c1db92ee 100644
--- a/llvm/test/lit.site.cfg.py.in
+++ b/llvm/test/lit.site.cfg.py.in
@@ -65,6 +65,7 @@ config.spirv_tools_tests = @LLVM_INCLUDE_SPIRV_TOOLS_TESTS@
 config.have_vc_rev = @LLVM_APPEND_VC_REV@
 config.force_vc_rev = "@LLVM_FORCE_VC_REVISION@"
 config.has_logf128 = @LLVM_HAS_LOGF128@
+config.has_key_instructions = @LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS@
 
 import lit.llvm
 lit.llvm.initialize(lit_config, config)

@OCHyams OCHyams marked this pull request as ready for review March 28, 2025 17:59
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request Apr 3, 2025
This controls the work-in-progress feature Key Instructions. Once it's no longer
a WIP we can look at making this a Clang option.

The LLVM patch stack starts at llvm#133477.

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.

The Key Instructions feature is discussed here:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The Clang-side work is demoed here:
llvm#130943
@OCHyams OCHyams deleted the users/OCHyams/ki-llvm-fields branch May 1, 2025 14:40
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 1, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/12300

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x55c6c704cb10 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x55c6c704cb10 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

@nikic
Copy link
Contributor

nikic commented May 1, 2025

It looks like this has a large compile-time impact even with the option disabled: https://llvm-compile-time-tracker.com/compare.php?from=d33c6764680ed78ffe824a83b6a33c0b609bafce&to=0c7c82af230bd525bb96a54ca9480797b5fa2a42&stat=instructions:u For some reason it's particularly bad for the stage2 -O0 builds with a 4.5% regression on sqlite3 (only 1% for stage1).

@@ -2235,44 +2235,70 @@ class DISubprogram : public DILocalScope {
class DILocation : public MDNode {
friend class LLVMContextImpl;
friend class MDNode;
#ifdef EXPERIMENTAL_KEY_INSTRUCTIONS
uint64_t AtomGroup : 61;
uint8_t AtomRank : 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to use uint64_t for both to get actual bit packing with msvc (https://c.godbolt.org/z/1f556c1zb).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that unconditionally including these in the MDNode key may be the problem.

@llvm-ci
Copy link
Collaborator

llvm-ci commented May 2, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2827

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-7508-90-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


@OCHyams
Copy link
Contributor Author

OCHyams commented May 2, 2025

It looks like this has a large compile-time impact even with the option disabled: https://llvm-compile-time-tracker.com/compare.php?from=d33c6764680ed78ffe824a83b6a33c0b609bafce&to=0c7c82af230bd525bb96a54ca9480797b5fa2a42&stat=instructions:u For some reason it's particularly bad for the stage2 -O0 builds with a 4.5% regression on sqlite3 (only 1% for stage1).

Thanks @nikic, that is unexpected, I'll look into this today

OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 2, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by @nikic
https://llvm-compile-time-tracker.com/compare.php?from=d33c6764680ed78ffe824a83b6a33c0b609bafce&to=0c7c82af230bd525bb96a54ca9480797b5fa2a42&stat=instructions:u

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly - but I'm not sure if we can get around
that for now?
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 2, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by @nikic
https://llvm-compile-time-tracker.com/compare.php?from=d33c6764680ed78ffe824a83b6a33c0b609bafce&to=0c7c82af230bd525bb96a54ca9480797b5fa2a42&stat=instructions:u

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly - but I'm not sure if we can get around
that for now?
OCHyams added a commit that referenced this pull request May 6, 2025
…138292)

Follow up to #133477.

As nikic pointed out: We need to use uint64_t for both fields to get
actual bit packing with msvc (https://c.godbolt.org/z/1f556c1zb).

The cast to u8 in hash_combine prevents an increase in compile time
(+0.16% for stage1-O0-g on compile-time-tracker).
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by @nikic
https://llvm-compile-time-tracker.com/compare.php?from=d33c6764680ed78ffe824a83b6a33c0b609bafce&to=0c7c82af230bd525bb96a54ca9480797b5fa2a42&stat=instructions:u

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly - but I'm not sure if we can get around
that for now?
OCHyams added a commit that referenced this pull request May 6, 2025
Follow up to #133477.

This prevents a compile time regression pointed out by nikic

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly, but will eventually be removed.
@OCHyams
Copy link
Contributor Author

OCHyams commented May 6, 2025

Fixed the compile-time issue:
https://llvm-compile-time-tracker.com/compare.php?from=562a4559ee9b637714485169102f7a2600aab008&to=6d85de88c426a40b8e5d3cfcb0ed6564c7df12be&stat=instructions%3Au (though for transparency it looks like there's +0.08 to stage2-O0-g that's unaccounted for between the commits)

OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
…lvm#138292)

Follow up to llvm#133477.

As nikic pointed out: We need to use uint64_t for both fields to get
actual bit packing with msvc (https://c.godbolt.org/z/1f556c1zb).

The cast to u8 in hash_combine prevents an increase in compile time
(+0.16% for stage1-O0-g on compile-time-tracker).
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by nikic

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly, but will eventually be removed.
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
…lvm#138292)

Follow up to llvm#133477.

As nikic pointed out: We need to use uint64_t for both fields to get
actual bit packing with msvc (https://c.godbolt.org/z/1f556c1zb).

The cast to u8 in hash_combine prevents an increase in compile time
(+0.16% for stage1-O0-g on compile-time-tracker).
OCHyams added a commit to OCHyams/llvm-project that referenced this pull request May 6, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by nikic

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly, but will eventually be removed.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 7, 2025
…296)

Follow up to llvm/llvm-project#133477.

This prevents a compile time regression pointed out by nikic

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly, but will eventually be removed.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
…lvm#138292)

Follow up to llvm#133477.

As nikic pointed out: We need to use uint64_t for both fields to get
actual bit packing with msvc (https://c.godbolt.org/z/1f556c1zb).

The cast to u8 in hash_combine prevents an increase in compile time
(+0.16% for stage1-O0-g on compile-time-tracker).
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Follow up to llvm#133477.

This prevents a compile time regression pointed out by nikic

The additional checks in the methods seem to cause most of the regression
(rather than being a consequence of increased size due to the fields
themselves).

The additional ifdefs are somewhat ugly, but will eventually be removed.
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 9, 2025

LLVM Buildbot has detected a new failure on builder clang-s390x-linux-multistage running on systemz-1 while building llvm at step 11 "ninja check 2".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/98/builds/1371

Here is the relevant piece of the build log for the reference
Step 11 (ninja check 2) failure: stage 2 checked (failure)
******************** TEST 'libFuzzer-s390x-default-Linux :: fuzzer-timeout.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest # RUN: at line 1
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest # RUN: at line 2
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest # RUN: at line 3
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/hi.txt 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest # RUN: at line 12
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/hi.txt
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0 # RUN: at line 16
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-multistage/stage2/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 267368897
INFO: Loaded 1 modules   (13 inline 8-bit counters): 13 [0x2aa38351e80, 0x2aa38351e8d), 
INFO: Loaded 1 PC tables (13 PCs): 13 [0x2aa38351e90,0x2aa38351f60), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2	INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 31Mb
#333	NEW    cov: 3 ft: 3 corp: 2/3b lim: 6 exec/s: 0 rss: 31Mb L: 2/2 MS: 1 InsertByte-
#416	NEW    cov: 4 ft: 4 corp: 3/4b lim: 6 exec/s: 0 rss: 32Mb L: 1/2 MS: 3 ChangeByte-CopyPart-EraseBytes-
#1505	NEW    cov: 5 ft: 5 corp: 4/14b lim: 14 exec/s: 0 rss: 32Mb L: 10/10 MS: 4 ShuffleBytes-CrossOver-CopyPart-InsertRepeatedBytes-
#1589	REDUCE cov: 5 ft: 5 corp: 4/11b lim: 14 exec/s: 0 rss: 32Mb L: 7/7 MS: 4 InsertByte-ChangeByte-CrossOver-EraseBytes-
#1650	REDUCE cov: 5 ft: 5 corp: 4/9b lim: 14 exec/s: 0 rss: 32Mb L: 5/5 MS: 1 EraseBytes-
#1683	NEW    cov: 6 ft: 6 corp: 5/11b lim: 14 exec/s: 0 rss: 32Mb L: 2/5 MS: 3 CrossOver-EraseBytes-EraseBytes-
#1709	REDUCE cov: 6 ft: 6 corp: 5/9b lim: 14 exec/s: 0 rss: 32Mb L: 3/3 MS: 1 EraseBytes-
ALARM: working on the last Unit for 1 seconds
       and the timeout value is 1 (use -timeout=N to change)
MS: 1 InsertByte-; base unit: c74064c97231114b678d2204fde8965f65080db6
0x48,0x69,0x21,0x69,
Hi!i
artifact_prefix='./'; Test unit written to ./timeout-c07078879f59203eeb77b1e2390b60cde5634ce6
Base64: SGkhaQ==
==3575620== ERROR: libFuzzer: timeout after 1 seconds
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer: CHECK failed: asan_report.cpp:227 "((current_error_.kind)) == ((kErrorKindInvalid))" (0x1, 0x0) (tid=3575620)
    <empty stack>

MS: 1 InsertByte-; base unit: c74064c97231114b678d2204fde8965f65080db6
0x48,0x69,0x21,0x69,
Hi!i
artifact_prefix='./'; Test unit written to ./crash-c07078879f59203eeb77b1e2390b60cde5634ce6
...

Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025
…3477)

Add AtomGroup and AtomRank to DILocation behind compile time flag
EXPERIMENTAL_KEY_INSTRUCTIONS which is controlled by cmake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS.

Add IR read-write roundtrip test in a directory that is unsupported unless the
CMake flag is enabled.

RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants