forked from realtime-sanitizer/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 0
Proposal for first review - Just the "back end" of the sanitizer #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…/llvm-project into radsan
cjappl
pushed a commit
that referenced
this pull request
Jul 18, 2024
This test is currently flaky on a local Windows amd64 build. The reason is that it relies on the order of `process.threads` but this order is nondeterministic: If we print lldb's inputs and outputs while running, we can see that the breakpoints are always being set correctly, and always being hit: ```sh runCmd: breakpoint set -f "main.c" -l 2 output: Breakpoint 1: where = a.out`func_inner + 1 at main.c:2:9, address = 0x0000000140001001 runCmd: breakpoint set -f "main.c" -l 7 output: Breakpoint 2: where = a.out`main + 17 at main.c:7:5, address = 0x0000000140001021 runCmd: run output: Process 52328 launched: 'C:\workspace\llvm-project\llvm\build\lldb-test-build.noindex\functionalities\unwind\zeroth_frame\TestZerothFrame.test_dwarf\a.out' (x86_64) Process 52328 stopped * thread #1, stop reason = breakpoint 1.1 frame #0: 0x00007ff68f6b1001 a.out`func_inner at main.c:2:9 1 void func_inner() { -> 2 int a = 1; // Set breakpoint 1 here ^ 3 } 4 5 int main() { 6 func_inner(); 7 return 0; // Set breakpoint 2 here ``` However, sometimes the backtrace printed in this test shows that the process is stopped inside NtWaitForWorkViaWorkerFactory from `ntdll.dll`: ```sh Backtrace at the first breakpoint: frame #0: 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 frame #1: 0x00007ffecc74585e ntdll.dll`RtlClearThreadWorkOnBehalfTicket + 862 frame #2: 0x00007ffecc3e257d kernel32.dll`BaseThreadInitThunk + 29 frame #3: 0x00007ffecc76af28 ntdll.dll`RtlUserThreadStart + 40 ``` When this happens, the test fails with an assertion error that the stopped thread's zeroth frame's current line number does not match the expected line number. This is because the test is looking at the wrong thread: `process.threads[0]`. If we print the list of threads each time the test is run, we notice that threads are sometimes in a different order, within `process.threads`: ```sh Thread 0: thread #4: tid = 0x9c38, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 1: thread #2: tid = 0xa950, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 2: thread #1: tid = 0xab18, 0x00007ff64bc81001 a.out`func_inner at main.c:2:9, stop reason = breakpoint 1.1 Thread 3: thread #3: tid = 0xc514, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 0: thread #3: tid = 0x018c, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 1: thread #1: tid = 0x85c8, 0x00007ff7130c1001 a.out`func_inner at main.c:2:9, stop reason = breakpoint 1.1 Thread 2: thread #2: tid = 0xf344, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 Thread 3: thread #4: tid = 0x6a50, 0x00007ffecc7b3bf4 ntdll.dll`NtWaitForWorkViaWorkerFactory + 20 ``` Use `self.thread()` to consistently select the correct thread, instead. Co-authored-by: kendal <[email protected]>
cjappl
pushed a commit
that referenced
this pull request
Jul 18, 2024
…izations of function templates to USRGenerator (llvm#98027) Given the following: ``` template<typename T> struct A { void f(int); // #1 template<typename U> void f(U); // #2 template<> void f<int>(int); // #3 }; ``` Clang will generate the same USR for `#1` and `#2`. This patch fixes the issue by including the template arguments of dependent class scope explicit specializations in their USRs.
cjappl
pushed a commit
that referenced
this pull request
Jul 18, 2024
This patch adds a frame recognizer for Clang's `__builtin_verbose_trap`, which behaves like a `__builtin_trap`, but emits a failure-reason string into debug-info in order for debuggers to display it to a user. The frame recognizer triggers when we encounter a frame with a function name that begins with `__clang_trap_msg`, which is the magic prefix Clang emits into debug-info for verbose traps. Once such frame is encountered we display the frame function name as the `Stop Reason` and display that frame to the user. Example output: ``` (lldb) run warning: a.out was compiled with optimization - stepping may behave oddly; variables may not be available. Process 35942 launched: 'a.out' (arm64) Process 35942 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented frame #1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt] 1 struct Dummy { 2 void func() { -> 3 __builtin_verbose_trap("Misc.", "Function is not implemented"); 4 } 5 }; 6 7 int main() { (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = Misc.: Function is not implemented frame #0: 0x0000000100003fa4 a.out`main [inlined] __clang_trap_msg$Misc.$Function is not implemented$ at verbose_trap.cpp:0 [opt] * frame #1: 0x0000000100003fa4 a.out`main [inlined] Dummy::func(this=<unavailable>) at verbose_trap.cpp:3:5 [opt] frame #2: 0x0000000100003fa4 a.out`main at verbose_trap.cpp:8:13 [opt] frame #3: 0x0000000189d518b4 dyld`start + 1988 ```
cjappl
pushed a commit
that referenced
this pull request
Jul 25, 2024
…linux (llvm#99613) Examples of the output: ARM: ``` # ./a.out AddressSanitizer:DEADLYSIGNAL ================================================================= ==122==ERROR: AddressSanitizer: SEGV on unknown address 0x0000007a (pc 0x76e13ac0 bp 0x7eb7fd00 sp 0x7eb7fcc8 T0) ==122==The signal is caused by a READ memory access. ==122==Hint: address points to the zero page. #0 0x76e13ac0 (/lib/libc.so.6+0x7cac0) #1 0x76dce680 in gsignal (/lib/libc.so.6+0x37680) #2 0x005c2250 (/root/a.out+0x145250) #3 0x76db982c (/lib/libc.so.6+0x2282c) #4 0x76db9918 in __libc_start_main (/lib/libc.so.6+0x22918) ==122==Register values: r0 = 0x00000000 r1 = 0x0000007a r2 = 0x0000000b r3 = 0x76d95020 r4 = 0x0000007a r5 = 0x00000001 r6 = 0x005dcc5c r7 = 0x0000010c r8 = 0x0000000b r9 = 0x76f9ece0 r10 = 0x00000000 r11 = 0x7eb7fd00 r12 = 0x76dce670 sp = 0x7eb7fcc8 lr = 0x76e13ab4 pc = 0x76e13ac0 AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (/lib/libc.so.6+0x7cac0) ==122==ABORTING ``` AArch64: ``` # ./a.out UndefinedBehaviorSanitizer:DEADLYSIGNAL ==99==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000063 (pc 0x007fbbbc5860 bp 0x007fcfdcb700 sp 0x007fcfdcb700 T99) ==99==The signal is caused by a UNKNOWN memory access. ==99==Hint: address points to the zero page. #0 0x007fbbbc5860 (/lib64/libc.so.6+0x82860) #1 0x007fbbb81578 (/lib64/libc.so.6+0x3e578) #2 0x00556051152c (/root/a.out+0x3152c) #3 0x007fbbb6e268 (/lib64/libc.so.6+0x2b268) #4 0x007fbbb6e344 (/lib64/libc.so.6+0x2b344) #5 0x0055604e45ec (/root/a.out+0x45ec) ==99==Register values: x0 = 0x0000000000000000 x1 = 0x0000000000000063 x2 = 0x000000000000000b x3 = 0x0000007fbbb41440 x4 = 0x0000007fbbb41580 x5 = 0x3669288942d44cce x6 = 0x0000000000000000 x7 = 0x00000055605110b0 x8 = 0x0000000000000083 x9 = 0x0000000000000000 x10 = 0x0000000000000000 x11 = 0x0000000000000000 x12 = 0x0000007fbbdb3360 x13 = 0x0000000000010000 x14 = 0x0000000000000039 x15 = 0x00000000004113a0 x16 = 0x0000007fbbb81560 x17 = 0x0000005560540138 x18 = 0x000000006474e552 x19 = 0x0000000000000063 x20 = 0x0000000000000001 x21 = 0x000000000000000b x22 = 0x0000005560511510 x23 = 0x0000007fcfdcb918 x24 = 0x0000007fbbdb1b50 x25 = 0x0000000000000000 x26 = 0x0000007fbbdb2000 x27 = 0x000000556053f858 x28 = 0x0000000000000000 fp = 0x0000007fcfdcb700 lr = 0x0000007fbbbc584c sp = 0x0000007fcfdcb700 UndefinedBehaviorSanitizer can not provide additional info. SUMMARY: UndefinedBehaviorSanitizer: SEGV (/lib64/libc.so.6+0x82860) ==99==ABORTING ```
cjappl
pushed a commit
that referenced
this pull request
Aug 9, 2024
``` UBSan-Standalone-sparc :: TestCases/Misc/Linux/diag-stacktrace.cpp ``` `FAIL`s on 32 and 64-bit Linux/sparc64 (and on Solaris/sparcv9, too: the test isn't Linux-specific at all). With `UBSAN_OPTIONS=fast_unwind_on_fatal=1`, the stack trace shows a duplicate innermost frame: ``` compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:31: runtime error: execution reached the end of a value-returning function without returning a value #0 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 #1 0x7003a708 in f() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:14:35 #2 0x7003a714 in g() compiler-rt/test/ubsan/TestCases/Misc/Linux/diag-stacktrace.cpp:17:38 ``` which isn't seen with `fast_unwind_on_fatal=0`. This turns out to be another fallout from fixing `__builtin_return_address`/`__builtin_extract_return_addr` on SPARC. In `sanitizer_stacktrace_sparc.cpp` (`BufferedStackTrace::UnwindFast`) the `pc` arg is the return address, while `pc1` from the stack frame (`fr_savpc`) is the address of the `call` insn, leading to a double entry for the innermost frame in `trace_buffer[]`. This patch fixes this by moving the adjustment before all uses. Tested on `sparc64-unknown-linux-gnu` and `sparcv9-sun-solaris2.11` (with the `ubsan/TestCases/Misc/Linux` tests enabled).
cjappl
pushed a commit
that referenced
this pull request
Aug 20, 2024
…lvm#104148) `hasOperands` does not always execute matchers in the order they are written. This can cause issue in code using bindings when one operand matcher is relying on a binding set by the other. With this change, the first matcher present in the code is always executed first and any binding it sets are available to the second matcher. Simple example with current version (1 match) and new version (2 matches): ```bash > cat tmp.cpp int a = 13; int b = ((int) a) - a; int c = a - ((int) a); > clang-query tmp.cpp clang-query> set traversal IgnoreUnlessSpelledInSource clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d")))))) Match #1: tmp.cpp:1:1: note: "d" binds here int a = 13; ^~~~~~~~~~ tmp.cpp:2:9: note: "root" binds here int b = ((int)a) - a; ^~~~~~~~~~~~ 1 match. > ./build/bin/clang-query tmp.cpp clang-query> set traversal IgnoreUnlessSpelledInSource clang-query> m binaryOperator(hasOperands(cStyleCastExpr(has(declRefExpr(hasDeclaration(valueDecl().bind("d"))))), declRefExpr(hasDeclaration(valueDecl(equalsBoundNode("d")))))) Match #1: tmp.cpp:1:1: note: "d" binds here 1 | int a = 13; | ^~~~~~~~~~ tmp.cpp:2:9: note: "root" binds here 2 | int b = ((int)a) - a; | ^~~~~~~~~~~~ Match #2: tmp.cpp:1:1: note: "d" binds here 1 | int a = 13; | ^~~~~~~~~~ tmp.cpp:3:9: note: "root" binds here 3 | int c = a - ((int)a); | ^~~~~~~~~~~~ 2 matches. ``` If this should be documented or regression tested anywhere please let me know where.
cjappl
pushed a commit
that referenced
this pull request
Aug 22, 2024
…104523) Compilers and language runtimes often use helper functions that are fundamentally uninteresting when debugging anything but the compiler/runtime itself. This patch introduces a user-extensible mechanism that allows for these frames to be hidden from backtraces and automatically skipped over when navigating the stack with `up` and `down`. This does not affect the numbering of frames, so `f <N>` will still provide access to the hidden frames. The `bt` output will also print a hint that frames have been hidden. My primary motivation for this feature is to hide thunks in the Swift programming language, but I'm including an example recognizer for `std::function::operator()` that I wished for myself many times while debugging LLDB. rdar://126629381 Example output. (Yes, my proof-of-concept recognizer could hide even more frames if we had a method that returned the function name without the return type or I used something that isn't based off regex, but it's really only meant as an example). before: ``` (lldb) thread backtrace --filtered=false * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12 frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10 frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12 frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame #8: 0x0000000183cdf154 dyld`start + 2476 (lldb) ``` after ``` (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10 frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25 frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12 frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10 frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10 frame #8: 0x0000000183cdf154 dyld`start + 2476 Note: Some frames were hidden by frame recognizers ```
cjappl
pushed a commit
that referenced
this pull request
Aug 26, 2024
) Currently, process of replacing bitwise operations consisting of `LSR`/`LSL` with `And` is performed by `DAGCombiner`. However, in certain cases, the `AND` generated by this process can be removed. Consider following case: ``` lsr x8, x8, llvm#56 and x8, x8, #0xfc ldr w0, [x2, x8] ret ``` In this case, we can remove the `AND` by changing the target of `LDR` to `[X2, X8, LSL #2]` and right-shifting amount change to 56 to 58. after changed: ``` lsr x8, x8, llvm#58 ldr w0, [x2, x8, lsl #2] ret ``` This patch checks to see if the `SHIFTING` + `AND` operation on load target can be optimized and optimizes it if it can.
cjappl
pushed a commit
that referenced
this pull request
Aug 29, 2024
`JITDylibSearchOrderResolver` local variable can be destroyed before completion of all callbacks. Capture it together with `Deps` in `OnEmitted` callback. Original error: ``` ==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978 READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm) #0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58 #1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25 #2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5 #3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12 #4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, ```
cjappl
pushed a commit
that referenced
this pull request
Sep 2, 2024
Static destructor can race with calls to notify and trigger tsan warning. ``` WARNING: ThreadSanitizer: data race (pid=5787) Write of size 1 at 0x55bec9df8de8 by thread T23: #0 pthread_mutex_destroy [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1344](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1344&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12affb) (BuildId: ff25ace8b17d9863348bb1759c47246c) #1 __libcpp_recursive_mutex_destroy [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:91](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=91&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) (BuildId: ff25ace8b17d9863348bb1759c47246c) #2 std::__tsan::recursive_mutex::~recursive_mutex() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:52](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=52&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d4e9) #3 ~SmartMutex [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:28](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=28&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) (BuildId: ff25ace8b17d9863348bb1759c47246c) #4 (anonymous namespace)::PerfJITEventListener::~PerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:65](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=65&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaedfe) #5 cxa_at_exit_callback_installed_at(void*) [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:437](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=437&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b172cb9) (BuildId: ff25ace8b17d9863348bb1759c47246c) #6 llvm::JITEventListener::createPerfJITEventListener() [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:496](third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=496&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcad8f5) (BuildId: ff25ace8b17d9863348bb1759c47246c) ``` ``` Previous atomic read of size 1 at 0x55bec9df8de8 by thread T192 (mutexes: write M0, write M1): #0 pthread_mutex_unlock [third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1387](third_party/llvm/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp?l=1387&cl=669089572):3 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x1b12b6bb) (BuildId: ff25ace8b17d9863348bb1759c47246c) #1 __libcpp_recursive_mutex_unlock [third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h:87](third_party/crosstool/v18/stable/src/libcxx/include/__thread/support/pthread.h?l=87&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) (BuildId: ff25ace8b17d9863348bb1759c47246c) #2 std::__tsan::recursive_mutex::unlock() [third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp:64](third_party/crosstool/v18/stable/src/libcxx/src/mutex.cpp?l=64&cl=669089572):11 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x4523d589) #3 unlock [third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h:47](third_party/llvm/llvm-project/llvm/include/llvm/Support/Mutex.h?l=47&cl=669089572):16 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) (BuildId: ff25ace8b17d9863348bb1759c47246c) #4 ~lock_guard [third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h:39](third_party/crosstool/v18/stable/src/libcxx/include/__mutex/lock_guard.h?l=39&cl=669089572):101 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) #5 (anonymous namespace)::PerfJITEventListener::notifyObjectLoaded(unsigned long, llvm::object::ObjectFile const&, llvm::RuntimeDyld::LoadedObjectInfo const&) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp:290](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/PerfJITEvents/PerfJITEventListener.cpp?l=290&cl=669089572):1 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bcaf968) #6 llvm::orc::RTDyldObjectLinkingLayer::onObjEmit(llvm::orc::MaterializationResponsibility&, llvm::object::OwningBinary<llvm::object::ObjectFile>, std::__tsan::unique_ptr<llvm::RuntimeDyld::MemoryManager, std::__tsan::default_delete<llvm::RuntimeDyld::MemoryManager>>, std::__tsan::unique_ptr<llvm::RuntimeDyld::LoadedObjectInfo, std::__tsan::default_delete<llvm::RuntimeDyld::LoadedObjectInfo>>, std::__tsan::unique_ptr<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>, std::__tsan::default_delete<llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>, llvm::DenseMapInfo<llvm::orc::JITDylib*, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void>>>>>>, llvm::Error) [third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:386](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp?l=386&cl=669089572):10 (be1eb158bb70fc9cf7be2db70407e512890e5c6e20720cd88c69d7d9c26ea531_0200d5f71908+0x2bc404a8) (BuildId: ff25ace8b17d9863348bb1759c47246c) ```
cjappl
pushed a commit
that referenced
this pull request
Sep 9, 2024
…llvm#94981) This extends default argument deduction to cover class templates as well, applying only to partial ordering, adding to the provisional wording introduced in llvm#89807. This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. Given the following example: ```C++ template <class T1, class T2 = float> struct A; template <class T3> struct B; template <template <class T4> class TT1, class T5> struct B<TT1<T5>>; // #1 template <class T6, class T7> struct B<A<T6, T7>>; // #2 template struct B<A<int>>; ``` Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This patch restores the pre-P0522 behavior, `#2` is picked again.
cjappl
pushed a commit
that referenced
this pull request
Sep 18, 2024
When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64 tests just hang like ``` #0 0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6 #1 0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) () at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766 #2 0x70107c90 in Wait () at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35 #3 0x700f7cac in Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196 #4 Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98 #5 LockThreads () at compiler-rt/lib/asan/asan_thread.cpp:489 #6 0x700e9c8c in __asan::BeforeFork() () at compiler-rt/lib/asan/asan_posix.cpp:157 #7 0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6 Backtrace stopped: previous frame identical to this frame (corrupt stack?) ``` It turns out that this happens in tests using `internal_fork` (e.g. invoking `llvm-symbolizer`): unlike most other Linux targets, which use `clone`, Linux/sparc64 has to use `__fork` instead. While `clone` doesn't trigger `pthread_atfork` handlers, `__fork` obviously does, causing the hang. To avoid this, this patch disables `InstallAtForkHandler` and lets the ASan tests run to completion. Tested on `sparc64-unknown-linux-gnu`.
cjappl
pushed a commit
that referenced
this pull request
Sep 20, 2024
…ap (llvm#108825) This attempts to improve user-experience when LLDB stops on a verbose_trap. Currently if a `__builtin_verbose_trap` triggers, we display the first frame above the call to the verbose_trap. So in the newly added test case, we would've previously stopped here: ``` (lldb) run Process 28095 launched: '/Users/michaelbuch/a.out' (arm64) Process 28095 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame #1: 0x0000000100003f5c a.out`std::__1::vector<int>::operator[](this=0x000000016fdfebef size=0, (null)=10) at verbose_trap.cpp:6:9 3 template <typename T> 4 struct vector { 5 void operator[](unsigned) { -> 6 __builtin_verbose_trap("Bounds error", "out-of-bounds access"); 7 } 8 }; ``` After this patch, we would stop in the first non-`std` frame: ``` (lldb) run Process 27843 launched: '/Users/michaelbuch/a.out' (arm64) Process 27843 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame #2: 0x0000000100003f44 a.out`g() at verbose_trap.cpp:14:5 11 12 void g() { 13 std::vector<int> v; -> 14 v[10]; 15 } 16 ``` rdar://134490328
cjappl
pushed a commit
that referenced
this pull request
Oct 7, 2024
…ext is not fully initialized (llvm#110481) As this comment around target initialization implies: ``` // This can be NULL if we don't know anything about the architecture or if // the target for an architecture isn't enabled in the llvm/clang that we // built ``` There are cases where we might fail to call `InitBuiltinTypes` when creating the backing `ASTContext` for a `TypeSystemClang`. If that happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are not initialized and dereferencing them as we do in `GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to nullptr-dereferences. Example backtrace: ``` (lldb) run Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958. Process 2680 stopped * thread realtime-sanitizer#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes , bool) (.cold.1): -> 0x10cdf3cdc <+0>: stp x29, x30, [sp, #-0x10]! 0x10cdf3ce0 <+4>: mov x29, sp 0x10cdf3ce4 <+8>: adrp x0, 545 0x10cdf3ce8 <+12>: add x0, x0, #0xa25 ; "ParseObjCMethod" Target 0: (lldb) stopped. (lldb) bt * thread realtime-sanitizer#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8 frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288 frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128 frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284 * frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188 frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384 ``` This patch adds a one-time user-visible warning for when we fail to initialize the AST to indicate that initialization went wrong for the given target. Additionally, we add checks for whether one of the `ASTContext` `QualType`s is invalid before dereferencing any builtin types. The warning would look as follows: ``` (lldb) target create "a.out" Current executable set to 'a.out' (arm64). (lldb) b main warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly. Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90 ``` rdar://134869779
cjappl
pushed a commit
that referenced
this pull request
Oct 7, 2024
Fixes llvm#102703. https://godbolt.org/z/nfj8xsb1Y The following pattern: ``` %2 = and i32 %0, 254 %3 = icmp eq i32 %2, 0 ``` is optimised by instcombine into: ```%3 = icmp ult i32 %0, 2``` However, post instcombine leads to worse aarch64 than the unoptimised version. Pre instcombine: ``` tst w0, #0xfe cset w0, eq ret ``` Post instcombine: ``` and w8, w0, #0xff cmp w8, #2 cset w0, lo ret ``` In the unoptimised version, SelectionDAG converts `SETCC (AND X 254) 0 EQ` into `CSEL 0 1 1 (ANDS X 254)`, which gets emitted as a `tst`. In the optimised version, SelectionDAG converts `SETCC (AND X 255) 2 ULT` into `CSEL 0 1 2 (SUBS (AND X 255) 2)`, which gets emitted as an `and`/`cmp`. This PR adds an optimisation to `AArch64ISelLowering`, converting `SETCC (AND X Y) Z ULT` into `SETCC (AND X (Y & ~(Z - 1))) 0 EQ` when `Z` is a power of two. This makes SelectionDAG/Codegen produce the same optimised code for both examples.
cjappl
pushed a commit
that referenced
this pull request
Oct 30, 2024
…ates explicitly specialized for an implicitly instantiated class template specialization (llvm#113464) Consider the following: ``` template<typename T> struct A { template<typename U> struct B { static constexpr int x = 0; // #1 }; template<typename U> struct B<U*> { static constexpr int x = 1; // #2 }; }; template<> template<typename U> struct A<long>::B { static constexpr int x = 2; // #3 }; static_assert(A<short>::B<int>::y == 0); // uses #1 static_assert(A<short>::B<int*>::y == 1); // uses #2 static_assert(A<long>::B<int>::y == 2); // uses #3 static_assert(A<long>::B<int*>::y == 2); // uses #3 ``` According to [temp.spec.partial.member] p2: > If the primary member template is explicitly specialized for a given (implicit) specialization of the enclosing class template, the partial specializations of the member template are ignored for this specialization of the enclosing class template. If a partial specialization of the member template is explicitly specialized for a given (implicit) specialization of the enclosing class template, the primary member template and its other partial specializations are still considered for this specialization of the enclosing class template. The example above fails to compile because we currently don't implement [temp.spec.partial.member] p2. This patch implements the wording, fixing llvm#51051.
cjappl
pushed a commit
that referenced
this pull request
Nov 19, 2024
… depobj construct (llvm#114221) A codegen crash is occurring when a depend object was initialized with omp_all_memory in the depobj directive. llvm#114214 The root cause of issue looks to be the improper handling of the dependency list when omp_all_memory was specified. The change introduces the use of OMPTaskDataTy to manage dependencies. The buildDependences function is called to construct the dependency list, and the list is iterated over to emit and store the dependencies. Reduced Test Case : ``` #include <omp.h> int main() { omp_depend_t obj; #pragma omp depobj(obj) depend(inout: omp_all_memory) } ``` ``` #1 0x0000000003de6623 SignalHandler(int) Signals.cpp:0:0 #2 0x00007f8e4a6b990f (/lib64/libpthread.so.0+0x1690f) #3 0x00007f8e4a117d2a raise (/lib64/libc.so.6+0x4ad2a) #4 0x00007f8e4a1193e4 abort (/lib64/libc.so.6+0x4c3e4) #5 0x00007f8e4a10fc69 __assert_fail_base (/lib64/libc.so.6+0x42c69) #6 0x00007f8e4a10fcf1 __assert_fail (/lib64/libc.so.6+0x42cf1) #7 0x0000000004114367 clang::CodeGen::CodeGenFunction::EmitOMPDepobjDirective(clang::OMPDepobjDirective const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4114367) #8 0x00000000040f8fac clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40f8fac) #9 0x00000000040ff4fb clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x40ff4fb) #10 0x00000000041847b2 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41847b2) #11 0x0000000004199e4a clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4199e4a) #12 0x00000000041f7b9d clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f7b9d) #13 0x00000000041f16a3 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41f16a3) #14 0x00000000041fd954 clang::CodeGen::CodeGenModule::EmitDeferred() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x41fd954) realtime-sanitizer#15 0x0000000004200277 clang::CodeGen::CodeGenModule::Release() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4200277) realtime-sanitizer#16 0x00000000046b6a49 (anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit(clang::ASTContext&) ModuleBuilder.cpp:0:0 realtime-sanitizer#17 0x00000000046b4cb6 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x46b4cb6) realtime-sanitizer#18 0x0000000006204d5c clang::ParseAST(clang::Sema&, bool, bool) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x6204d5c) realtime-sanitizer#19 0x000000000496b278 clang::FrontendAction::Execute() (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x496b278) realtime-sanitizer#20 0x00000000048dd074 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x48dd074) realtime-sanitizer#21 0x0000000004a38092 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0x4a38092) realtime-sanitizer#22 0x0000000000fd4e9c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd4e9c) realtime-sanitizer#23 0x0000000000fcca73 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0 realtime-sanitizer#24 0x0000000000fd140c clang_main(int, char**, llvm::ToolContext const&) (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xfd140c) realtime-sanitizer#25 0x0000000000ee2ef3 main (/opt/cray/pe/cce/18.0.1/cce-clang/x86_64/bin/clang-18+0xee2ef3) realtime-sanitizer#26 0x00007f8e4a10224c __libc_start_main (/lib64/libc.so.6+0x3524c) realtime-sanitizer#27 0x0000000000fcaae9 _start /home/abuild/rpmbuild/BUILD/glibc-2.31/csu/../sysdeps/x86_64/start.S:120:0 clang: error: unable to execute command: Aborted ``` --------- Co-authored-by: Chandra Ghale <[email protected]>
cjappl
pushed a commit
that referenced
this pull request
Nov 20, 2024
…onger cause a crash (llvm#116569) This PR fixes a bug introduced by llvm#110199, which causes any half float argument to crash the compiler on MIPS64. Currently compiling this bit of code with `llc -mtriple=mips64`: ``` define void @half_args(half %a) nounwind { entry: ret void } ``` Crashes with the following log: ``` LLVM ERROR: unable to allocate function argument #0 PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: llc -mtriple=mips64 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'MIPS DAG->DAG Pattern Instruction Selection' on function '@half_args' #0 0x000055a3a4013df8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32d0df8) #1 0x000055a3a401199e llvm::sys::RunSignalHandlers() (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x32ce99e) #2 0x000055a3a40144a8 SignalHandler(int) Signals.cpp:0:0 #3 0x00007f00bde558c0 __restore_rt libc_sigaction.c:0:0 #4 0x00007f00bdea462c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #5 0x00007f00bde55822 gsignal ./signal/../sysdeps/posix/raise.c:27:6 #6 0x00007f00bde3e4af abort ./stdlib/abort.c:81:7 #7 0x000055a3a3f80e3c llvm::report_fatal_error(llvm::Twine const&, bool) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x323de3c) #8 0x000055a3a2e20dfa (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x20dddfa) #9 0x000055a3a2a34e20 llvm::MipsTargetLowering::LowerFormalArguments(llvm::SDValue, unsigned int, bool, llvm::SmallVectorImpl<llvm::ISD::InputArg> const&, llvm::SDLoc const&, llvm::SelectionDAG&, llvm::SmallVectorImpl<llvm::SDValue>&) const MipsISelLowering.cpp:0:0 #10 0x000055a3a3d896a9 llvm::SelectionDAGISel::LowerArguments(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30466a9) #11 0x000055a3a3e0b3ec llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c83ec) #12 0x000055a3a3e09e21 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c6e21) #13 0x000055a3a2aae1ca llvm::MipsDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) MipsISelDAGToDAG.cpp:0:0 #14 0x000055a3a3e07706 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x30c4706) realtime-sanitizer#15 0x000055a3a3051ed6 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x230eed6) realtime-sanitizer#16 0x000055a3a35a3ec9 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x2860ec9) realtime-sanitizer#17 0x000055a3a35ac3b2 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x28693b2) realtime-sanitizer#18 0x000055a3a35a499c llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x286199c) realtime-sanitizer#19 0x000055a3a262abbb main (/home/davide/Ps2/rps2-tools/prefix/bin/llc+0x18e7bbb) realtime-sanitizer#20 0x00007f00bde3fc4c __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3 realtime-sanitizer#21 0x00007f00bde3fd05 call_init ./csu/../csu/libc-start.c:128:20 realtime-sanitizer#22 0x00007f00bde3fd05 __libc_start_main@GLIBC_2.2.5 ./csu/../csu/libc-start.c:347:5 realtime-sanitizer#23 0x000055a3a2624921 _start /builddir/glibc-2.39/csu/../sysdeps/x86_64/start.S:117:0 ``` This is caused by the fact that after the change, `f16`s are no longer lowered as `f32`s in calls. Two possible fixes are available: - Update calling conventions to properly support passing `f16` as integers. - Update `useFPRegsForHalfType()` to return `true` so that `f16` are still kept in `f32` registers, as before llvm#110199. This PR implements the first solution to not introduce any more ABI changes as llvm#110199 already did. As of what is the correct ABI for halfs, I don't think there is a correct answer. GCC doesn't support halfs on MIPS, and I couldn't find any information on old MIPS ABI manuals either.
cjappl
pushed a commit
that referenced
this pull request
Nov 21, 2024
…#116656) The main issue to solve is that OpenMP modifiers can be specified in any order, so the parser cannot expect any specific modifier at a given position. To solve that, define modifier to be a union of all allowable specific modifiers for a given clause. Additionally, implement modifier descriptors: for each modifier the corresponding descriptor contains a set of properties of the modifier that allow a common set of semantic checks. Start with the syntactic properties defined in the spec: Required, Unique, Exclusive, Ultimate, and implement common checks to verify each of them. OpenMP modifier overhaul: #2/3
cjappl
pushed a commit
that referenced
this pull request
Nov 29, 2024
…plementation (llvm#108413. llvm#117704) (llvm#117894) Relands llvm#117704, which relanded changes from llvm#108413 - this was reverted due to build issues. The new offload library did not build with `LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by pre-merge testing. The last commit contains the fix; everything else is otherwise identical to the approved PR. ___ ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. * It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.
cjappl
pushed a commit
that referenced
this pull request
Nov 29, 2024
…abort (llvm#117603) Hey guys, I found that Flang's built-in ABORT function is incomplete when I was using it. Compared with gfortran's ABORT (which can both abort and print out a backtrace), flang's ABORT implementation lacks the function of printing out a backtrace. This feature is essential for debugging and understanding the call stack at the failure point. To solve this problem, I completed the "// TODO:" of the abort function, and then implemented an additional built-in function BACKTRACE for flang. After a brief reading of the relevant source code, I used backtrace and backtrace_symbols in "execinfo.h" to quickly implement this. But since I used the above two functions directly, my implementation is slightly different from gfortran's implementation (in the output, the function call stack before main is additionally output, and the function line number is missing). In addition, since I used the above two functions, I did not need to add -g to embed debug information into the ELF file, but needed -rdynamic to ensure that the symbols are added to the dynamic symbol table (so that the function name will be printed out). Here is a comparison of the output between gfortran 's backtrace and my implementation: gfortran's implemention output: ``` #0 0x557eb71f4184 in testfun2_ at /home/hunter/plct/fortran/test.f90:5 #1 0x557eb71f4165 in testfun1_ at /home/hunter/plct/fortran/test.f90:13 #2 0x557eb71f4192 in test_backtrace at /home/hunter/plct/fortran/test.f90:17 #3 0x557eb71f41ce in main at /home/hunter/plct/fortran/test.f90:18 ``` my impelmention output: ``` Backtrace: #0 ./test(_FortranABacktrace+0x32) [0x574f07efcf92] #1 ./test(testfun2_+0x14) [0x574f07efc7b4] #2 ./test(testfun1_+0xd) [0x574f07efc7cd] #3 ./test(_QQmain+0x9) [0x574f07efc7e9] #4 ./test(main+0x12) [0x574f07efc802] #5 /usr/lib/libc.so.6(+0x25e08) [0x76954694fe08] #6 /usr/lib/libc.so.6(__libc_start_main+0x8c) [0x76954694fecc] #7 ./test(_start+0x25) [0x574f07efc6c5] ``` test program is: ``` function testfun2() result(err) implicit none integer :: err err = 1 call backtrace end function testfun2 subroutine testfun1() implicit none integer :: err integer :: testfun2 err = testfun2() end subroutine testfun1 program test_backtrace call testfun1() end program test_backtrace ``` I am well aware of the importance of line numbers, so I am now working on implementing line numbers (by parsing DWARF information) and supporting cross-platform (Windows) support.
cjappl
pushed a commit
that referenced
this pull request
Nov 29, 2024
…w API implementation (llvm#108413. llvm#117704)" (llvm#117995) Reverts llvm#117894 Buildbot failures in OpenMP/Offload bots. https://lab.llvm.org/buildbot/#/builders/30/builds/11193
cjappl
pushed a commit
that referenced
this pull request
Dec 6, 2024
…ne symbol size as symbols are created (llvm#117079)" This reverts commit ba668eb. Below test started failing again on x86_64 macOS CI. We're unsure if this patch is the exact cause, but since this patch has broken this test before, we speculatively revert it to see if it was indeed the root cause. ``` FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162) ******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ******************** Exit Code: 1 Command Output (stderr): -- RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument] RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit + /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input ^ <stdin>:26:64: note: scanning from here frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp ^ <stdin>:27:2: note: possible intended match here frame #2: 0x00007ff7bfeff6c0 ^ Input file: <stdin> Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -dump-input=help explains the following input dump. Input was: <<<<<< . . . 21: 0x100003ed1 <+0>: pushq %rbp 22: 0x100003ed2 <+1>: movq %rsp, %rbp 23: (lldb) thread backtrace -u 24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 25: * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar 26: frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp check:21'0 X error: no match found 27: frame #2: 0x00007ff7bfeff6c0 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:21'1 ? possible intended match 28: frame #3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 29: frame #4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 30: frame #5: 0x00007ff8193cc41f dyld`start + 1903 check:21'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 31: (lldb) exit check:21'0 ~~~~~~~~~~~~ >>>>>> ```
cjappl
pushed a commit
that referenced
this pull request
Dec 6, 2024
## Description This PR fixes a segmentation fault that occurs when passing options requiring arguments via `-Xopenmp-target=<triple>`. The issue was that the function `Driver::getOffloadArchs` did not properly parse the extracted option, but instead assumed it was valid, leading to a crash when incomplete arguments were provided. ## Backtrace ```sh llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 1. Compilation construction 2. Building compilation actions #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b) #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0 #2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420) #3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70) #4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d) #5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0 #6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093) #7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d) #8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c) #9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6) #10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3 #11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e) [1] 2628042 segmentation fault (core dumped) main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -o ```
cjappl
pushed a commit
that referenced
this pull request
Dec 7, 2024
llvm#118923) …d reentry. These utilities provide new, more generic and easier to use support for lazy compilation in ORC. LazyReexportsManager is an alternative to LazyCallThroughManager. It takes requests for lazy re-entry points in the form of an alias map: lazy-reexports = { ( <entry point symbol #1>, <implementation symbol #1> ), ( <entry point symbol #2>, <implementation symbol #2> ), ... ( <entry point symbol #n>, <implementation symbol #n> ) } LazyReexportsManager then: 1. binds the entry points to the implementation names in an internal table. 2. creates a JIT re-entry trampoline for each entry point. 3. creates a redirectable symbol for each of the entry point name and binds redirectable symbol to the corresponding reentry trampoline. When an entry point symbol is first called at runtime (which may be on any thread of the JIT'd program) it will re-enter the JIT via the trampoline and trigger a lookup for the implementation symbol stored in LazyReexportsManager's internal table. When the lookup completes the entry point symbol will be updated (via the RedirectableSymbolManager) to point at the implementation symbol, and execution will proceed to the implementation symbol. Actual construction of the re-entry trampolines and redirectable symbols is delegated to an EmitTrampolines functor and the RedirectableSymbolsManager respectively. JITLinkReentryTrampolines.h provides a JITLink-based implementation of the EmitTrampolines functor. (AArch64 only in this patch, but other architectures will be added in the near future). Register state save and reentry functionality is added to the ORC runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation functions (the latter is generic, the former will need custom implementations for each ABI and architecture to be supported, however this should be much less effort than the existing OrcABISupport approach, since the ORC runtime allows this code to be written as native assembly). The resulting system: 1. Works equally well for in-process and out-of-process JIT'd code. 2. Requires less boilerplate to set up. Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC runtime), setup is just: ```c++ auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL); if (!RSMgr) return RSMgr.takeError(); auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD); if (!LRMgr) return LRMgr.takeError(); ``` after which lazy reexports can be introduced with: ```c++ JD.define(lazyReexports(LRMgr, <alias map>)); ``` LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR level CompileOnDemandLayer will continue to use LazyCallThroughManager and OrcABISupport until the new system supports a wider range of architectures and ABIs. The llvm-jitlink utility's -lazy option now uses the new scheme. Since it depends on the ORC runtime, the lazy-link.ll testcase and associated helpers are moved to the ORC runtime.
cjappl
pushed a commit
that referenced
this pull request
Dec 12, 2024
The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable, and perhaps most interestingly takes start-up time to apply the relocations. The largest pattern I identified were the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme. This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one: ``` FILE SIZE VM SIZE -------------- -------------- +1.4% +653Ki +1.4% +653Ki .rodata +0.0% +960 +0.0% +960 .text +0.0% +197 +0.0% +197 .dynstr +0.0% +184 +0.0% +184 .eh_frame +0.0% +96 +0.0% +96 .dynsym +0.0% +40 +0.0% +40 .eh_frame_hdr +114% +32 [ = ] 0 [Unmapped] +0.0% +20 +0.0% +20 .gnu.hash +0.0% +8 +0.0% +8 .gnu.version +0.9% +7 +0.9% +7 [LOAD #2 [R]] [ = ] 0 -75.4% -3.00Ki .relro_padding -16.1% -802Ki -16.1% -802Ki .data.rel.ro -27.3% -2.52Mi -27.3% -2.52Mi .rela.dyn -1.6% -2.66Mi -1.6% -2.66Mi TOTAL ``` We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those reloctaions are stored. This is also visible in my benchmarking of binary start-up overhead at least: ``` Benchmark 1: ./old_clang --version Time (mean ± σ): 17.6 ms ± 1.5 ms [User: 4.1 ms, System: 13.3 ms] Range (min … max): 14.2 ms … 22.8 ms 162 runs Benchmark 2: ./new_clang --version Time (mean ± σ): 15.5 ms ± 1.4 ms [User: 3.6 ms, System: 11.8 ms] Range (min … max): 12.4 ms … 20.3 ms 216 runs Summary './new_clang --version' ran 1.13 ± 0.14 times faster than './old_clang --version' ``` We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler. ---- This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion. I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution. I was also able to factor the macros into set of consistent patterns that avoids a significant regression in overall boilerplate.
cjappl
pushed a commit
that referenced
this pull request
Aug 11, 2025
Tracked at llvm#112294 This patch implements from [basic.link]p14 to [basic.link]p18 partially. The explicitly missing parts are: - Anything related to specializations. - Decide if a pointer is associated with a TU-local value at compile time. - [basic.link]p15.1.2 to decide if a type is TU-local. - Diagnose if TU-local functions from other TU are collected to the overload set. See [basic.link]p19, the call to 'h(N::A{});' in translation unit #2 There should be other implicitly missing parts as the wording uses "names" briefly several times. But to implement this precisely, we have to visit the whole AST, including Decls, Expression and Types, which may be harder to implement and be more time-consuming for compilation time. So I choose to implement the common parts. It won't be too bad to miss some cases since we DIDN'T do any such checks in the past 3 years. Any new check is an improvement. Given modules have been basically available since clang15 without such checks, it will be user unfriendly if we give a hard error now. And there are a lot of cases which violating the rule actually just fine. So I decide to emit it as warnings instead of hard errors.
cjappl
pushed a commit
that referenced
this pull request
Aug 14, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault, instead of reporting a clean error ``` (base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda #0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc) #1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc) #2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628) #3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4) #4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8) #7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8) #8 0x000000019ae8c274 Segmentation fault: 11 ``` The underlying issue was that the `DeviceCompilerInstance` (used for device-side CUDA compilation) was never initialized with a `Sema`, which is required before constructing the `IncrementalCUDADeviceParser`. https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32 https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31 Unlike the host-side `CompilerInstance` which runs `ExecuteAction` inside the Interpreter constructor (thereby setting up Sema), the device-side CI was passed into the parser uninitialized, leading to an assertion or crash when accessing its internals. To fix this, I refactored the `Interpreter::create` method to include an optional `DeviceCI` parameter. If provided, we know we need to take care of this instance too. Only then do we construct the `IncrementalCUDADeviceParser`. (cherry picked from commit 21fb19f)
cjappl
pushed a commit
that referenced
this pull request
Aug 14, 2025
llvm#138091) Check this error for more context (https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531) This fails with ``` * thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3) * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 frame #2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58 frame #3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838 frame #4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13 frame #5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98 frame #6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102 frame #7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13 frame #8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919 ``` Problem : 1) The destructor currently handles no clearance for the DeviceParser and the DeviceAct. We currently only have this https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419 2) The ownership for DeviceCI currently is present in IncrementalCudaDeviceParser. But this should be similar to how the combination for hostCI, hostAction and hostParser are managed by the Interpreter. As on master the DeviceAct and DeviceParser are managed by the Interpreter but not DeviceCI. This is problematic because : IncrementalParser holds a Sema& which points into the DeviceCI. On master, DeviceCI is destroyed before the base class ~IncrementalParser() runs, causing Parser::reset() to access a dangling Sema (and as Sema holds a reference to Preprocessor which owns PragmaNamespace) we see this ``` * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99 frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830 ``` (cherry picked from commit 529b6fc)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This introduces a nice self contained piece
Unit tests for this RUN which I think is what makes this a great chunk.