forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 339
[DON'T MERGE] Test pull request #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@swift-ci test |
@swift-ci please test (checking if lit tests are fixed now) |
@swift-ci please test |
adrian-prantl
pushed a commit
to adrian-prantl/llvm-project
that referenced
this pull request
Oct 29, 2019
This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30): PASS: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325 frame swiftlang#1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295 frame swiftlang#2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame swiftlang#3: 0x000000000040113a a.out`func_b at main.c:18:2 frame swiftlang#4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame swiftlang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame swiftlang#6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243 frame swiftlang#7: 0x000000000040106e a.out`_start + 46 vs. FAIL - unrecognized abort() function: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325 frame swiftlang#1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295 frame swiftlang#2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame swiftlang#3: 0x000000000040113a a.out`func_b at main.c:18:2 frame swiftlang#4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame swiftlang#5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame swiftlang#6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243 frame swiftlang#7: 0x000000000040106e a.out`.annobin_init.c.hot + 46 The extra ELF symbols are there due to Annobin (I did not investigate why this problem happened specifically since F-30 and not since F-28). It is due to: Symbol table '.dynsym' contains 2361 entries: Valu e Size Type Bind Vis Name 0000000000022769 5 FUNC LOCAL DEFAULT _nl_load_domain.cold 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_abort.c.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_loadmsgcat.c_end.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_textdomain.c_end.unlikely 000000000002276e 548 FUNC GLOBAL DEFAULT abort 000000000002276e 548 FUNC GLOBAL DEFAULT abort@@GLIBC_2.2.5 000000000002276e 548 FUNC LOCAL DEFAULT __GI_abort 0000000000022992 0 NOTYPE LOCAL HIDDEN .annobin_abort.c_end.unlikely Differential Revision: https://reviews.llvm.org/D63540 llvm-svn: 364773 (cherry picked from commit 3f594ed)
@swift-ci please test |
swift-ci
pushed a commit
that referenced
this pull request
Nov 4, 2019
Currently, clang emits subprograms for declared functions when the target debugger or DWARF standard is known to support entry values (DW_OP_entry_value & the GNU equivalent). Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU tail calls. Pre-patch debug session with a cross-TU tail call: ``` * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt] ``` Post-patch (note that the tail-calling frame, "helper", is visible): ``` * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] frame #1: 0x0000000100000f80 main`helper [opt] [artificial] frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt] ``` rdar://46577651 Differential Revision: https://reviews.llvm.org/D69743
@swift-ci test Linux |
4 similar comments
@swift-ci test Linux |
@swift-ci test Linux |
@swift-ci test Linux |
@swift-ci test Linux |
@hyp Do you know if this is known failure on Linux?
|
swift-ci
pushed a commit
that referenced
this pull request
Nov 7, 2019
…_call is understood" This caused Chromium builds to fail with "inlinable function call in a function with debug info must have a !dbg location" errors. See https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a reproducer. > Currently, clang emits subprograms for declared functions when the > target debugger or DWARF standard is known to support entry values > (DW_OP_entry_value & the GNU equivalent). > > Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU > tail calls. > > Pre-patch debug session with a cross-TU tail call: > > ``` > * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] > frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt] > ``` > > Post-patch (note that the tail-calling frame, "helper", is visible): > > ``` > * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] > frame #1: 0x0000000100000f80 main`helper [opt] [artificial] > frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt] > ``` > > rdar://46577651 > > Differential Revision: https://reviews.llvm.org/D69743
swift-ci
pushed a commit
that referenced
this pull request
Nov 13, 2019
During register coalescing, we update the live-intervals on-the-fly. To do that we are in this strange mode where the live-intervals can be slightly out-of-sync (more precisely they are forward looking) compared to what the IR actually represents. This happens because the register coalescer only updates the IR when it is done with updating the live-intervals and it has to do it this way because updating the IR on-the-fly would actually clobber some information on how the live-ranges that are being updated look like. This is problematic for updates that rely on the IR to accurately represents the state of the live-ranges. Right now, we have only one of those: stripValuesNotDefiningMask. To reconcile this need of out-of-sync IR, this patch introduces a new argument to LiveInterval::refineSubRanges that allows the code doing the live range updates to reason about how the code should look like after the coalescer will have rewritten the registers. Essentially this captures how a subregister index with be offseted to match its position in a new register class. E.g., let say we want to merge: V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32> We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32> overlap, i.e., by choosing a class where we can find "offset + 1 == 3". Put differently we align V2's sub3 with V1's sub1: V2: sub0 sub1 sub2 sub3 V1: <offset> sub0 sub1 This offset will look like a composed subregidx in the the class: V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> => V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> Now if we didn't rewrite the uses and def of V1, all the checks for V1 need to account for this offset to match what the live intervals intend to capture. Prior to this patch, we would fail to recognize the uses and def of V1 and would end up with machine verifier errors: No live segment at def. This could lead to miscompile as we would drop some live-ranges and thus, miss some interferences. For this problem to trigger, we need to reach stripValuesNotDefiningMask while having a mismatch between the IR and the live-ranges (i.e., we have to apply a subreg offset to the IR.) This requires the following three conditions: 1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1> 2. An update with Tuple registers with a possibility to coalesce the subreg index: e.g., v1.dsub_1 == v2.dsub_3 3. Subreg liveness enabled. looking at the IR to decide what is alive and what is not, i.e., calling stripValuesNotDefiningMask. coalescer maintains for the live-ranges information. None of the targets that currently use subreg liveness (i.e., the targets that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and and #2, so this patch also artificial enables subreg liveness for ARM, so that a nice test case can be attached.
Cam we close this one? |
swift-ci
pushed a commit
that referenced
this pull request
Nov 19, 2019
…ood (reland with fixes) Currently, clang emits subprograms for declared functions when the target debugger or DWARF standard is known to support entry values (DW_OP_entry_value & the GNU equivalent). Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU tail calls. Pre-patch debug session with a cross-TU tail call: ``` * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt] ``` Post-patch (note that the tail-calling frame, "helper", is visible): ``` * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt] frame #1: 0x0000000100000f80 main`helper [opt] [artificial] frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt] ``` This was reverted in 5b9a072 because it attached declaration subprograms to inlinable builtin calls, which interacted badly with the MergeICmps pass. The fix is to not attach declarations to builtins. rdar://46577651 Differential Revision: https://reviews.llvm.org/D69743
swift-ci
pushed a commit
that referenced
this pull request
Nov 21, 2019
…iant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151" Summary: Revert "[DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151" This reverts commit 5f026b6. We're (tensorflow.org/xla team) seeing some misscompiles with the new change, only at -O3, with fast math disabled. I'm still trying to come up with a useful/small/external example, but for now, the following IR: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [4 x i8] c"\DB\0F\C9@" @1 = private unnamed_addr constant [4 x i8] c"\00\00\00?" ; Function Attrs: uwtable define void @jit_wrapped_fun.31(i8* %retval, i8* noalias %run_options, i8** noalias %params, i8** noalias %buffer_table, i64* noalias %prof_counters) #0 { entry: %fusion.invar_address.dim.2 = alloca i64 %fusion.invar_address.dim.1 = alloca i64 %fusion.invar_address.dim.0 = alloca i64 %fusion.1.invar_address.dim.2 = alloca i64 %fusion.1.invar_address.dim.1 = alloca i64 %fusion.1.invar_address.dim.0 = alloca i64 %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1 %1 = load i8*, i8** %0, !invariant.load !0, !dereferenceable !1, !align !2 %parameter.3 = bitcast i8* %1 to [2 x [1 x [4 x float]]]* %2 = getelementptr inbounds i8*, i8** %buffer_table, i64 5 %3 = load i8*, i8** %2, !invariant.load !0, !dereferenceable !1, !align !2 %fusion.1 = bitcast i8* %3 to [2 x [1 x [4 x float]]]* store i64 0, i64* %fusion.1.invar_address.dim.0 br label %fusion.1.loop_header.dim.0 fusion.1.loop_header.dim.0: ; preds = %fusion.1.loop_exit.dim.1, %entry %fusion.1.indvar.dim.0 = load i64, i64* %fusion.1.invar_address.dim.0 %4 = icmp uge i64 %fusion.1.indvar.dim.0, 2 br i1 %4, label %fusion.1.loop_exit.dim.0, label %fusion.1.loop_body.dim.0 fusion.1.loop_body.dim.0: ; preds = %fusion.1.loop_header.dim.0 store i64 0, i64* %fusion.1.invar_address.dim.1 br label %fusion.1.loop_header.dim.1 fusion.1.loop_header.dim.1: ; preds = %fusion.1.loop_exit.dim.2, %fusion.1.loop_body.dim.0 %fusion.1.indvar.dim.1 = load i64, i64* %fusion.1.invar_address.dim.1 %5 = icmp uge i64 %fusion.1.indvar.dim.1, 1 br i1 %5, label %fusion.1.loop_exit.dim.1, label %fusion.1.loop_body.dim.1 fusion.1.loop_body.dim.1: ; preds = %fusion.1.loop_header.dim.1 store i64 0, i64* %fusion.1.invar_address.dim.2 br label %fusion.1.loop_header.dim.2 fusion.1.loop_header.dim.2: ; preds = %fusion.1.loop_body.dim.2, %fusion.1.loop_body.dim.1 %fusion.1.indvar.dim.2 = load i64, i64* %fusion.1.invar_address.dim.2 %6 = icmp uge i64 %fusion.1.indvar.dim.2, 4 br i1 %6, label %fusion.1.loop_exit.dim.2, label %fusion.1.loop_body.dim.2 fusion.1.loop_body.dim.2: ; preds = %fusion.1.loop_header.dim.2 %7 = load float, float* bitcast ([4 x i8]* @0 to float*) %8 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 %9 = load float, float* %8, !invariant.load !0, !noalias !3 %10 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 %11 = load float, float* %10, !invariant.load !0, !noalias !3 %12 = fmul float %9, %11 %13 = fmul float %7, %12 %14 = call float @llvm.log.f32(float %13) %15 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 store float %14, float* %15, !alias.scope !7, !noalias !8 %invar.inc2 = add nuw nsw i64 %fusion.1.indvar.dim.2, 1 store i64 %invar.inc2, i64* %fusion.1.invar_address.dim.2 br label %fusion.1.loop_header.dim.2 fusion.1.loop_exit.dim.2: ; preds = %fusion.1.loop_header.dim.2 %invar.inc1 = add nuw nsw i64 %fusion.1.indvar.dim.1, 1 store i64 %invar.inc1, i64* %fusion.1.invar_address.dim.1 br label %fusion.1.loop_header.dim.1 fusion.1.loop_exit.dim.1: ; preds = %fusion.1.loop_header.dim.1 %invar.inc = add nuw nsw i64 %fusion.1.indvar.dim.0, 1 store i64 %invar.inc, i64* %fusion.1.invar_address.dim.0 br label %fusion.1.loop_header.dim.0 fusion.1.loop_exit.dim.0: ; preds = %fusion.1.loop_header.dim.0 %16 = getelementptr inbounds i8*, i8** %buffer_table, i64 4 %17 = load i8*, i8** %16, !invariant.load !0, !dereferenceable !9, !align !2 %parameter.1 = bitcast i8* %17 to float* %18 = getelementptr inbounds i8*, i8** %buffer_table, i64 2 %19 = load i8*, i8** %18, !invariant.load !0, !dereferenceable !10, !align !2 %parameter.2 = bitcast i8* %19 to [3 x [1 x float]]* %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 0 %21 = load i8*, i8** %20, !invariant.load !0, !dereferenceable !11, !align !2 %fusion = bitcast i8* %21 to [2 x [3 x [4 x float]]]* store i64 0, i64* %fusion.invar_address.dim.0 br label %fusion.loop_header.dim.0 fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %fusion.1.loop_exit.dim.0 %fusion.indvar.dim.0 = load i64, i64* %fusion.invar_address.dim.0 %22 = icmp uge i64 %fusion.indvar.dim.0, 2 br i1 %22, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0 fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0 store i64 0, i64* %fusion.invar_address.dim.1 br label %fusion.loop_header.dim.1 fusion.loop_header.dim.1: ; preds = %fusion.loop_exit.dim.2, %fusion.loop_body.dim.0 %fusion.indvar.dim.1 = load i64, i64* %fusion.invar_address.dim.1 %23 = icmp uge i64 %fusion.indvar.dim.1, 3 br i1 %23, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1 fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1 store i64 0, i64* %fusion.invar_address.dim.2 br label %fusion.loop_header.dim.2 fusion.loop_header.dim.2: ; preds = %fusion.loop_body.dim.2, %fusion.loop_body.dim.1 %fusion.indvar.dim.2 = load i64, i64* %fusion.invar_address.dim.2 %24 = icmp uge i64 %fusion.indvar.dim.2, 4 br i1 %24, label %fusion.loop_exit.dim.2, label %fusion.loop_body.dim.2 fusion.loop_body.dim.2: ; preds = %fusion.loop_header.dim.2 %25 = mul nuw nsw i64 %fusion.indvar.dim.2, 1 %26 = add nuw nsw i64 0, %25 %27 = udiv i64 %26, 4 %28 = mul nuw nsw i64 %fusion.indvar.dim.0, 1 %29 = add nuw nsw i64 0, %28 %30 = udiv i64 %29, 2 %31 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %29, i64 0, i64 %26 %32 = load float, float* %31, !alias.scope !7, !noalias !8 %33 = mul nuw nsw i64 %fusion.indvar.dim.1, 1 %34 = add nuw nsw i64 0, %33 %35 = udiv i64 %34, 3 %36 = load float, float* %parameter.1, !invariant.load !0, !noalias !3 %37 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %parameter.2, i64 0, i64 %34, i64 0 %38 = load float, float* %37, !invariant.load !0, !noalias !3 %39 = fsub float %36, %38 %40 = fmul float %39, %39 %41 = mul nuw nsw i64 %fusion.indvar.dim.2, 1 %42 = add nuw nsw i64 0, %41 %43 = udiv i64 %42, 4 %44 = mul nuw nsw i64 %fusion.indvar.dim.0, 1 %45 = add nuw nsw i64 0, %44 %46 = udiv i64 %45, 2 %47 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42 %48 = load float, float* %47, !invariant.load !0, !noalias !3 %49 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42 %50 = load float, float* %49, !invariant.load !0, !noalias !3 %51 = fmul float %48, %50 %52 = fdiv float %40, %51 %53 = fadd float %32, %52 %54 = fneg float %53 %55 = load float, float* bitcast ([4 x i8]* @1 to float*) %56 = fmul float %54, %55 %57 = getelementptr inbounds [2 x [3 x [4 x float]]], [2 x [3 x [4 x float]]]* %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 %fusion.indvar.dim.1, i64 %fusion.indvar.dim.2 store float %56, float* %57, !alias.scope !8, !noalias !12 %invar.inc5 = add nuw nsw i64 %fusion.indvar.dim.2, 1 store i64 %invar.inc5, i64* %fusion.invar_address.dim.2 br label %fusion.loop_header.dim.2 fusion.loop_exit.dim.2: ; preds = %fusion.loop_header.dim.2 %invar.inc4 = add nuw nsw i64 %fusion.indvar.dim.1, 1 store i64 %invar.inc4, i64* %fusion.invar_address.dim.1 br label %fusion.loop_header.dim.1 fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1 %invar.inc3 = add nuw nsw i64 %fusion.indvar.dim.0, 1 store i64 %invar.inc3, i64* %fusion.invar_address.dim.0 br label %fusion.loop_header.dim.0 fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0 %58 = getelementptr inbounds i8*, i8** %buffer_table, i64 3 %59 = load i8*, i8** %58, !invariant.load !0, !dereferenceable !2, !align !2 %tuple.30 = bitcast i8* %59 to [1 x i8*]* %60 = bitcast [2 x [3 x [4 x float]]]* %fusion to i8* %61 = getelementptr inbounds [1 x i8*], [1 x i8*]* %tuple.30, i64 0, i64 0 store i8* %60, i8** %61, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare float @llvm.log.f32(float) #1 attributes #0 = { uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` gets (correctly) optimized to the one below without the change: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" ; Function Attrs: nofree nounwind uwtable define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8** noalias nocapture readnone %params, i8** noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 { entry: %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1 %1 = bitcast i8** %0 to [2 x [1 x [4 x float]]]** %2 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %3 = getelementptr inbounds i8*, i8** %buffer_table, i64 5 %4 = bitcast i8** %3 to [2 x [1 x [4 x float]]]** %5 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>* %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3 %8 = fmul <4 x float> %7, %7 %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9) %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>* store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8 %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %14 = bitcast float* %12 to <4 x float>* %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3 %16 = fmul <4 x float> %15, %15 %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17) %19 = bitcast float* %13 to <4 x float>* store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8 %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 4 %21 = bitcast i8** %20 to float** %22 = load float*, float** %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2 %23 = getelementptr inbounds i8*, i8** %buffer_table, i64 2 %24 = bitcast i8** %23 to [3 x [1 x float]]** %25 = load [3 x [1 x float]]*, [3 x [1 x float]]** %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2 %26 = load i8*, i8** %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2 %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3 %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0 %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3 %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>* %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3 %30 = insertelement <2 x float> undef, float %27, i32 0 %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer %32 = fsub <2 x float> %31, %29 %33 = fmul <2 x float> %32, %32 %shuffle30 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1> %34 = fsub float %27, %.pre29 %35 = fmul float %34, %34 %36 = insertelement <4 x float> undef, float %35, i32 0 %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %38 = fmul <4 x float> %7, %7 %shuffle31 = shufflevector <4 x float> %38, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %39 = fdiv <8 x float> %shuffle30, %shuffle31 %40 = fadd <8 x float> %shuffle, %39 %41 = fmul <8 x float> %40, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %42 = bitcast i8* %26 to <8 x float>* store <8 x float> %41, <8 x float>* %42, align 8, !alias.scope !8, !noalias !12 %43 = getelementptr inbounds i8, i8* %26, i64 32 %44 = fdiv <4 x float> %37, %38 %45 = fadd <4 x float> %10, %44 %46 = fmul <4 x float> %45, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %47 = bitcast i8* %43 to <4 x float>* store <4 x float> %46, <4 x float>* %47, align 8, !alias.scope !8, !noalias !12 %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %48 = bitcast float* %.phi.trans.insert to <4 x float>* %49 = load <4 x float>, <4 x float>* %48, align 8, !alias.scope !7, !noalias !8 %50 = bitcast float* %.phi.trans.insert12 to <4 x float>* %51 = load <4 x float>, <4 x float>* %50, align 8, !invariant.load !0, !noalias !3 %shuffle.1 = shufflevector <4 x float> %49, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %52 = getelementptr inbounds i8, i8* %26, i64 48 %53 = fmul <4 x float> %51, %51 %shuffle31.1 = shufflevector <4 x float> %53, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %54 = fdiv <8 x float> %shuffle30, %shuffle31.1 %55 = fadd <8 x float> %shuffle.1, %54 %56 = fmul <8 x float> %55, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %57 = bitcast i8* %52 to <8 x float>* store <8 x float> %56, <8 x float>* %57, align 8, !alias.scope !8, !noalias !12 %58 = getelementptr inbounds i8, i8* %26, i64 80 %59 = fdiv <4 x float> %37, %53 %60 = fadd <4 x float> %49, %59 %61 = fmul <4 x float> %60, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %62 = bitcast i8* %58 to <4 x float>* store <4 x float> %61, <4 x float>* %62, align 8, !alias.scope !8, !noalias !12 %63 = getelementptr inbounds i8*, i8** %buffer_table, i64 3 %64 = bitcast i8** %63 to [1 x i8*]** %65 = load [1 x i8*]*, [1 x i8*]** %64, align 8, !invariant.load !0, !dereferenceable !2, !align !2 %66 = getelementptr inbounds [1 x i8*], [1 x i8*]* %65, i64 0, i64 0 store i8* %26, i8** %66, align 8, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare <4 x float> @llvm.log.v4f32(<4 x float>) #1 attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` and (incorrectly) optimized to the one below with the change: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" ; Function Attrs: nofree nounwind uwtable define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8** noalias nocapture readnone %params, i8** noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 { entry: %0 = getelementptr inbounds i8*, i8** %buffer_table, i64 1 %1 = bitcast i8** %0 to [2 x [1 x [4 x float]]]** %2 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %3 = getelementptr inbounds i8*, i8** %buffer_table, i64 5 %4 = bitcast i8** %3 to [2 x [1 x [4 x float]]]** %5 = load [2 x [1 x [4 x float]]]*, [2 x [1 x [4 x float]]]** %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>* %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3 %8 = fmul <4 x float> %7, %7 %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9) %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>* store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8 %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %14 = bitcast float* %12 to <4 x float>* %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3 %16 = fmul <4 x float> %15, %15 %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17) %19 = bitcast float* %13 to <4 x float>* store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8 %20 = getelementptr inbounds i8*, i8** %buffer_table, i64 4 %21 = bitcast i8** %20 to float** %22 = load float*, float** %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2 %23 = getelementptr inbounds i8*, i8** %buffer_table, i64 2 %24 = bitcast i8** %23 to [3 x [1 x float]]** %25 = load [3 x [1 x float]]*, [3 x [1 x float]]** %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2 %26 = load i8*, i8** %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2 %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3 %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0 %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3 %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>* %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3 %30 = insertelement <2 x float> undef, float %27, i32 0 %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer %32 = fsub <2 x float> %31, %29 %33 = fmul <2 x float> %32, %32 %shuffle32 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1> %34 = fsub float %27, %.pre29 %35 = fmul float %34, %34 %36 = insertelement <4 x float> undef, float %35, i32 0 %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %38 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 0, i64 0, i64 3 %39 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 0, i64 0, i64 3 %40 = fmul <4 x float> %7, %7 %41 = shufflevector <4 x float> %40, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> %42 = fdiv <8 x float> %shuffle32, %41 %43 = fadd <8 x float> %shuffle, %42 %44 = fmul <8 x float> %43, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %45 = bitcast i8* %26 to <8 x float>* store <8 x float> %44, <8 x float>* %45, align 8, !alias.scope !8, !noalias !12 %46 = extractelement <4 x float> %10, i32 0 %47 = getelementptr inbounds i8, i8* %26, i64 32 %48 = extractelement <4 x float> %10, i32 1 %49 = extractelement <4 x float> %10, i32 2 %50 = load float, float* %38, align 4, !alias.scope !7, !noalias !8 %51 = load float, float* %39, align 4, !invariant.load !0, !noalias !3 %52 = fmul float %51, %51 %53 = insertelement <4 x float> undef, float %52, i32 3 %54 = fdiv <4 x float> %37, %53 %55 = insertelement <4 x float> undef, float %46, i32 0 %56 = insertelement <4 x float> %55, float %48, i32 1 %57 = insertelement <4 x float> %56, float %49, i32 2 %58 = insertelement <4 x float> %57, float %50, i32 3 %59 = fadd <4 x float> %58, %54 %60 = fmul <4 x float> %59, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %61 = bitcast i8* %47 to <4 x float>* store <4 x float> %60, <4 x float>* %61, align 8, !alias.scope !8, !noalias !12 %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %62 = bitcast float* %.phi.trans.insert to <4 x float>* %63 = load <4 x float>, <4 x float>* %62, align 8, !alias.scope !7, !noalias !8 %64 = bitcast float* %.phi.trans.insert12 to <4 x float>* %65 = load <4 x float>, <4 x float>* %64, align 8, !invariant.load !0, !noalias !3 %shuffle.1 = shufflevector <4 x float> %63, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %66 = getelementptr inbounds i8, i8* %26, i64 48 %67 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 3 %68 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 3 %69 = fmul <4 x float> %65, %65 %70 = shufflevector <4 x float> %69, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %71 = fdiv <8 x float> %shuffle32, %70 %72 = fadd <8 x float> %shuffle.1, %71 %73 = fmul <8 x float> %72, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %74 = bitcast i8* %66 to <8 x float>* store <8 x float> %73, <8 x float>* %74, align 8, !alias.scope !8, !noalias !12 %75 = extractelement <4 x float> %69, i32 0 %76 = extractelement <4 x float> %63, i32 0 %77 = getelementptr inbounds i8, i8* %26, i64 80 %78 = extractelement <4 x float> %69, i32 1 %79 = extractelement <4 x float> %63, i32 1 %80 = extractelement <4 x float> %69, i32 2 %81 = extractelement <4 x float> %63, i32 2 %82 = load float, float* %67, align 4, !alias.scope !7, !noalias !8 %83 = load float, float* %68, align 4, !invariant.load !0, !noalias !3 %84 = fmul float %83, %83 %85 = insertelement <4 x float> undef, float %75, i32 0 %86 = insertelement <4 x float> %85, float %78, i32 1 %87 = insertelement <4 x float> %86, float %80, i32 2 %88 = insertelement <4 x float> %87, float %84, i32 3 %89 = fdiv <4 x float> %37, %88 %90 = insertelement <4 x float> undef, float %76, i32 0 %91 = insertelement <4 x float> %90, float %79, i32 1 %92 = insertelement <4 x float> %91, float %81, i32 2 %93 = insertelement <4 x float> %92, float %82, i32 3 %94 = fadd <4 x float> %93, %89 %95 = fmul <4 x float> %94, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %96 = bitcast i8* %77 to <4 x float>* store <4 x float> %95, <4 x float>* %96, align 8, !alias.scope !8, !noalias !12 %97 = getelementptr inbounds i8*, i8** %buffer_table, i64 3 %98 = bitcast i8** %97 to [1 x i8*]** %99 = load [1 x i8*]*, [1 x i8*]** %98, align 8, !invariant.load !0, !dereferenceable !2, !align !2 %100 = getelementptr inbounds [1 x i8*], [1 x i8*]* %99, i64 0, i64 0 store i8* %26, i8** %100, align 8, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare <4 x float> @llvm.log.v4f32(<4 x float>) #1 attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` This results in bad numerical answers when used through XLA. Again, it's not that easy to give a small fully-reproducible example, but the misscompare is: ``` Expected literal: ( f32[2,3,4] { { { nan, -inf, -3181.35, -inf }, { nan, -inf, -28.2577019, -inf }, { nan, -inf, -28.2577019, -inf } }, { { -inf, -inf, -inf, -inf }, { -6.60753046e+28, -1.47314833e+23, -inf, -inf }, { -2.43504347e+30, -5.42892693e+24, -inf, -inf } } } ) Actual literal: ( f32[2,3,4] { { { nan, -inf, -3181.35, -inf }, { nan, -inf, -inf, -inf }, { inf, -inf, -28.2577019, -inf } }, { { -inf, -inf, -inf, -inf }, { -6.60753046e+28, -1.47314833e+23, -inf, -inf }, { -2.43504347e+30, -5.42892693e+24, -inf, -inf } } } ) ``` Reviewers: sanjoy.google, sanjoy, ebrevnov, jdoerfert, reames, chandlerc Subscribers: hiraditya, Charusso, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70516
swift-ci
pushed a commit
that referenced
this pull request
Dec 3, 2019
This patch adds the following intrinsics for gather loads with 64-bit offsets: * @llvm.aarch64.sve.ld1.gather (unscaled offset) * @llvm.aarch64.sve.ld1.gather.index (scaled offset) These intrinsics map 1-1 to the following AArch64 instructions respectively (examples for half-words): * ld1h { z0.d }, p0/z, [x0, z0.d] * ld1h { z0.d }, p0/z, [x0, z0.d, lsl #1] Committing on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, huntergr, rovka, mgudim, dancgr, rengolin, efriedma Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D70542
swift-ci
pushed a commit
that referenced
this pull request
Dec 3, 2019
This patch adds intrinsics for SVE gather loads for which the offsets are 32-bits wide and are: * unscaled * @llvm.aarch64.sve.ld1.gather.sxtw * @llvm.aarch64.sve.ld1.gather.uxtw * scaled (offsets become indices) * @llvm.arch64.sve.ld1.gather.sxtw.index * @llvm.arch64.sve.ld1.gather.uxtw.index The offsets are either zero (uxtw) or sign (sxtw) extended to 64 bits. These intrinsics map 1-1 to the corresponding SVE instructions (examples for half-words): * unscaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw] * scaled * ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1] * ld1h { z0.s }, p0/z, [x0, z0.s, uxtw #1] Committed on behalf of Andrzej Warzynski (andwar) Reviewers: sdesmalen, kmclaughlin, eli.friedman, rengolin, rovka, huntergr, dancgr, mgudim, efriedma Reviewed By: sdesmalen Tags: #llvm Differential Revision: https://reviews.llvm.org/D70782
swift-ci
pushed a commit
that referenced
this pull request
Dec 18, 2019
Summary: This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions. It also adds a check in `Worklist::Add` that ensures that all added instructions have parents. Simple test case that illustrates the difference when run with `--debug-only=instcombine`: ``` define i32 @test35(i32 %a, i32 %b) { %1 = or i32 %a, 1135 %2 = or i32 %1, %b ret i32 %2 } ``` Before this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: <badref> = or i32 %2, 1135 ... ``` With this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: %3 = or i32 %2, 1135 ... ``` Reviewers: fhahn, davide, spatel, foad, grosser, nikic Reviewed By: nikic Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71093
swift-ci
pushed a commit
that referenced
this pull request
Jan 13, 2020
…t binding This fixes a failing testcase on Fedora 30 x86_64 (regression Fedora 29->30): PASS: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`__GI_raise + 325 frame #1: 0x00007ffff7a91895 libc.so.6`__GI_abort + 295 frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame #3: 0x000000000040113a a.out`func_b at main.c:18:2 frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame #6: 0x00007ffff7a92f33 libc.so.6`__libc_start_main + 243 frame #7: 0x000000000040106e a.out`_start + 46 vs. FAIL - unrecognized abort() function: ./bin/lldb ./lldb-test-build.noindex/functionalities/unwind/noreturn/TestNoreturnUnwind.test_dwarf/a.out -o 'settings set symbols.enable-external-lookup false' -o r -o bt -o quit * frame #0: 0x00007ffff7aa6e75 libc.so.6`.annobin_raise.c + 325 frame #1: 0x00007ffff7a91895 libc.so.6`.annobin_loadmsgcat.c_end.unlikely + 295 frame #2: 0x0000000000401140 a.out`func_c at main.c:12:2 frame #3: 0x000000000040113a a.out`func_b at main.c:18:2 frame #4: 0x0000000000401134 a.out`func_a at main.c:26:2 frame #5: 0x000000000040112e a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:32:2 frame #6: 0x00007ffff7a92f33 libc.so.6`.annobin_libc_start.c + 243 frame #7: 0x000000000040106e a.out`.annobin_init.c.hot + 46 The extra ELF symbols are there due to Annobin (I did not investigate why this problem happened specifically since F-30 and not since F-28). It is due to: Symbol table '.dynsym' contains 2361 entries: Valu e Size Type Bind Vis Name 0000000000022769 5 FUNC LOCAL DEFAULT _nl_load_domain.cold 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_abort.c.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_loadmsgcat.c_end.unlikely ... 000000000002276e 0 NOTYPE LOCAL HIDDEN .annobin_textdomain.c_end.unlikely 000000000002276e 548 FUNC GLOBAL DEFAULT abort 000000000002276e 548 FUNC GLOBAL DEFAULT abort@@GLIBC_2.2.5 000000000002276e 548 FUNC LOCAL DEFAULT __GI_abort 0000000000022992 0 NOTYPE LOCAL HIDDEN .annobin_abort.c_end.unlikely GDB has some more complicated preferences between overlapping and/or sharing address symbols, I have made here so far the most simple fix for this case. Differential revision: https://reviews.llvm.org/D63540
swift-ci
pushed a commit
that referenced
this pull request
Jan 15, 2020
TSan spuriously reports for any OpenMP application a race on the initialization of a runtime internal mutex: ``` Atomic read of size 1 at 0x7b6800005940 by thread T4: #0 pthread_mutex_lock <null> (a.out+0x43f39e) #1 __kmp_resume_64 <null> (libomp.so.5+0x84db4) Previous write of size 1 at 0x7b6800005940 by thread T7: #0 pthread_mutex_init <null> (a.out+0x424793) #1 __kmp_suspend_initialize_thread <null> (libomp.so.5+0x8422e) ``` According to @AndreyChurbanov this is a false positive report, as the control flow of the runtime guarantees the ordering of the mutex initialization and the lock: https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/530363 To suppress this report, I suggest the use of TSAN_OPTIONS='ignore_uninstrumented_modules=1'. With this patch, a runtime warning is provided in case an OpenMP application is built with Tsan and executed without this Tsan-option. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D70412
swift-ci
pushed a commit
that referenced
this pull request
Jan 16, 2020
The test is currently failing on some systems with ASAN enabled due to: ``` ==22898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003da4 at pc 0x00010951c33d bp 0x7ffee6709e00 sp 0x7ffee67095c0 READ of size 5 at 0x603000003da4 thread T0 #0 0x10951c33c in wrap_memmove+0x16c (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x1833c) #1 0x7fff4a327f57 in CFDataReplaceBytes+0x1ba (CoreFoundation:x86_64+0x13f57) #2 0x7fff4a415a44 in __CFDataInit+0x2db (CoreFoundation:x86_64+0x101a44) #3 0x1094f8490 in main main.m:424 #4 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084) 0x603000003da4 is located 0 bytes to the right of 20-byte region [0x603000003d90,0x603000003da4) allocated by thread T0 here: #0 0x109547c02 in wrap_calloc+0xa2 (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x43c02) #1 0x7fff763ad3ef in class_createInstance+0x52 (libobjc.A.dylib:x86_64+0x73ef) #2 0x7fff4c6b2d73 in NSAllocateObject+0x12 (Foundation:x86_64+0x1d73) #3 0x7fff4c6b5e5f in -[_NSPlaceholderData initWithBytes:length:copy:deallocator:]+0x40 (Foundation:x86_64+0x4e5f) #4 0x7fff4c6d4cf1 in -[NSData(NSData) initWithBytes:length:]+0x24 (Foundation:x86_64+0x23cf1) #5 0x1094f8245 in main main.m:404 #6 0x7fff77482084 in start+0x0 (libdyld.dylib:x86_64+0x17084) ``` The reason is that we create a string "HELLO" but get the size wrong (it's 5 bytes instead of 4). Later on we read the buffer and pretend it is 5 bytes long, causing an OOB read which ASAN detects. In general this test probably needs some cleanup as it produces on macOS 10.15 around 100 compiler warnings which isn't great, but let's first get the bot green.
felipepiovezan
added a commit
to felipepiovezan/llvm-project
that referenced
this pull request
Mar 14, 2025
Prior to this commit, line breakpoints that match multiple funclets were being filtered out to remove "Q" funclets from them. This generally works for simple "await foo()" expressions, but it is not a comprehensive solution, as it does not address the patterns emerging from `async let` and `await <async_let_variable>` statements. This commit generalizes the filtering algorithm: 1. Locations are bundled together based on the async function that generated them. 2. For each bundle, choose the funclet with the smallest "number" as per its mangling. To see why this is helpful, consider an `async let` statement like: `async let timestamp3 = getTimestamp(i: 44)` It creates 4 funclets: ``` 2.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 2.2: function = implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_ 2.3: function = (1) await resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TQ0_ 2.4: function = (2) suspend resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TY1_ ``` The first is for the LHS, the others are for the RHS expression and are only executed by the new Task. Only 2.1 and 2.2 should receive breakpoints. Likewise, a breakpoint on an `await <async_let_variable>` line would create 3 funclets: ``` 3.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 3.2: function = (4) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY3_ 3.3: function = (5) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY4_ ``` The first is for "before" the await, the other two for "after". Only the first should receive a breakpoint.
felipepiovezan
added a commit
to felipepiovezan/llvm-project
that referenced
this pull request
Mar 14, 2025
Prior to this commit, line breakpoints that match multiple funclets were being filtered out to remove "Q" funclets from them. This generally works for simple "await foo()" expressions, but it is not a comprehensive solution, as it does not address the patterns emerging from `async let` and `await <async_let_variable>` statements. This commit generalizes the filtering algorithm: 1. Locations are bundled together based on the async function that generated them. 2. For each bundle, choose the funclet with the smallest "number" as per its mangling. To see why this is helpful, consider an `async let` statement like: `async let timestamp3 = getTimestamp(i: 44)` It creates 4 funclets: ``` 2.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 2.2: function = implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_ 2.3: function = (1) await resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TQ0_ 2.4: function = (2) suspend resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TY1_ ``` The first is for the LHS, the others are for the RHS expression and are only executed by the new Task. Only 2.1 and 2.2 should receive breakpoints. Likewise, a breakpoint on an `await <async_let_variable>` line would create 3 funclets: ``` 3.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 3.2: function = (4) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY3_ 3.3: function = (5) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY4_ ``` The first is for "before" the await, the other two for "after". Only the first should receive a breakpoint.
felipepiovezan
added a commit
to felipepiovezan/llvm-project
that referenced
this pull request
Mar 15, 2025
Prior to this commit, line breakpoints that match multiple funclets were being filtered out to remove "Q" funclets from them. This generally works for simple "await foo()" expressions, but it is not a comprehensive solution, as it does not address the patterns emerging from `async let` and `await <async_let_variable>` statements. This commit generalizes the filtering algorithm: 1. Locations are bundled together based on the async function that generated them. 2. For each bundle, choose the funclet with the smallest "number" as per its mangling. To see why this is helpful, consider an `async let` statement like: `async let timestamp3 = getTimestamp(i: 44)` It creates 4 funclets: ``` 2.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 2.2: function = implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_ 2.3: function = (1) await resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TQ0_ 2.4: function = (2) suspend resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TY1_ ``` The first is for the LHS, the others are for the RHS expression and are only executed by the new Task. Only 2.1 and 2.2 should receive breakpoints. Likewise, a breakpoint on an `await <async_let_variable>` line would create 3 funclets: ``` 3.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 3.2: function = (4) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY3_ 3.3: function = (5) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY4_ ``` The first is for "before" the await, the other two for "after". Only the first should receive a breakpoint.
felipepiovezan
added a commit
to felipepiovezan/llvm-project
that referenced
this pull request
Mar 16, 2025
Prior to this commit, line breakpoints that match multiple funclets were being filtered out to remove "Q" funclets from them. This generally works for simple "await foo()" expressions, but it is not a comprehensive solution, as it does not address the patterns emerging from `async let` and `await <async_let_variable>` statements. This commit generalizes the filtering algorithm: 1. Locations are bundled together based on the async function that generated them. 2. For each bundle, choose the funclet with the smallest "number" as per its mangling. To see why this is helpful, consider an `async let` statement like: `async let timestamp3 = getTimestamp(i: 44)` It creates 4 funclets: ``` 2.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 2.2: function = implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_ 2.3: function = (1) await resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TQ0_ 2.4: function = (2) suspend resume partial function for implicit closure swiftlang#1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TY1_ ``` The first is for the LHS, the others are for the RHS expression and are only executed by the new Task. Only 2.1 and 2.2 should receive breakpoints. Likewise, a breakpoint on an `await <async_let_variable>` line would create 3 funclets: ``` 3.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 3.2: function = (4) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY3_ 3.3: function = (5) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY4_ ``` The first is for "before" the await, the other two for "after". Only the first should receive a breakpoint.
felipepiovezan
added a commit
that referenced
this pull request
Mar 18, 2025
Prior to this commit, line breakpoints that match multiple funclets were being filtered out to remove "Q" funclets from them. This generally works for simple "await foo()" expressions, but it is not a comprehensive solution, as it does not address the patterns emerging from `async let` and `await <async_let_variable>` statements. This commit generalizes the filtering algorithm: 1. Locations are bundled together based on the async function that generated them. 2. For each bundle, choose the funclet with the smallest "number" as per its mangling. To see why this is helpful, consider an `async let` statement like: `async let timestamp3 = getTimestamp(i: 44)` It creates 4 funclets: ``` 2.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 2.2: function = implicit closure #1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_ 2.3: function = (1) await resume partial function for implicit closure #1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TQ0_ 2.4: function = (2) suspend resume partial function for implicit closure #1 @sendable () async -> Swift.Int in test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFSiyYaYbcfu_TY1_ ``` The first is for the LHS, the others are for the RHS expression and are only executed by the new Task. Only 2.1 and 2.2 should receive breakpoints. Likewise, a breakpoint on an `await <async_let_variable>` line would create 3 funclets: ``` 3.1: function = (3) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY2_ 3.2: function = (4) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY3_ 3.3: function = (5) suspend resume partial function for test.some_other_async() async -> () mangled function = $s4test16some_other_asyncyyYaFTY4_ ``` The first is for "before" the await, the other two for "after". Only the first should receive a breakpoint. (cherry picked from commit de7263a)
Michael137
added a commit
that referenced
this pull request
Mar 20, 2025
These are macOS tests only and are currently failing on the x86_64 CI and on arm64 on recent versions of macOS/Xcode. The tests are failing because we're stopping in: ``` Process 17458 stopped * thread #1: tid = 0xbda69a, 0x00000002735bd000 libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000) ``` instead of the libsanitizers library. This seems to be related to `-fsanitize-trivial-abi` support Skip these for now until we figure out the root cause. (cherry picked from commit 6cc8b0b)
swift-ci
pushed a commit
that referenced
this pull request
Mar 21, 2025
… pointers (llvm#132261) Currently, the helpers to get fir::ExtendedValue out of hlfir::Entity use hlfir.declare second result (`#1`) in most cases. This is because this result is the same as the input and matches what FIR was getting before lowering to HLFIR. But this creates odd situations when both hlfir.declare are raw pointers and either result ends-up being used in the IR depending on whether the code was generated by a helper using fir::ExtendedValue, or via "pure HLFIR" helpers using the first result. This will typically prevent simple CSE and easy identification that two operation (e.g load/store) are touching the exact same memory location without using alias analysis or "manual detection" (looking for common hlfir.declare defining op). Hence, when hlfir.declare results are both raw pointers, use `#0` when producing `fir::ExtendedValue`. When `#0` is a fir.box, keep using `#1` because these are not the same. The only code change is in HLFIRTools.cpp and is pretty small, but there is a big test fallout of `#1` to `#0`.
swift-ci
pushed a commit
that referenced
this pull request
Mar 21, 2025
…too. (llvm#132267) Observed in Wine when trying to intercept `ExitThread`, which forwards to `ntdll.RtlExitUserThread`. `gdb` interprets it as `xchg %ax,%ax`. `llvm-mc` outputs simply `nop`. ``` ==Asan-i386-calls-Dynamic-Test.exe==964==interception_win: unhandled instruction at 0x7be27cf0: 66 90 55 89 e5 56 50 8b ``` ``` Wine-gdb> bt #0 0x789a1766 in __interception::GetInstructionSize (address=<optimized out>, rel_offset=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:983 #1 0x789ab480 in __sanitizer::SharedPrintfCode(bool, char const*, char*) () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp:311 #2 0x789a18e7 in __interception::OverrideFunctionWithHotPatch (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1118 #3 0x789a1f34 in __interception::OverrideFunction (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1224 #4 0x789a24ce in __interception::OverrideFunction (func_name=0x78a0bc43 <vtable for __asan::AsanThreadContext+1163> "ExitThread", new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1369 #5 0x789f40ef in __asan::InitializePlatformInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_win.cpp:190 #6 0x789e0c3c in __asan::InitializeAsanInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:802 #7 0x789ee6b5 in __asan::AsanInitInternal () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:442 #8 0x789eefb0 in __asan::AsanInitFromRtl () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:522 #9 __asan::AsanInitializer::AsanInitializer (this=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:542 #10 __cxx_global_var_init () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:546 ... Wine-gdb> disassemble /r 2078440688,2078440688+20 Dump of assembler code from 0x7be27cf0 to 0x7be27d04: 0x7be27cf0 <_RtlExitUserThread@4+0>: 66 90 xchg %ax,%ax ... ```
Michael137
added a commit
that referenced
this pull request
Mar 24, 2025
These are macOS tests only and are currently failing on the x86_64 CI and on arm64 on recent versions of macOS/Xcode. The tests are failing because we're stopping in: ``` Process 17458 stopped * thread #1: tid = 0xbda69a, 0x00000002735bd000 libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000) ``` instead of the libsanitizers library. This seems to be related to `-fsanitize-trivial-abi` support Skip these for now until we figure out the root cause. (cherry picked from commit 6cc8b0b)
Michael137
added a commit
that referenced
this pull request
Mar 24, 2025
These are macOS tests only and are currently failing on the x86_64 CI and on arm64 on recent versions of macOS/Xcode. The tests are failing because we're stopping in: ``` Process 17458 stopped * thread #1: tid = 0xbda69a, 0x00000002735bd000 libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000) ``` instead of the libsanitizers library. This seems to be related to `-fsanitize-trivial-abi` support Skip these for now until we figure out the root cause. (cherry picked from commit 6cc8b0b)
swift-ci
pushed a commit
that referenced
this pull request
Apr 4, 2025
…d A520 (llvm#132246) Inefficient SVE codegen occurs on at least two in-order cores, those being Cortex-A510 and Cortex-A520. For example a simple vector add ``` void foo(float a, float b, float dst, unsigned n) { for (unsigned i = 0; i < n; ++i) dst[i] = a[i] + b[i]; } ``` Vectorizes the inner loop into the following interleaved sequence of instructions. ``` add x12, x1, x10 ld1b { z0.b }, p0/z, [x1, x10] add x13, x2, x10 ld1b { z1.b }, p0/z, [x2, x10] ldr z2, [x12, #1, mul vl] ldr z3, [x13, #1, mul vl] dech x11 add x12, x0, x10 fadd z0.s, z1.s, z0.s fadd z1.s, z3.s, z2.s st1b { z0.b }, p0, [x0, x10] addvl x10, x10, #2 str z1, [x12, #1, mul vl] ``` By adjusting the target features to prefer fixed over scalable if the cost is equal we get the following vectorized loop. ``` ldp q0, q3, [x11, #-16] subs x13, x13, #8 ldp q1, q2, [x10, #-16] add x10, x10, #32 add x11, x11, #32 fadd v0.4s, v1.4s, v0.4s fadd v1.4s, v2.4s, v3.4s stp q0, q1, [x12, #-16] add x12, x12, #32 ``` Which is more efficient.
swift-ci
pushed a commit
that referenced
this pull request
Apr 7, 2025
… A510/A520 (llvm#134606) Recommit. This work was done by llvm#132246 but failed buildbots due to the test introduced needing updates Inefficient SVE codegen occurs on at least two in-order cores, those being Cortex-A510 and Cortex-A520. For example a simple vector add ``` void foo(float a, float b, float dst, unsigned n) { for (unsigned i = 0; i < n; ++i) dst[i] = a[i] + b[i]; } ``` Vectorizes the inner loop into the following interleaved sequence of instructions. ``` add x12, x1, x10 ld1b { z0.b }, p0/z, [x1, x10] add x13, x2, x10 ld1b { z1.b }, p0/z, [x2, x10] ldr z2, [x12, #1, mul vl] ldr z3, [x13, #1, mul vl] dech x11 add x12, x0, x10 fadd z0.s, z1.s, z0.s fadd z1.s, z3.s, z2.s st1b { z0.b }, p0, [x0, x10] addvl x10, x10, #2 str z1, [x12, #1, mul vl] ``` By adjusting the target features to prefer fixed over scalable if the cost is equal we get the following vectorized loop. ``` ldp q0, q3, [x11, #-16] subs x13, x13, #8 ldp q1, q2, [x10, #-16] add x10, x10, #32 add x11, x11, #32 fadd v0.4s, v1.4s, v0.4s fadd v1.4s, v2.4s, v3.4s stp q0, q1, [x12, #-16] add x12, x12, #32 ``` Which is more efficient.
swift-ci
pushed a commit
that referenced
this pull request
Apr 9, 2025
…s=128. (llvm#134068) When compiling with -msve-vector-bits=128 or vscale_range(1, 1) and when the offsets allow it, we can pair SVE LDR/STR instructions into Neon LDP/STP. For example, given: ```cpp #include <arm_sve.h> void foo(double const *ldp, double *stp) { svbool_t pg = svptrue_b64(); svfloat64_t ld1 = svld1_f64(pg, ldp); svfloat64_t ld2 = svld1_f64(pg, ldp+svcntd()); svst1_f64(pg, stp, ld1); svst1_f64(pg, stp+svcntd(), ld2); } ``` When compiled with `-msve-vector-bits=128`, we currently generate: ```gas foo: ldr z0, [x0] ldr z1, [x0, #1, mul vl] str z0, [x1] str z1, [x1, #1, mul vl] ret ``` With this patch, we instead generate: ```gas foo: ldp q0, q1, [x0] stp q0, q1, [x1] ret ``` This is an alternative, more targetted approach to llvm#127500.
swift-ci
pushed a commit
that referenced
this pull request
Apr 9, 2025
…ctor-bits=128." (llvm#134997) Reverts llvm#134068 Caused a stage 2 build failure: https://lab.llvm.org/buildbot/#/builders/41/builds/6016 ``` FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp Opcode has unknown scale! UNREACHABLE executed at ../llvm/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:4530! PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage2/include -I/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/include -mcpu=neoverse-512tvb -mllvm -scalable-vectorization=preferred -mllvm -treat-scalable-fixed-error-as-warning=false -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17 -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/Caching.cpp.o -c /home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module '/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/llvm/llvm/lib/Support/Caching.cpp'. 4. Running pass 'AArch64 load / store optimization pass' on function '@"_ZNSt17_Function_handlerIFN4llvm8ExpectedISt8functionIFNS1_ISt10unique_ptrINS0_16CachedFileStreamESt14default_deleteIS4_EEEEjRKNS0_5TwineEEEEEjNS0_9StringRefESB_EZNS0_10localCacheESB_SB_SB_S2_IFvjSB_S3_INS0_12MemoryBufferES5_ISH_EEEEE3$_0E9_M_invokeERKSt9_Any_dataOjOSF_SB_"' #0 0x0000b6eae9b67bf0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c7bf0) #1 0x0000b6eae9b65aec llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x81c5aec) #2 0x0000b6eae9acd5f4 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0 #3 0x0000f16c1aff28f8 (linux-vdso.so.1+0x8f8) #4 0x0000f16c1aacf1f0 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #5 0x0000f16c1aa8a67c gsignal ./signal/../sysdeps/posix/raise.c:27:6 #6 0x0000f16c1aa77130 abort ./stdlib/abort.c:81:7 #7 0x0000b6eae9ad6628 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x8136628) #8 0x0000b6eae72e95a8 (/home/tcwg-buildbot/worker/clang-aarch64-sve-vla-2stage/stage1.install/bin/clang+++0x59495a8) #9 0x0000b6eae74ca9a8 (anonymous namespace)::AArch64LoadStoreOpt::findMatchingInsn(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, (anonymous namespace)::LdStPairFlags&, unsigned int, bool) AArch64LoadStoreOptimizer.cpp:0:0 #10 0x0000b6eae74c85a8 (anonymous namespace)::AArch64LoadStoreOpt::tryToPairLdStInst(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) AArch64LoadStoreOptimizer.cpp:0:0 #11 0x0000b6eae74c624c (anonymous namespace)::AArch64LoadStoreOpt::optimizeBlock(llvm::MachineBasicBlock&, bool) AArch64LoadStoreOptimizer.cpp:0:0 #12 0x0000b6eae74c429c (anonymous namespace)::AArch64LoadStoreOpt::runOnMachineFunction(llvm::MachineFunction&) AArch64LoadStoreOptimizer.cpp:0:0 ```
swift-ci
pushed a commit
that referenced
this pull request
Apr 13, 2025
…vailable (llvm#135343) When a frame is inlined, LLDB will display its name in backtraces as follows: ``` * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3 * frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42 frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37 frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15 frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5 frame #4: 0x000000026eb29ab8 dyld`start + 6812 ``` The longer the names get the more confusing this gets because the first function name that appears is the parent frame. My assumption (which may need some more surveying) is that for the majority of cases we only care about the actual frame name (not the parent). So this patch removes all the special logic that prints the parent frame. Another quirk of the current format is that the inlined frame name does not abide by the `${function.name-XXX}` format variables. We always just print the raw demangled name. With this patch, we would format the inlined frame name according to the `frame-format` setting (see the test-cases). If we really want to have the `parentFrame [inlined] inlinedFrame` format, we could expose it through a new `frame-format` variable (e..g., `${function.inlined-at-name}` and let the user decide where to place things.
swift-ci
pushed a commit
that referenced
this pull request
Apr 17, 2025
Currently, given: ```cpp uint64_t incb(uint64_t x) { return x+svcntb(); } ``` LLVM generates: ```gas incb: addvl x0, x0, #1 ret ``` Which is equivalent to: ```gas incb: incb x0 ret ``` However, on microarchitectures like the Neoverse V2 and Neoverse V3, the second form (with INCB) can have significantly better latency and throughput (according to their SWOG). On the Neoverse V2, for example, ADDVL has a latency and throughput of 2, whereas some forms of INCB have a latency of 1 and a throughput of 4. The same applies to DECB. This patch adds patterns to prefer the cheaper INCB/DECB forms over ADDVL where applicable.
swift-ci
pushed a commit
that referenced
this pull request
Apr 18, 2025
- Avoid dereferencing the end() iterator to get the end pointer, instead calculate it explicitly - Fixes a regression introduced in llvm#136220. - The windows build failure shows the following call stack: ``` | Exception Code: 0x80000003 | #0 0x00007ff74bc05897 std::_Vector_const_iterator<class std::_Vector_val<struct std::_Simple_types<unsigned char>>>::operator*(void) const C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.37.32822\include\vector:52:0 | #1 0x00007ff74bbd3d64 `anonymous namespace'::DecoderEmitter::emitTable D:\buildbot\llvm-worker\clang-cmake-x86_64-avx512-win\llvm\llvm\utils\TableGen\DecoderEmitter.cpp:852:0 ```
swift-ci
pushed a commit
that referenced
this pull request
Apr 25, 2025
… collection (llvm#136795) Fix a [test failure](llvm#136236 (comment)) in llvm#136236, apply a minor renaming of statistics, and remerge. See details below. # Changes in llvm#136236 Currently, `DebuggerStats::ReportStatistics()` calls `Module::GetSymtab(/*can_create=*/false)`, but then the latter calls `SymbolFile::GetSymtab()`. This will load symbols if haven't yet. See stacktrace below. The problem is that `DebuggerStats::ReportStatistics` should be read-only. This is especially important because it reports stats for symtab parsing/indexing time, which could be affected by the reporting itself if it's not read-only. This patch fixes this problem by adding an optional parameter `SymbolFile::GetSymtab(bool can_create = true)` and receiving the `false` value passed down from `Module::GetSymtab(/*can_create=*/false)` when the call is initiated from `DebuggerStats::ReportStatistics()`. --- Notes about the following stacktrace: 1. This can be reproduced. Create a helloworld program on **macOS** with dSYM, add `settings set target.preload-symbols false` to `~/.lldbinit`, do `lldb a.out`, then `statistics dump`. 2. `ObjectFile::GetSymtab` has `llvm::call_once`. So the fact that it called into `ObjectFileMachO::ParseSymtab` means that the symbol table is actually being parsed. ``` (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = step over frame #0: 0x0000000124c4d5a0 LLDB`ObjectFileMachO::ParseSymtab(this=0x0000000111504e40, symtab=0x0000600000a05e00) at ObjectFileMachO.cpp:2259:44 * frame #1: 0x0000000124fc50a0 LLDB`lldb_private::ObjectFile::GetSymtab()::$_0::operator()(this=0x000000016d35c858) const at ObjectFile.cpp:761:9 frame #5: 0x0000000124fc4e68 LLDB`void std::__1::__call_once_proxy[abi:v160006]<std::__1::tuple<lldb_private::ObjectFile::GetSymtab()::$_0&&>>(__vp=0x000000016d35c7f0) at mutex:652:5 frame #6: 0x0000000198afb99c libc++.1.dylib`std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) + 196 frame #7: 0x0000000124fc4dd0 LLDB`void std::__1::call_once[abi:v160006]<lldb_private::ObjectFile::GetSymtab()::$_0>(__flag=0x0000600003920080, __func=0x000000016d35c858) at mutex:670:9 frame #8: 0x0000000124fc3cb0 LLDB`void llvm::call_once<lldb_private::ObjectFile::GetSymtab()::$_0>(flag=0x0000600003920080, F=0x000000016d35c858) at Threading.h:88:5 frame #9: 0x0000000124fc2bc4 LLDB`lldb_private::ObjectFile::GetSymtab(this=0x0000000111504e40) at ObjectFile.cpp:755:5 frame #10: 0x0000000124fe0a28 LLDB`lldb_private::SymbolFileCommon::GetSymtab(this=0x0000000104865200) at SymbolFile.cpp:158:39 frame #11: 0x0000000124d8fedc LLDB`lldb_private::Module::GetSymtab(this=0x00000001113041a8, can_create=false) at Module.cpp:1027:21 frame #12: 0x0000000125125bdc LLDB`lldb_private::DebuggerStats::ReportStatistics(debugger=0x000000014284d400, target=0x0000000115808200, options=0x000000014195d6d1) at Statistics.cpp:329:30 frame #13: 0x0000000125672978 LLDB`CommandObjectStatsDump::DoExecute(this=0x000000014195d540, command=0x000000016d35d820, result=0x000000016d35e150) at CommandObjectStats.cpp:144:18 frame #14: 0x0000000124f29b40 LLDB`lldb_private::CommandObjectParsed::Execute(this=0x000000014195d540, args_string="", result=0x000000016d35e150) at CommandObject.cpp:832:9 frame #15: 0x0000000124efbd70 LLDB`lldb_private::CommandInterpreter::HandleCommand(this=0x0000000141b22f30, command_line="statistics dump", lazy_add_to_history=eLazyBoolCalculate, result=0x000000016d35e150, force_repeat_command=false) at CommandInterpreter.cpp:2134:14 frame #16: 0x0000000124f007f4 LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(this=0x0000000141b22f30, io_handler=0x00000001419b2aa8, line="statistics dump") at CommandInterpreter.cpp:3251:3 frame #17: 0x0000000124d7b5ec LLDB`lldb_private::IOHandlerEditline::Run(this=0x00000001419b2aa8) at IOHandler.cpp:588:22 frame #18: 0x0000000124d1e8fc LLDB`lldb_private::Debugger::RunIOHandlers(this=0x000000014284d400) at Debugger.cpp:1225:16 frame #19: 0x0000000124f01f74 LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(this=0x0000000141b22f30, options=0x000000016d35e63c) at CommandInterpreter.cpp:3543:16 frame #20: 0x0000000122840294 LLDB`lldb::SBDebugger::RunCommandInterpreter(this=0x000000016d35ebd8, auto_handle_events=true, spawn_thread=false) at SBDebugger.cpp:1212:42 frame #21: 0x0000000102aa6d28 lldb`Driver::MainLoop(this=0x000000016d35ebb8) at Driver.cpp:621:18 frame #22: 0x0000000102aa75b0 lldb`main(argc=1, argv=0x000000016d35f548) at Driver.cpp:829:26 frame #23: 0x0000000198858274 dyld`start + 2840 ``` # Changes in this PR top of the above Fix a [test failure](llvm#136236 (comment)) in `TestStats.py`. The original version of the added test checks that all modules have symbol count zero when `target.preload-symbols == false`. The test failed on macOS. Due to various reasons, on macOS, symbols can be loaded for dylibs even with that setting, but not for the main module. For now, the fix of the test is to limit the assertion to only the main module. The test now passes on macOS. In the future, when we have a way to control a specific list of plug-ins to be loaded, there may be a configuration that this test can use to assert that all modules have symbol count zero. Apply a minor renaming of statistics, per the [suggestion](llvm#136226 (comment)) in llvm#136226 after merge.
swift-ci
pushed a commit
that referenced
this pull request
Apr 25, 2025
…mbolConjured" (llvm#137304) Reverts llvm#128251 ASAN bots reported some errors: https://lab.llvm.org/buildbot/#/builders/55/builds/10398 Reverting for investigation. ``` Failed Tests (6): Clang :: Analysis/loop-widening-ignore-static-methods.cpp Clang :: Analysis/loop-widening-notes.cpp Clang :: Analysis/loop-widening-preserve-reference-type.cpp Clang :: Analysis/loop-widening.c Clang :: Analysis/loop-widening.cpp Clang :: Analysis/this-pointer.cpp Testing Time: 411.55s Total Discovered Tests: 118563 Skipped : 33 (0.03%) Unsupported : 2015 (1.70%) Passed : 116291 (98.08%) Expectedly Failed: 218 (0.18%) Failed : 6 (0.01%) FAILED: CMakeFiles/check-all /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/CMakeFiles/check-all cd /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan && /usr/bin/python3 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/./bin/llvm-lit -sv --param USE_Z3_SOLVER=0 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/mlgo-utils /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/mlir/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/lit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test ninja: build stopped: subcommand failed. ``` ``` /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c # RUN: at line 1 + /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c 1. <eof> parser at end of file 2. While analyzing stack: #0 Calling nested_loop_inner_widen #0 0x0000c894cca289cc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13 #1 0x0000c894cca23324 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 0x0000c894cca29bbc SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 0x0000f6898da4a8f8 (linux-vdso.so.1+0x8f8) #4 0x0000f6898d377608 (/lib/aarch64-linux-gnu/libc.so.6+0x87608) #5 0x0000f6898d32cb3c raise (/lib/aarch64-linux-gnu/libc.so.6+0x3cb3c) #6 0x0000f6898d317e00 abort (/lib/aarch64-linux-gnu/libc.so.6+0x27e00) #7 0x0000c894c5e77fec __sanitizer::Atexit(void (*)()) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:168:10 #8 0x0000c894c5e76680 __sanitizer::Die() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5 #9 0x0000c894c5e69650 Unlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:250:16 #10 0x0000c894c5e69650 ~GenericScopedLock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:386:51 #11 0x0000c894c5e69650 __hwasan::ScopedReport::~ScopedReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:54:5 #12 0x0000c894c5e68de0 __hwasan::(anonymous namespace)::BaseReport::~BaseReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:476:7 #13 0x0000c894c5e66b74 __hwasan::ReportTagMismatch(__sanitizer::StackTrace*, unsigned long, unsigned long, bool, bool, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:1091:1 #14 0x0000c894c5e52cf8 Destroy /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:532:31 #15 0x0000c894c5e52cf8 ~InternalMmapVector /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:642:56 #16 0x0000c894c5e52cf8 __hwasan::HandleTagMismatch(__hwasan::AccessInfo, unsigned long, unsigned long, void*, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:245:1 #17 0x0000c894c5e551c8 __hwasan_tag_mismatch4 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:764:1 #18 0x0000c894c5e6a2f8 __interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/interception/interception_linux.cpp:60:0 #19 0x0000c894d166f664 getBlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:217:45 #20 0x0000c894d166f664 getCFGElementRef /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:230:59 #21 0x0000c894d166f664 clang::ento::ExprEngine::processCFGBlockEntrance(clang::BlockEdge const&, clang::ento::NodeBuilderWithSinks&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:2570:45 #22 0x0000c894d15f3a1c hasGeneratedNodes /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:333:37 #23 0x0000c894d15f3a1c clang::ento::CoreEngine::HandleBlockEdge(clang::BlockEdge const&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:319:20 #24 0x0000c894d15f2c34 clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:220:7 #25 0x0000c894d15f2398 operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/unique_ptr.h:267:101 #26 0x0000c894d15f2398 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>)::$_0::operator()(unsigned int) const /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:140:12 #27 0x0000c894d15f14b4 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:165:7 #28 0x0000c894d0ebb9dc release /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:232:9 #29 0x0000c894d0ebb9dc ~IntrusiveRefCntPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:196:27 #30 0x0000c894d0ebb9dc ExecuteWorkList /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:192:5 #31 0x0000c894d0ebb9dc RunPathSensitiveChecks /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:772:7 #32 0x0000c894d0ebb9dc (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:741:5 #33 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseMap.h:0:0 #34 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseSet.h:187:45 #35 0x0000c894d0eb6ee4 HandleDeclsCallGraph /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:516:29 #36 0x0000c894d0eb6ee4 runAnalysisOnTranslationUnit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:584:5 #37 0x0000c894d0eb6ee4 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:647:3 #38 0x0000c894d18a7a38 clang::ParseAST(clang::Sema&, bool, bool) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:0:13 #39 0x0000c894ce81ed70 clang::FrontendAction::Execute() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1231:10 #40 0x0000c894ce6f2144 getPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:278:42 #41 0x0000c894ce6f2144 operator bool /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:241:16 #42 0x0000c894ce6f2144 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1058:23 #43 0x0000c894cea718cc operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/shared_ptr.h:635:12 #44 0x0000c894cea718cc getFrontendOpts /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/Frontend/CompilerInstance.h:307:12 #45 0x0000c894cea718cc clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:301:14 #46 0x0000c894c5e9cf28 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/cc1_main.cpp:294:15 #47 0x0000c894c5e92a9c ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:223:12 #48 0x0000c894c5e902ac clang_main(int, char**, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:0:12 #49 0x0000c894c5eb2e34 main /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/tools/driver/clang-driver.cpp:17:3 #50 0x0000f6898d3184c4 (/lib/aarch64-linux-gnu/libc.so.6+0x284c4) #51 0x0000f6898d318598 __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28598) #52 0x0000c894c5e52a30 _start (/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang+0x6512a30) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test/Analysis/Output/loop-widening.c.script: line 2: 2870204 Aborted /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c ```
swift-ci
pushed a commit
that referenced
this pull request
Apr 26, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault, instead of reporting a clean error ``` (base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda #0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc) #1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc) #2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628) #3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4) #4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8) #7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8) #8 0x000000019ae8c274 Segmentation fault: 11 ``` The underlying issue was that the `DeviceCompilerInstance` (used for device-side CUDA compilation) was never initialized with a `Sema`, which is required before constructing the `IncrementalCUDADeviceParser`. https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32 https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31 Unlike the host-side `CompilerInstance` which runs `ExecuteAction` inside the Interpreter constructor (thereby setting up Sema), the device-side CI was passed into the parser uninitialized, leading to an assertion or crash when accessing its internals. To fix this, I refactored the `Interpreter::create` method to include an optional `DeviceCI` parameter. If provided, we know we need to take care of this instance too. Only then do we construct the `IncrementalCUDADeviceParser`.
swift-ci
pushed a commit
that referenced
this pull request
Apr 26, 2025
This were failing on Windows CI with errors like: ``` 22: (lldb) bt 23: * thread #1, stop reason = breakpoint 1.1 24: frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2 25: frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3 26: frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5 27: custom-frame '()' !~~~~~~~~~~~ error: no match expected 28: custom-frame '(__formal=<unavailable>)' ```
Michael137
added a commit
that referenced
this pull request
Apr 26, 2025
…vailable (llvm#135343) When a frame is inlined, LLDB will display its name in backtraces as follows: ``` * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3 * frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42 frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37 frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15 frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5 frame #4: 0x000000026eb29ab8 dyld`start + 6812 ``` The longer the names get the more confusing this gets because the first function name that appears is the parent frame. My assumption (which may need some more surveying) is that for the majority of cases we only care about the actual frame name (not the parent). So this patch removes all the special logic that prints the parent frame. Another quirk of the current format is that the inlined frame name does not abide by the `${function.name-XXX}` format variables. We always just print the raw demangled name. With this patch, we would format the inlined frame name according to the `frame-format` setting (see the test-cases). If we really want to have the `parentFrame [inlined] inlinedFrame` format, we could expose it through a new `frame-format` variable (e..g., `${function.inlined-at-name}` and let the user decide where to place things. (cherry picked from commit 1e153b7)
Michael137
added a commit
that referenced
this pull request
Apr 26, 2025
This were failing on Windows CI with errors like: ``` 22: (lldb) bt 23: * thread #1, stop reason = breakpoint 1.1 24: frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2 25: frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3 26: frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5 27: custom-frame '()' !~~~~~~~~~~~ error: no match expected 28: custom-frame '(__formal=<unavailable>)' ``` (cherry picked from commit cc0bdb3)
Michael137
added a commit
that referenced
this pull request
Apr 26, 2025
This were failing on Windows CI with errors like: ``` 22: (lldb) bt 23: * thread #1, stop reason = breakpoint 1.1 24: frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2 25: frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3 26: frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5 27: custom-frame '()' !~~~~~~~~~~~ error: no match expected 28: custom-frame '(__formal=<unavailable>)' ``` (cherry picked from commit cc0bdb3)
Michael137
added a commit
that referenced
this pull request
Apr 26, 2025
This were failing on Windows CI with errors like: ``` 22: (lldb) bt 23: * thread #1, stop reason = breakpoint 1.1 24: frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2 25: frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3 26: frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5 27: custom-frame '()' !~~~~~~~~~~~ error: no match expected 28: custom-frame '(__formal=<unavailable>)' ``` (cherry picked from commit cc0bdb3)
Michael137
added a commit
that referenced
this pull request
Apr 27, 2025
This were failing on Windows CI with errors like: ``` 22: (lldb) bt 23: * thread #1, stop reason = breakpoint 1.1 24: frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2 25: frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3 26: frame #2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5 27: custom-frame '()' !~~~~~~~~~~~ error: no match expected 28: custom-frame '(__formal=<unavailable>)' ``` (cherry picked from commit cc0bdb3)
swift-ci
pushed a commit
that referenced
this pull request
Apr 29, 2025
… without debug-info" (llvm#137757) Reverts llvm#137408 This change broke `lldb/test/Shell/Unwind/split-machine-functions.test`. The test binary has a symbol named `_Z3foov.cold` and the test expects the backtrace to print the name of the cold part of the function like this: ``` # SPLIT: frame #1: {{.*}}`foo() (.cold) + ``` but now it gets ``` frame #1: 0x000055555555514f split-machine-functions.test.tmp`foo() + 12 ```
swift-ci
pushed a commit
that referenced
this pull request
Apr 29, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault, instead of reporting a clean error ``` (base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda #0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc) #1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc) #2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628) #3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4) #4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0) #6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8) #7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8) #8 0x000000019ae8c274 Segmentation fault: 11 ``` The underlying issue was that the `DeviceCompilerInstance` (used for device-side CUDA compilation) was never initialized with a `Sema`, which is required before constructing the `IncrementalCUDADeviceParser`. https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32 https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31 Unlike the host-side `CompilerInstance` which runs `ExecuteAction` inside the Interpreter constructor (thereby setting up Sema), the device-side CI was passed into the parser uninitialized, leading to an assertion or crash when accessing its internals. To fix this, I refactored the `Interpreter::create` method to include an optional `DeviceCI` parameter. If provided, we know we need to take care of this instance too. Only then do we construct the `IncrementalCUDADeviceParser`. (cherry picked from commit 21fb19f)
swift-ci
pushed a commit
that referenced
this pull request
May 2, 2025
…gger memory is updated (llvm#129092)" This reverts commit daa4061. Original PR llvm#129092. I have restricted the test to X86 Windows because it turns out the only reason that `expr x.get()` would change m_memory_id is that on x86 we have to write the return address to the stack in ABIWindows_X86_64::PrepareTrivialCall: ``` // Save return address onto the stack if (!process_sp->WritePointerToMemory(sp, return_addr, error)) return false; ``` This is not required on AArch64 so m_memory_id was not changed: ``` (lldb) expr x.get() (int) $0 = 0 (lldb) process status -d Process 15316 stopped * thread #1, stop reason = Exception 0x80000003 encountered at address 0x7ff764a31034 frame #0: 0x00007ff764a31038 TestProcessModificationIdOnExpr.cpp.tmp`main at TestProcessModificationIdOnExpr.cpp:35 32 __builtin_debugtrap(); 33 __builtin_debugtrap(); 34 return 0; -> 35 } 36 37 // CHECK-LABEL: process status -d 38 // CHECK: m_stop_id: 2 ProcessModID: m_stop_id: 3 m_last_natural_stop_id: 0 m_resume_id: 0 m_memory_id: 0 ``` Really we should find a better way to force a memory write here, but I can't think of one right now.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.