-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Miscompiled C++ code for mingw/x86_64 target #64253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Attached is a less obfuscated version of the reduced testcase, where the symbol names haven't been changed - making it hopefully slightly more understandable, and having more likeness to the libc++ original: |
CC @efriedma-quic who commented on this before in https://reviews.llvm.org/D135462 - hopefully you can help at least with looping in the right people to look at this. I've got a fair bit of data pointing towards this being a Clang/frontend issue, not specific to the codegen backends. CC @AaronBallman who hopefully is good at spotting a bug or might know who might be good at it. And CC @rnk who generally is knowledgeable on the Windows specific quirks. To look closer at the issue, I've split the testcase into three files; If compiled for a mingw target, we get the failure: $ clang -target x86_64-w64-mingw32 -std=c++2b get.cpp get-main.cpp -o get.exe; ./get.exe
rv[0].x = 0 - incorrect With optimization, it is ok: $ clang -target x86_64-w64-mingw32 -std=c++2b get.cpp get-main.cpp -o get.exe -O2; ./get.exe
rv[0].x = 111 - ok If we split out generating the unoptimized IR for $ clang -target x86_64-w64-mingw32 -std=c++2b get.cpp -S -emit-llvm -Xclang -disable-llvm-passes -o get.ll
$ llc -mtriple=x86_64-w64-mingw32 get.ll -filetype obj -o get.o
$ clang -target x86_64-w64-mingw32 get.o get-main.cpp -o get.exe; ./get.exe
rv[0].x = 0 - incorrect At this point, the generated IR in To further show that the fault lies in the IR from Generate IR for an $ clang -target x86_64-w64-mingw32 get.cpp -std=c++2b -S -emit-llvm -Xclang -disable-llvm-passes -o get.ll
$ llc -mtriple=x86_64-linux-gnu get.ll -filetype obj -o get.o
$ clang -target x86_64-linux-gnu get.o get-main.cpp -o get.exe -Wl,--no-pie; ./get.exe
rv[0].x = 0 - incorrect Also doing the same in reverse - generating IR for $ clang -target x86_64-linux-gnu get.cpp -std=c++2b -S -emit-llvm -Xclang -disable-llvm-passes -o get.ll
$ llc -mtriple=x86_64-w64-mingw32 get.ll -filetype obj -o get.o
$ clang -target x86_64-w64-mingw32 get.o get-main.cpp -o get.exe; ./get.exe
rv[0].x = 111 - ok To make things more interesting, let's try compiling $ clang -target x86_64-windows-msvc get.cpp -std=c++2b -S -emit-llvm -Xclang -disable-llvm-passes -o get.ll
In file included from get.cpp:4:
./range.cpp:136:5: warning: unknown attribute '__no_unique_address__' ignored [-Wunknown-attributes]
136 | [[__no_unique_address__]] _Bound __bound_;
| ^~~~~~~~~~~~~~~~~~~~~
1 warning generated.
$ llc -mtriple=x86_64-w64-mingw32 get.ll -filetype obj -o get.o
$ clang -target x86_64-w64-mingw32 get.o get-main.cpp -o get.exe; ./get.exe
rv[0].x = 111 - ok The Further attaching The same issue also manifests if generating the IR with an older (16 release) version of Clang but compiling it with a newer version of llc (past c65b4d6). Here's a Compiler Explorer link with the generated IR for this case: https://gcc.godbolt.org/z/KxxMxMhPa |
Maybe you can try selectively enabling/disabling the change from D135462 for specific functions, and see which change in code generation is actually causing the bug to show up? Looking at the generated code, I can't see any reason for the alignment to have any effect; the most likely issue is some sort of buffer overflow. |
D135462 is supposed to have no effect on generated LLVM IR - it changes SelectionDAG and below. You can dump assembly output from llc and compare it with and without the patch. Difference should only be around stack layout. Another approach to reduce this case is to specify stack alignment in source code using |
Although for this C++ code it is not clear where to put the attribute. It is probably easier to manually change alignment of |
Yep - it seems like the generated IR always (or at least since the 16 release) has had this flaw, and only recent changes expose the breakage. I'll see if it's possible to diff the generated IR between the working linux case and broken mingw case, or if there's too much differences to spot what might be relevant.
That sounds like a good idea! I guess I could also try to diff the output from
Thanks for the idea, I'll try that if I don't make progress other ways.
Yeah, it feels like something such... |
There's a lot of differences there, too much to spot something relevant right off the bat.
This turned out to give some fairly interesting pointers. Enabling the change from D135462 changed the outcome of the test when activated on the functions The latter of these has the following diff in output assembly: _ZNSt12_GLOBAL__N_122__make_from_tuple_implINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_: # @_ZNSt12_GLOBAL__N_122__make_from_tuple_implINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_
.seh_proc _ZNSt12_GLOBAL__N_122__make_from_tuple_implINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_
# %bb.0:
- subq $16, %rsp
- .seh_stackalloc 16
+ pushq %rax
+ .seh_stackalloc 8
.seh_endprologue
- movb %cl, (%rsp)
- movb 8(%rsp), %al
- addq $16, %rsp
+ movb %cl, 6(%rsp)
+ movb 7(%rsp), %al
+ popq %rcx
retq
.seh_endproc The only difference here is the layout of the stack variable. Other than that, the function seems to be doing total nonsense - it writes one byte from the input parameter into one location on the stack, This seems to revolve around empty classes. struct unreachable_sentinel_t {
} unreachable_sentinel; In the LLVM IR, it is defined like this: %"struct.std::(anonymous namespace)::unreachable_sentinel_t" = type { i8 } It is defined in the same way for both the mingw and linux cases. The mingw IR for this function looks like this: ; Function Attrs: mustprogress noinline nounwind optnone uwtable
define internal i8 @_ZNSt12_GLOBAL__N_122__make_from_tuple_implINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_(i8 %0) #2 {
%2 = alloca %"struct.std::(anonymous namespace)::unreachable_sentinel_t", align 1
%3 = alloca %"class.std::(anonymous namespace)::tuple.0", align 1
%4 = getelementptr inbounds %"class.std::(anonymous namespace)::tuple.0", ptr %3, i32 0, i32 0
%5 = getelementptr inbounds %"struct.std::(anonymous namespace)::__tuple_impl.1", ptr %4, i32 0, i32 0
store i8 %0, ptr %5, align 1
%6 = getelementptr inbounds %"struct.std::(anonymous namespace)::unreachable_sentinel_t", ptr %2, i32 0, i32 0
%7 = load i8, ptr %6, align 1
ret i8 %7
} For the linux case, the function does exist but looks entirely different: ; Function Attrs: mustprogress noinline nounwind optnone uwtable
define internal void @_ZNSt12_GLOBAL__N_122__make_from_tuple_implINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_() #2 {
%1 = alloca %"class.std::(anonymous namespace)::tuple.0", align 1
ret void
} So in the linux case, the clang codegen knows to skip loads/stores to this struct entirely, while on mingw it generates these dummy loads/stores, even if it's probably not meant to be included in allocations at all. I tried stepping around the code in a debugger, and managed to pinpoint the issue further. The smoking gun seems to reside in this function: ; Function Attrs: noinline optnone uwtable
define internal void @_ZNSt12_GLOBAL__N_16ranges11repeat_viewI1ANS_22unreachable_sentinel_tEEC2IJS3_EEEiNS_5tupleIJEEENS6_IJDpT_EEE(ptr noundef nonnull align 4 dereferenceable(8) %0, i32 noundef %1, i8 %2, i8 %3) unnamed_addr #1 align 2 {
%5 = alloca %"class.std::(anonymous namespace)::tuple", align 1
%6 = alloca %"class.std::(anonymous namespace)::tuple.0", align 1
%7 = alloca ptr, align 8
%8 = alloca i32, align 4
%9 = alloca %struct.A, align 4
%10 = alloca %"class.std::(anonymous namespace)::tuple", align 1
%11 = alloca %"class.std::(anonymous namespace)::tuple.0", align 1
%12 = getelementptr inbounds %"class.std::(anonymous namespace)::tuple", ptr %5, i32 0, i32 0
%13 = getelementptr inbounds %"struct.std::(anonymous namespace)::__tuple_impl", ptr %12, i32 0, i32 0
store i8 %2, ptr %13, align 1
%14 = getelementptr inbounds %"class.std::(anonymous namespace)::tuple.0", ptr %6, i32 0, i32 0
%15 = getelementptr inbounds %"struct.std::(anonymous namespace)::__tuple_impl.1", ptr %14, i32 0, i32 0
store i8 %3, ptr %15, align 1
store ptr %0, ptr %7, align 8
store i32 %1, ptr %8, align 4
%16 = load ptr, ptr %7, align 8
%17 = getelementptr inbounds %"class.std::(anonymous namespace)::ranges::repeat_view", ptr %16, i32 0, i32 0
call void @llvm.memcpy.p0.p0.i64(ptr align 1 %10, ptr align 1 %5, i64 1, i1 false)
%18 = getelementptr inbounds %"class.std::(anonymous namespace)::tuple", ptr %10, i32 0, i32 0
%19 = getelementptr inbounds %"struct.std::(anonymous namespace)::__tuple_impl", ptr %18, i32 0, i32 0
%20 = load i8, ptr %19, align 1
%21 = call i64 @_ZNSt12_GLOBAL__N_115make_from_tupleI1ANS_5tupleIJEEEEET_T0_(i8 %20)
store i64 %21, ptr %9, align 4
%22 = load i64, ptr %9, align 4
call void (ptr, ...) @_ZNSt12_GLOBAL__N_113__movable_boxI1AEC2Ez(ptr noundef nonnull align 4 dereferenceable(8) %17, i64 %22)
call void @llvm.memcpy.p0.p0.i64(ptr align 1 %11, ptr align 1 %6, i64 1, i1 false)
%23 = getelementptr inbounds %"class.std::(anonymous namespace)::tuple.0", ptr %11, i32 0, i32 0
%24 = getelementptr inbounds %"struct.std::(anonymous namespace)::__tuple_impl.1", ptr %23, i32 0, i32 0
%25 = load i8, ptr %24, align 1
%26 = call i8 @_ZNSt12_GLOBAL__N_115make_from_tupleINS_22unreachable_sentinel_tENS_5tupleIJS1_EEEEET_T0_(i8 %25)
%27 = getelementptr inbounds %"struct.std::(anonymous namespace)::unreachable_sentinel_t", ptr %16, i32 0, i32 0
store i8 %26, ptr %27, align 4
ret void
} It's not very obvious until after stepping through the code back and forth a dozen times in a debugger, but... The user data payload gets written into
Here we get another pointer When we were lucky before, this byte that was written here just by luck happened to be equal to the first byte written to I.e., this is a case of "How on earth did this thing work in the first place?". This boils down to a much more concrete Clang level question: Why doesn't mingw get the same treatment for empty classes like linux targets get here? That one byte dummy struct seem to be accounted for in some places, but seem to alias other valid data elsewhere, and writes to it clobber the desired payload data. |
Adding more observations after a brief discussion with @AaronBallman on irc. It does seem to be related to the So it is likely that some code relating to However the weird discrepancy in |
As far as I can tell, lowering The part that specifically results in weirdness, is when we try to construct a value of the empty struct type: it ends up storing the returned i8 to memory, in a location that doesn't actually have any backing storage.
|
Thanks, that's a great reduction of the issue! I wouldn't have gotten it reduced to that myself. Regarding mimicing weirdness that MSVC does; that's a good reason indeed. Also, as for Funnily enough, MSVC does seem to behave quite weirdly/inconsistently wrt this specific case as well. With your example above, compiled with
I.e. it takes the dummy one byte return from the If we, however, add
In this case, it suddenly writes the dummy one byte return value into the
Now suddenly, the dummy byte is written into the start of |
I haven't found anything really relevant for the x86_64 special case for returning empty classes yet... I've tried to grep around in tests to see if there's anything specific for that (if that could lead me back to a commit where the implementation was added), but it seems to have been in place for a very long time. At least according to Compiler Explorer, Clang 3.1 seems to have had this behaviour already. I did find 83a1258 which seems remotely related, but not quite. I noted that Clang 3.8 seems to have the same behaviour for struct S {};
S f() { return S(); } on i386 as for x86_64, while Clang 3.9 has got the linux-like |
I bisected this behavior change down to 380b224. |
Actually, it's apparently not entirely specific for Windows/x86_64 - the same happens for e.g. This tweak avoids the issue by ignoring empty structs in return values: diff --git a/clang/lib/CodeGen/TargetInfo.cpp b/clang/lib/CodeGen/TargetInfo.cpp
index 5bbf78a9b69e..5c435a5d001f 100644
--- a/clang/lib/CodeGen/TargetInfo.cpp
+++ b/clang/lib/CodeGen/TargetInfo.cpp
@@ -4292,6 +4292,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs,
if (Ty->isVoidType())
return ABIArgInfo::getIgnore();
+ // Ignore empty structs/unions.
+ if (IsReturnType && isEmptyRecord(getContext(), Ty, true))
+ return ABIArgInfo::getIgnore();
+
if (const EnumType *EnumTy = Ty->getAs<EnumType>())
Ty = EnumTy->getDecl()->getIntegerType();
I'm not sure if this is to be considered an ABI break or not - for functions that returned an empty struct, we previously returned nonsense data in the The better fix would be to avoid emitting a write at all, when the initialized class member doesn't have any storage (has got the |
I think this diff should fix the issue: diff --git a/clang/lib/CodeGen/CGClass.cpp b/clang/lib/CodeGen/CGClass.cpp
index 0795ea598411..cb637c084f88 100644
--- a/clang/lib/CodeGen/CGClass.cpp
+++ b/clang/lib/CodeGen/CGClass.cpp
@@ -710,6 +710,8 @@ void CodeGenFunction::EmitInitializerForField(FieldDecl *Field, LValue LHS,
getOverlapForFieldInit(Field), AggValueSlot::IsNotZeroed,
// Checks are made by the code that calls constructor.
AggValueSlot::IsSanitizerChecked);
+ if (Field->hasAttr<NoUniqueAddressAttr>())
+ Slot = AggValueSlot::ignored();
EmitAggExpr(Init, Slot);
break;
} The diff in generated LLVM IR looks like this: --- old.ll 2023-08-07 09:52:40.706588049 +0300
+++ new.ll 2023-08-07 09:53:11.022702973 +0300
@@ -12,13 +12,14 @@
define dso_local void @_ZN1ZC2Ev(ptr noundef nonnull align 4 dereferenceable(4) %this) unnamed_addr #0 align 2 {
entry:
%this.addr = alloca ptr, align 8
+ %coerce = alloca %struct.S, align 1
store ptr %this, ptr %this.addr, align 8
%this1 = load ptr, ptr %this.addr, align 8
%x = getelementptr inbounds %struct.Z, ptr %this1, i32 0, i32 0
store i32 111, ptr %x, align 4
%call = call i8 @_Z1fv()
- %coerce.dive = getelementptr inbounds %struct.S, ptr %this1, i32 0, i32 0
- store i8 %call, ptr %coerce.dive, align 4
+ %coerce.dive = getelementptr inbounds %struct.S, ptr %coerce, i32 0, i32 0
+ store i8 %call, ptr %coerce.dive, align 1
ret void
}
There's still a store, but to a safe unused alloca; with optimization, the dead store goes away. I'll try to make a proper patch to upstream this fix within a day or so; CC @crtrott @amy-kwan. |
Woohoo: good work tracking this down. Btw. my tests (and if I understand your findings correct, your diagnosis) indicate this has been an issue essentially forever right? Does that make this a candidate for backport, at least to a 16.0.7? |
@llvm/issue-subscribers-clang-codegen |
We're not currently planning any additional releases of 16.x, and because it isn't a regression I'm not certain if it should be backported to 17.x (I'd have to see the final patch and any fallout from it to know better on that, though). |
It does make a few tests in libcxx 17.x fail though. We would need to identify exactly which platforms are affected and disable those tests there I guess? I did disable the mdspan test on MinGW since we test that in CI, Power is not tested in CI, so currently on Power the libcxx tests will fail with the 17.x. |
The proper patch is up now at https://reviews.llvm.org/D157332. IMO the fix looks quite targeted and safe enough IMO (most platforms already treat it as a void and should be entirely unaffected, and the fix seems fairly safe to me). But anyway let's review it properly, then hopefully keep it in main for a few days, and then raise the question of whether we can backport it. It's not a regression, but it seems to be a longstanding flaw for a C++ code pattern that doesn't seem to have been common earlier, but seems to be becoming more common in recently added libccxx code. |
Reopening, as I'd like to have the fix in 17.x. |
/branch llvm/llvm-project-release-prs/issue64253 |
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm/llvm-project#64253, and possibly llvm/llvm-project#64077 and llvm/llvm-project#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332 (cherry picked from commit d60c3d0)
/pull-request llvm/llvm-project-release-prs#589 |
FYI, I reported the same issue as a bug in MSVC (when using |
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm/llvm-project#64253, and possibly llvm/llvm-project#64077 and llvm/llvm-project#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332 (cherry picked from commit d60c3d0)
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
An empty struct is handled as a struct with a dummy i8, on all targets. Most targets treat an empty struct return value as essentially void - but some don't. (Currently, at least x86_64-windows-* and powerpc64le-* don't treat it as void.) When intializing a struct with such a no_unique_address member, make sure we don't write the dummy i8 into the struct where there's no space allocated for it. Previously it would clobber the actual valid data of the struct. Fixes llvm#64253, and possibly llvm#64077 and llvm#64427 as well. We should omit the store for any empty record (not only ones declared with no_unique_address); we can have a situation where a class doesn't have the no_unique_address attribute, but is embedded in an outer struct with the no_unique_address attribute - like this: struct S {}; S f(); struct S2 : public S { S2();}; S2::S2() : S(f()) {} struct S3 { int x; [[no_unique_address]] S2 y; S3(); }; S3::S3() : x(1), y() {} Here, the problematic store (which this patch omits) is in the constructor of S2. In the case of S3, S2 has no valid storage and aliases x - thus the constructor of S2 should omit the dummy store. Differential Revision: https://reviews.llvm.org/D157332
Since c65b4d6, https://reviews.llvm.org/D135462, some C++ code is miscompiled for the mingw/x86_64 target.
This breaks the libcxx testcase
libcxx/test/std/ranges/range.factories/range.repeat.view/ctor.piecewise.pass.cpp
.Noticing the issue is somewhat tricky since this testcase is rather new; the libcxx CI uses a prebuilt release with Clang 16 (where the testcase passes correctly). At the point of the regression, the testcase didn't exist yet (so a configuration that tests libcxx with latest git Clang didn't catch it immediately until long after the miscompilation was introduced); the test (and corresponding libc++ implementation) was added only later.
The issue can be reproduced with a reduced testcase like this:
misopt.zip
(I'll rerun the reduction to try to retain the libc++ header names to keep it slightly more readable.)
If compiled with
clang -target x86_64-w64-mingw32 -std=c++2b misopt.cpp -o misopt.exe
, the resulting binary outputsrv[0].x = 0 - incorrect
. If optimization is added with e.g.-O2
, it correctly outputsrv[0].x = 111 - ok
instead. If compiled with Clang 16, it runs correctly in both cases.CC @asavonic
The text was updated successfully, but these errors were encountered: