-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[TBAA] Handle bitfields when generating !tbaa.struct metadata. #82922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
At the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata. This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719. This patch fixes that by skipping all bitfields, not only unnamed ones (note that CollectFields has a TODO to support bitfields). As bitfields specify their widths in bits, while !tbaa metadata uses bytes for sizes and offsets, I don't think we would be able to generate correct metadata for them in general. If this understanding is correct, I can also extend the verifier to check that !tbaa.struct fields aren't overlapping. Fixes llvm#82586
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: Florian Hahn (fhahn) ChangesAt the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata. This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719. This patch fixes that by skipping all bitfields, not only unnamed ones (note that CollectFields has a TODO to support bitfields). As bitfields specify their widths in bits, while !tbaa metadata uses bytes for sizes and offsets, I don't think we would be able to generate correct metadata for them in general. If this understanding is correct, I can also extend the verifier to check that !tbaa.struct fields aren't overlapping. Fixes #82586 Full diff: https://github.com/llvm/llvm-project/pull/82922.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index dc288bc3f6157a..43a1aee3d73823 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -298,7 +298,7 @@ CodeGenTBAA::CollectFields(uint64_t BaseOffset,
unsigned idx = 0;
for (RecordDecl::field_iterator i = RD->field_begin(),
e = RD->field_end(); i != e; ++i, ++idx) {
- if ((*i)->isZeroSize(Context) || (*i)->isUnnamedBitfield())
+ if ((*i)->isZeroSize(Context) || (*i)->isBitField())
continue;
uint64_t Offset = BaseOffset +
Layout.getFieldOffset(idx) / Context.getCharWidth();
diff --git a/clang/test/CodeGen/tbaa-struct.cpp b/clang/test/CodeGen/tbaa-struct.cpp
index ff5521fcf3f604..17c9d6bf6a7260 100644
--- a/clang/test/CodeGen/tbaa-struct.cpp
+++ b/clang/test/CodeGen/tbaa-struct.cpp
@@ -130,7 +130,7 @@ void copy8(NamedBitfields *a1, NamedBitfields *a2) {
// CHECK-OLD: [[TS3]] = !{i64 0, i64 8, !{{.*}}, i64 0, i64 2, !{{.*}}, i64 4, i64 8, !{{.*}}}
// CHECK-OLD: [[TS4]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 1, i64 1, [[TAG_CHAR]], i64 2, i64 1, [[TAG_CHAR]]}
// CHECK-OLD: [[TS5]] = !{i64 0, i64 1, [[TAG_CHAR]], i64 4, i64 1, [[TAG_CHAR]], i64 5, i64 1, [[TAG_CHAR]]}
-// CHECK-OLD: [[TS6]] = !{i64 0, i64 4, [[TAG_INT]], i64 1, i64 4, [[TAG_INT]], i64 2, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE:!.+]]}
+// CHECK-OLD: [[TS6]] = !{i64 2, i64 1, [[TAG_CHAR]], i64 8, i64 8, [[TAG_DOUBLE:!.+]]}
// CHECK-OLD: [[TAG_DOUBLE]] = !{[[DOUBLE:!.+]], [[DOUBLE]], i64 0}
// CHECK-OLD [[DOUBLE]] = !{!"double", [[CHAR]], i64 0}
|
This seems like it messes up the metadata in a different way: we have no representation of the bitfield at all, so it looks like it's padding. I think we want to treat multiple adjacent bitfields as a single "field". So NamedBitfields from the testcase would have three fields in the LLVM TBAA data: one "field" containing both bitfields, followed by a field for the char, followed by a field for the double. You should be able to use CGBitFieldInfo for this, I think? |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Thanks, updated the patch as suggested. |
clang/lib/CodeGen/CodeGenTBAA.cpp
Outdated
if ((*i)->isBitField() && !(*i)->isUnnamedBitfield()) { | ||
unsigned CurrentBitFieldSize = 0; | ||
unsigned CurrentBitFieldOffset = CGRL.getBitFieldInfo(*i).Offset; | ||
while (i != e && (*i)->isBitField() && !(*i)->isUnnamedBitfield()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this handles non-zero-length unnamed bitfields correctly; they count as part of a series of bitfields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, test showing that added in 54cff50, thanks!
clang/lib/CodeGen/CodeGenTBAA.cpp
Outdated
const CGBitFieldInfo &Info = CGRL.getBitFieldInfo(*i); | ||
if (CurrentBitFieldSize + CurrentBitFieldOffset != Info.Offset) | ||
break; | ||
CurrentBitFieldSize += Info.Size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pull the initial offset and total length of the whole run bitfields out of the CGBitFieldInfo, instead of trying to add up the sizes of the individual fields? I think it should be StorageOffset/StorageSize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the patch to add a field for the first bitfield in the run using its StorageSize, and skip bitfields where the offset in the run isn't 0 (Offset
).
Extra tests with unnamed bitfields for #82922.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace is weird in a few places; otherwise looks fine.
Argh yes, should be fixed in the latest version. Still need to figure out how to make sure all changes are formatted once the branch contains merges (before the format checker complains and shows the commits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks! |
Additional test for llvm#82922. (cherry-picked from f290c00)
Extra tests with unnamed bitfields for llvm#82922. (cherry-picked from 54cff50)
…82922) At the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata. This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719. This patch fixes that by skipping by combining adjacent bitfields in fields with correct sizes. Fixes llvm#82586 (cherry-picked from d2a9df2)
At the moment, clang generates what I believe are incorrect !tbaa.struct fields for named bitfields. At the moment, the base type size is used for named bifields (e.g. sizeof(int)) instead of the bifield width per field. This results in overalpping fields in !tbaa.struct metadata.
This causes incorrect results when extracting individual copied fields from !tbaa.struct as in added in dc85719.
This patch fixes that by skipping by combining adjacent named bitfields in fields with correct size.
If this understanding is correct, I can also extend the verifier to check that !tbaa.struct fields aren't overlapping.
Fixes #82586