-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[Clang] Make the result type of sizeof/pointer subtraction/size_t lit… #136542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…erals be typedefs instead of built-in types Includeing the results of `sizeof`, `sizeof...`, `__datasizeof`, `__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and signed `size_t` literals, as well as the results of pointer-pointer subtraction. It does not affect any program output except for debugging information. The goal is to enable clang and downstream tools such as clangd and clang-tidy to provide more portable hints and diagnostics.
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: YexuanXiao (YexuanXiao) Changes…erals be typedefs instead of built-in types Includeing the results of Full diff: https://github.com/llvm/llvm-project/pull/136542.diff 5 Files Affected:
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 3c78833a3f069..0c133d45d3f5e 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -2442,6 +2442,10 @@ class ASTContext : public RefCountedBase<ASTContext> {
QualType GetBuiltinType(unsigned ID, GetBuiltinTypeError &Error,
unsigned *IntegerConstantArgs = nullptr) const;
+ QualType getCGlobalCXXStdNSTypedef(const NamespaceDecl *StdNS,
+ StringRef DefName,
+ QualType FallBack = {}) const;
+
/// Types and expressions required to build C++2a three-way comparisons
/// using operator<=>, including the values return by builtin <=> operators.
ComparisonCategories CompCategories;
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 2836d68b05ff6..aa8ce0078d4d3 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -12556,6 +12556,35 @@ QualType ASTContext::GetBuiltinType(unsigned Id,
return getFunctionType(ResType, ArgTypes, EPI);
}
+QualType ASTContext::getCGlobalCXXStdNSTypedef(const NamespaceDecl *StdNS,
+ StringRef DefName,
+ QualType FallBack) const {
+ DeclContextLookupResult Lookup;
+ if (getLangOpts().C99) {
+ Lookup = getTranslationUnitDecl()->lookup(&Idents.get(DefName));
+ } else if (getLangOpts().CPlusPlus) {
+ if (StdNS == nullptr) {
+ auto LookupStdNS = getTranslationUnitDecl()->lookup(&Idents.get("std"));
+ if (!LookupStdNS.empty()) {
+ StdNS = dyn_cast<NamespaceDecl>(LookupStdNS.front());
+ }
+ }
+ if (StdNS) {
+ Lookup = StdNS->lookup(&Idents.get(DefName));
+ } else {
+ Lookup = getTranslationUnitDecl()->lookup(&Idents.get(DefName));
+ }
+ }
+ if (!Lookup.empty()) {
+ if (auto *TD = dyn_cast<TypedefNameDecl>(Lookup.front())) {
+ if (auto Result = getTypeDeclType(TD); !Result.isNull()) {
+ return Result;
+ }
+ }
+ }
+ return FallBack;
+}
+
static GVALinkage basicGVALinkageForFunction(const ASTContext &Context,
const FunctionDecl *FD) {
if (!FD->isExternallyVisible())
diff --git a/clang/lib/AST/ComparisonCategories.cpp b/clang/lib/AST/ComparisonCategories.cpp
index 28244104d6636..46dcd6ac4261d 100644
--- a/clang/lib/AST/ComparisonCategories.cpp
+++ b/clang/lib/AST/ComparisonCategories.cpp
@@ -87,37 +87,17 @@ ComparisonCategoryInfo::ValueInfo *ComparisonCategoryInfo::lookupValueInfo(
return &Objects.back();
}
-static const NamespaceDecl *lookupStdNamespace(const ASTContext &Ctx,
- NamespaceDecl *&StdNS) {
- if (!StdNS) {
- DeclContextLookupResult Lookup =
- Ctx.getTranslationUnitDecl()->lookup(&Ctx.Idents.get("std"));
- if (!Lookup.empty())
- StdNS = dyn_cast<NamespaceDecl>(Lookup.front());
- }
- return StdNS;
-}
-
-static const CXXRecordDecl *lookupCXXRecordDecl(const ASTContext &Ctx,
- const NamespaceDecl *StdNS,
- ComparisonCategoryType Kind) {
- StringRef Name = ComparisonCategories::getCategoryString(Kind);
- DeclContextLookupResult Lookup = StdNS->lookup(&Ctx.Idents.get(Name));
- if (!Lookup.empty())
- if (const CXXRecordDecl *RD = dyn_cast<CXXRecordDecl>(Lookup.front()))
- return RD;
- return nullptr;
-}
-
const ComparisonCategoryInfo *
ComparisonCategories::lookupInfo(ComparisonCategoryType Kind) const {
auto It = Data.find(static_cast<char>(Kind));
if (It != Data.end())
return &It->second;
-
- if (const NamespaceDecl *NS = lookupStdNamespace(Ctx, StdNS))
- if (const CXXRecordDecl *RD = lookupCXXRecordDecl(Ctx, NS, Kind))
+ if (auto QT = Ctx.getCGlobalCXXStdNSTypedef(
+ nullptr, ComparisonCategories::getCategoryString(Kind));
+ !QT.isNull()) {
+ if (const auto *RD = QT->getAsCXXRecordDecl())
return &Data.try_emplace((char)Kind, Ctx, RD, Kind).first->second;
+ }
return nullptr;
}
diff --git a/clang/lib/AST/ExprCXX.cpp b/clang/lib/AST/ExprCXX.cpp
index 169f11b611066..306ddcb9f491a 100644
--- a/clang/lib/AST/ExprCXX.cpp
+++ b/clang/lib/AST/ExprCXX.cpp
@@ -1700,8 +1700,10 @@ SizeOfPackExpr *SizeOfPackExpr::Create(ASTContext &Context,
ArrayRef<TemplateArgument> PartialArgs) {
void *Storage =
Context.Allocate(totalSizeToAlloc<TemplateArgument>(PartialArgs.size()));
- return new (Storage) SizeOfPackExpr(Context.getSizeType(), OperatorLoc, Pack,
- PackLoc, RParenLoc, Length, PartialArgs);
+ return new (Storage) SizeOfPackExpr(
+ Context.getCGlobalCXXStdNSTypedef(nullptr, "size_t",
+ Context.getSizeType()),
+ OperatorLoc, Pack, PackLoc, RParenLoc, Length, PartialArgs);
}
SizeOfPackExpr *SizeOfPackExpr::CreateDeserialized(ASTContext &Context,
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index 01a021443c94f..d07be9f117957 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -4026,10 +4026,20 @@ ExprResult Sema::ActOnNumericConstant(const Token &Tok, Scope *UDLScope) {
// Does it fit in size_t?
if (ResultVal.isIntN(SizeTSize)) {
// Does it fit in ssize_t?
- if (!Literal.isUnsigned && ResultVal[SizeTSize - 1] == 0)
- Ty = Context.getSignedSizeType();
- else if (AllowUnsigned)
- Ty = Context.getSizeType();
+ if (!Literal.isUnsigned && ResultVal[SizeTSize - 1] == 0) {
+ auto SignedSize = Context.getSignedSizeType();
+ if (auto PtrDiff = Context.getCGlobalCXXStdNSTypedef(
+ getStdNamespace(), "ptrdiff_t");
+ Context.hasSameType(PtrDiff, SignedSize))
+ Ty = PtrDiff;
+ else if (auto SSize = Context.getCGlobalCXXStdNSTypedef(
+ getStdNamespace(), "ssize_t");
+ Context.hasSameType(SSize, SignedSize))
+ Ty = SSize;
+ } else if (AllowUnsigned) {
+ Ty = Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "size_t",
+ Context.getSizeType());
+ }
Width = SizeTSize;
}
}
@@ -4702,7 +4712,10 @@ ExprResult Sema::CreateUnaryExprOrTypeTraitExpr(TypeSourceInfo *TInfo,
// C99 6.5.3.4p4: the type (an unsigned integer type) is size_t.
return new (Context) UnaryExprOrTypeTraitExpr(
- ExprKind, TInfo, Context.getSizeType(), OpLoc, R.getEnd());
+ ExprKind, TInfo,
+ Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "size_t",
+ Context.getSizeType()),
+ OpLoc, R.getEnd());
}
ExprResult
@@ -4745,7 +4758,10 @@ Sema::CreateUnaryExprOrTypeTraitExpr(Expr *E, SourceLocation OpLoc,
// C99 6.5.3.4p4: the type (an unsigned integer type) is size_t.
return new (Context) UnaryExprOrTypeTraitExpr(
- ExprKind, E, Context.getSizeType(), OpLoc, E->getSourceRange().getEnd());
+ ExprKind, E,
+ Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "size_t",
+ Context.getSizeType()),
+ OpLoc, E->getSourceRange().getEnd());
}
ExprResult
@@ -11353,8 +11369,10 @@ QualType Sema::CheckSubtractionOperands(ExprResult &LHS, ExprResult &RHS,
}
}
- if (CompLHSTy) *CompLHSTy = LHS.get()->getType();
- return Context.getPointerDiffType();
+ if (CompLHSTy)
+ *CompLHSTy = LHS.get()->getType();
+ return Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "ptrdiff_t",
+ Context.getPointerDiffType());
}
}
|
clang/lib/Sema/SemaExpr.cpp
Outdated
Context.hasSameType(SSize, SignedSize)) | ||
Ty = SSize; | ||
} else if (AllowUnsigned) { | ||
Ty = Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "size_t", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test file to verify that size_t
is shown instead of unsigned long
/unsigned long long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking into it.
…hat the type is not empty
fallback and try to fix tests
clang/lib/Sema/SemaExpr.cpp
Outdated
Context.getCGlobalCXXStdNSTypedef(getStdNamespace(), "size_t", | ||
Context.getSizeType()), | ||
OpLoc, E->getSourceRange().getEnd()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioning getStdNamespace
for C is really not great. Couldn't we merge getCGlobalCXXStdNSTypedef into Context.getSizeType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it appropriate to change it to getLangOpts().CPlusPlus ? getStdNamespace() : nullptr
? I’m concerned that moving it into getSizeType
might introduce unintended effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this!
I'd like to better understand the need for the changes because I have a few concerns. One concern is about compile time performance. But also, this means downstream consumers of the AST are going to have to react because they used to be able to look for a size_t
node directly and now they have to resolve a qualified type instead. This may be acceptable, but it seems disruptive too.
Also, there should be more test coverage for the changes showing that we actually do get the types correct in all the various circumstances.
The current inlay hint of clangd is |
Yes, but this doesn't exactly accomplish that. In C, you'll still get the underlying integer type unless there happens to be a typedef we can find, right? So you can spot a difference between:
right? What it seems like you're really after is making CC @zygoloid @rjmccall @erichkeane for additional opinions on this. |
I think you're on to something here actually. We should do something like we do with the It does NOT make sense to allow the fallback to have a different textual type based on the existence of the typedef. |
TO ADD: see uses of |
Actually, it might be even worse; I think it's valid for a user to define a typedef for
where it's unclear whether that static assertion should pass or fail. |
I don't have a problem to this, but this approach is more complex and requires more professional work. |
Okay, the typedef is valid (not using a reserved identifier), and the assertion should fail because
I don't think we can do that, at least for C, because that means this code would then be accepted when it shouldn't be, right?
|
Yeah, I am not sure we can do this now then. It seems odd to allow calls to builtins to change their return 'type' (or at least, return sugar) based on when they are included. Even if we did, we would have to make sure it actually WAS SIZE_TYPE and not a different typedef. So I guess we're back to: are we ok with this?
|
The current implementation retains the possibility for users to write |
Is there a reason to do so? Intermediate sugar types don't generally have much of an effect on diagnostics, which is the motivation for this work. |
When |
That is not possible, and this is exactly what I'm worried about with all this discussion about making |
Er, I feel like I must be missing something, because I think we can do this. |
The specification of Moreover, we're not just implementing the C standard, we're also implementing the platform ABI, which specifies exactly which type I really don't think this is a fruitful line of investigation. I'd also note that, while in C there are some allowances for compatible types, in C++ there are not, and given that the problem isn't C-specific, it seems like the solution shouldn't be either. Type sugar is the proper way to model that we want to print a type differently in some circumstances. Let's not invent something else.
There's some motivation, at least in principle. We do some work to preserve common type sugar when merging types, and this might allow us to preserve the type sugar in a case like: const auto a = sizeof(0);
#include <stddef.h>
size_t b = 0;
// Type has common sugar of size_t and typeof(sizeof(0)).
cond ? a : b But I don't think this needs to be a priority. (Also, we could make the type sugar merging logic explicitly handle this case if needed, rather than changing the types that the library typedefs are aliases for.) |
At least, it must ensure that |
Something being unspecified in the standard doesn't mean we're unconstrained. In this case, it is unspecified because it is allowed to vary between platforms, but platforms are required to specify what type Moreover, C++ encodes the underlying type of typedefs into the ABI. Now, C might be different because of how loose the compatible-types rule is. If you want to pursue this just for C, we can talk about it; I don't know that it's a good idea, but we can at least talk about it. |
Is that possible? @rjmccall |
If the type is an extension, we can define its compatibility rules, can't we? We certainly treat I was thinking we'd be doing the same here.
Type sugar may still be a better approach than aliasing the type to the underlying type like I was envisioning. |
With template specialization resugaring, these being typedefs still help somewhat: You have to make a bit of contortion to expose the intermediate type, but I think that's partly due to a different problem, where in diagnostics we don't try to skip over some unhelpful top level typedefs, like vector's |
If The result is that, no, this is not possible. Two types that mangle the same way really need to be the same type. This is the relationship between a type and a sugaring of the type, not the relationship between two different types that are compatible (share a common representation). |
Okay, I see where the disconnect is now. I was using standards terms like "compatible" when what I was really meaning was "alias". e.g., I wasn't suggesting we introduce a distinct, new type. I was suggesting we take the existing types and give them a spelling of But I'm more and more thinking Richard is correct, this is just a fancy form of sugar. |
If the reason to even consider doing this as a new builtin type is due to templates, please don't do that. The better alternative here is to wait for template specialization resugaring to land. |
The main purpose is for these expressions, with templates being just one possible beneficiary. Templates should print user-defined typedefs instead of the underlying types, but this is not within the scope of this PR. |
Sure, but outside of templates, all forms of type sugar are preserved just fine in the current implementation, so there would be no particular reason to pursue a different builtin type. |
Well, the type (Note for example that this would presumably result in an ABI break for clang's
It looks to me like we treat the keyword |
To be precise, the problem here is not really creating a new builtin type, as long as that new builtin type is still canonically the same as the existing builtin type. This could certainly be a strategy to give a new spelling for existing cases like |
We VERY much shouldn't be creating a new type. If we make a I could see us just auto-introducing generating a |
It doesn't need to be a type alias, it just needs to be type sugar to an existing type. A type alias is just one of the forms to achieve that. |
The advantage of a "builtin-type as sugar" approach is that you get the same effect without having to synthesize a TypedefDecl (and the associated polution that currently causes in AST dumps). This might expose existing 'bugs' where a builtin-type node is assumed to be a canonical type, so it could take more effort to upstream. |
If you make it a canonical node, you also automatically don't get an Ie you avoid the aka in |
I don't think you'd WANT to avoid that, I would want a type-alias because what we want is actually something that works exactly LIKE a type alias in every way, except doesn't require declaration. Optimally we'd just generate a YES, of course we could come up with a new type-like-thing that aliases a specific type (at which point, I'd say it isn't really a type, just a type alias with a fancy representation), but not have it be a type alias, but really all that does is save us a node in the AST. IF we were worried about THAT, we could have it lazily constructed, though that buys us little. |
You can still do that with a "builtin-type as sugar" approach, you just have to make it a non-canonical node (ie It's one of the options, and whether we print an aka or not is still a diagnostic policy (we should do it only if it's relevant, but our implementation is currently very simple). |
Yeah, sorry, I feel I accidentally derailed the conversation by talking about new types. :-) Are we converging on the idea of using type sugar? |
The anticipated implementation from the discussion is beyond my capabilities. |
Hey, chiming in here since I've been silently following this thread and I think you've been doing great. The conversation might seem intimidating but once we converge on an approach, the code change should be straightforward. Don't question your own capabilities. |
I fully understand and agree with the previous discussions. I wasn’t intimidated by them—the issue is simply my lack of familiarity with Clang. I have some other lighter ideas that now take higher priority, and this idea won't be lost by closing the PR. I’ll revisit it when the opportunity arises. Thank you for your concern. |
…erals be typedefs instead of built-in types
Includeing the results of
sizeof
,sizeof...
,__datasizeof
,__alignof
,_Alignof
,alignof
,_Countof
,size_t
literals, and signedsize_t
literals, as well as the results of pointer-pointer subtraction. It does not affect any program output except for debugging information. The goal is to enable clang and downstream tools such as clangd and clang-tidy to provide more portable hints and diagnostics.