Skip to content

[BoundsSafety] WIP: Make 'counted_by' work for pointer fields; late parsing for 'counted_by' on decl attr position #87596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 18 additions & 1 deletion clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,8 @@ New Compiler Flags

- ``-fexperimental-late-parse-attributes`` enables an experimental feature to
allow late parsing certain attributes in specific contexts where they would
not normally be late parsed.
not normally be late parsed. Currently this allows late parsing the
`counted_by` attribute in C. See `Attribute Changes in Clang`_.

Deprecated Compiler Flags
-------------------------
Expand Down Expand Up @@ -335,6 +336,22 @@ Attribute Changes in Clang
- Clang now warns that the ``exclude_from_explicit_instantiation`` attribute
is ignored when applied to a local class or a member thereof.

- The ``counted_by`` attribute can now be late parsed in C when
``-fexperimental-late-parse-attributes`` is passed but only when attribute is
used in the declaration attribute position. This allows using the
attribute on existing code where it previously impossible to do so without
re-ordering struct field declarations would break ABI as shown below.

.. code-block:: c

struct BufferTy {
/* Refering to `count` requires late parsing */
char* buffer __counted_by(count);
/* Swapping `buffer` and `count` to avoid late parsing would break ABI */
size_t count;
};


Improvements to Clang's diagnostics
-----------------------------------
- Clang now applies syntax highlighting to the code snippets it
Expand Down
1 change: 1 addition & 0 deletions clang/include/clang/AST/Type.h
Original file line number Diff line number Diff line change
Expand Up @@ -2515,6 +2515,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
bool isRecordType() const;
bool isClassType() const;
bool isStructureType() const;
bool isStructureTypeWithFlexibleArrayMember() const;
bool isObjCBoxableRecordType() const;
bool isInterfaceType() const;
bool isStructureOrClassType() const;
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -2229,7 +2229,8 @@ def TypeNullUnspecified : TypeAttr {
def CountedBy : DeclOrTypeAttr {
let Spellings = [Clang<"counted_by">];
let Subjects = SubjectList<[Field], ErrorDiag>;
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel">];
let Args = [ExprArgument<"Count">, IntArgument<"NestedLevel", 1>];
let LateParsed = LateAttrParseExperimentalExt;
let ParseArgumentsAsUnevaluated = 1;
let Documentation = [CountedByDocs];
let LangOpts = [COnly];
Expand Down
15 changes: 13 additions & 2 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -6518,8 +6518,10 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<

def err_flexible_array_count_not_in_same_struct : Error<
"'counted_by' field %0 isn't within the same struct as the flexible array">;
def err_counted_by_attr_not_on_flexible_array_member : Error<
"'counted_by' only applies to C99 flexible array members">;
def err_counted_by_attr_not_on_ptr_or_flexible_array_member : Error<
"'counted_by' only applies to pointers or C99 flexible array members">;
def err_counted_by_attr_on_array_not_flexible_array_member : Error<
"'counted_by' on arrays only applies to C99 flexible array members">;
def err_counted_by_attr_refer_to_itself : Error<
"'counted_by' cannot refer to the flexible array member %0">;
def err_counted_by_must_be_in_structure : Error<
Expand All @@ -6534,6 +6536,15 @@ def err_counted_by_attr_refer_to_union : Error<
"'counted_by' argument cannot refer to a union member">;
def note_flexible_array_counted_by_attr_field : Note<
"field %0 declared here">;
def err_counted_by_attr_pointee_unknown_size : Error<
"'counted_by' cannot be applied a pointer with pointee with unknown size "
"because %0 is %select{"
"an incomplete type|" // CountedByInvalidPointeeTypeKind::INCOMPLETE
"a sizeless type|" // CountedByInvalidPointeeTypeKind::SIZELESS
"a function type|" // CountedByInvalidPointeeTypeKind::FUNCTION
// CountedByInvalidPointeeTypeKind::FLEXIBLE_ARRAY_MEMBER
"a struct type with a flexible array member"
"}1">;

let CategoryName = "ARC Semantic Issue" in {

Expand Down
7 changes: 6 additions & 1 deletion clang/include/clang/Parse/Parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -1643,6 +1643,8 @@ class Parser : public CodeCompletionHandler {
bool EnterScope, bool OnDefinition);
void ParseLexedAttribute(LateParsedAttribute &LA,
bool EnterScope, bool OnDefinition);
void ParseLexedCAttribute(LateParsedAttribute &LA,
ParsedAttributes *OutAttrs = nullptr);
void ParseLexedMethodDeclarations(ParsingClass &Class);
void ParseLexedMethodDeclaration(LateParsedMethodDeclaration &LM);
void ParseLexedMethodDefs(ParsingClass &Class);
Expand Down Expand Up @@ -2530,7 +2532,8 @@ class Parser : public CodeCompletionHandler {

void ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback);
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs = nullptr);

DeclGroupPtrTy ParseTopLevelStmtDecl();

Expand Down Expand Up @@ -3107,6 +3110,8 @@ class Parser : public CodeCompletionHandler {
SourceLocation ScopeLoc,
ParsedAttr::Form Form);

void DistributeCLateParsedAttrs(Decl *Dcl, LateParsedAttrList *LateAttrs);

void ParseBoundsAttribute(IdentifierInfo &AttrName,
SourceLocation AttrNameLoc, ParsedAttributes &Attrs,
IdentifierInfo *ScopeName, SourceLocation ScopeLoc,
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Sema/Sema.h
Original file line number Diff line number Diff line change
Expand Up @@ -11509,7 +11509,8 @@ class Sema final : public SemaBase {
QualType BuildMatrixType(QualType T, Expr *NumRows, Expr *NumColumns,
SourceLocation AttrLoc);

QualType BuildCountAttributedArrayType(QualType WrappedTy, Expr *CountExpr);
QualType BuildCountAttributedArrayOrPointerType(QualType WrappedTy,
Expr *CountExpr);

QualType BuildAddressSpaceAttr(QualType &T, LangAS ASIdx, Expr *AddrSpace,
SourceLocation AttrLoc);
Expand Down
12 changes: 12 additions & 0 deletions clang/lib/AST/Type.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -631,6 +631,18 @@ bool Type::isStructureType() const {
return false;
}

bool Type::isStructureTypeWithFlexibleArrayMember() const {
const auto *RT = getAs<RecordType>();
if (!RT)
return false;
const auto *Decl = RT->getDecl();
if (!Decl->isStruct())
return false;
if (!Decl->hasFlexibleArrayMember())
return false;
return true;
}

bool Type::isObjCBoxableRecordType() const {
if (const auto *RT = getAs<RecordType>())
return RT->getDecl()->hasAttr<ObjCBoxableAttr>();
Expand Down
104 changes: 97 additions & 7 deletions clang/lib/Parse/ParseDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3282,6 +3282,19 @@ void Parser::ParseAlignmentSpecifier(ParsedAttributes &Attrs,
}
}

void Parser::DistributeCLateParsedAttrs(Decl *Dcl,
LateParsedAttrList *LateAttrs) {
assert(Dcl);

if (!LateAttrs)
return;

for (auto *LateAttr : *LateAttrs) {
if (LateAttr->Decls.empty())
LateAttr->addDecl(Dcl);
}
}

/// Bounds attributes (e.g., counted_by):
/// AttrName '(' expression ')'
void Parser::ParseBoundsAttribute(IdentifierInfo &AttrName,
Expand Down Expand Up @@ -4815,13 +4828,14 @@ static void DiagnoseCountAttributedTypeInUnnamedAnon(ParsingDeclSpec &DS,
///
void Parser::ParseStructDeclaration(
ParsingDeclSpec &DS,
llvm::function_ref<void(ParsingFieldDeclarator &)> FieldsCallback) {
llvm::function_ref<Decl *(ParsingFieldDeclarator &)> FieldsCallback,
LateParsedAttrList *LateFieldAttrs) {

if (Tok.is(tok::kw___extension__)) {
// __extension__ silences extension warnings in the subexpression.
ExtensionRAIIObject O(Diags); // Use RAII to do this.
ConsumeToken();
return ParseStructDeclaration(DS, FieldsCallback);
return ParseStructDeclaration(DS, FieldsCallback, LateFieldAttrs);
}

// Parse leading attributes.
Expand Down Expand Up @@ -4886,10 +4900,12 @@ void Parser::ParseStructDeclaration(
}

// If attributes exist after the declarator, parse them.
MaybeParseGNUAttributes(DeclaratorInfo.D);
MaybeParseGNUAttributes(DeclaratorInfo.D, LateFieldAttrs);

// We're done with this declarator; invoke the callback.
FieldsCallback(DeclaratorInfo);
Decl *Field = FieldsCallback(DeclaratorInfo);
if (Field)
DistributeCLateParsedAttrs(Field, LateFieldAttrs);

// If we don't have a comma, it is either the end of the list (a ';')
// or an error, bail out.
Expand All @@ -4900,6 +4916,69 @@ void Parser::ParseStructDeclaration(
}
}

/// Finish parsing an attribute for which parsing was delayed.
/// This will be called at the end of parsing a class declaration
/// for each LateParsedAttribute. We consume the saved tokens and
/// create an attribute with the arguments filled in. We add this
/// to the Attribute list for the decl.
void Parser::ParseLexedCAttribute(LateParsedAttribute &LA,
ParsedAttributes *OutAttrs) {
// Create a fake EOF so that attribute parsing won't go off the end of the
// attribute.
Token AttrEnd;
AttrEnd.startToken();
AttrEnd.setKind(tok::eof);
AttrEnd.setLocation(Tok.getLocation());
AttrEnd.setEofData(LA.Toks.data());
LA.Toks.push_back(AttrEnd);

// Append the current token at the end of the new token stream so that it
// doesn't get lost.
LA.Toks.push_back(Tok);
PP.EnterTokenStream(LA.Toks, /*DisableMacroExpansion=*/true,
/*IsReinject=*/true);
// Drop the current token and bring the first cached one. It's the same token
// as when we entered this function.
ConsumeAnyToken(/*ConsumeCodeCompletionTok=*/true);

ParsedAttributes Attrs(AttrFactory);

assert(LA.Decls.size() <= 1 &&
"late field attribute expects to have at most one declaration.");

// Dispatch based on the attribute and parse it
const AttributeCommonInfo::Form ParsedForm = ParsedAttr::Form::GNU();
IdentifierInfo *ScopeName = nullptr;
const ParsedAttr::Kind AttrKind =
ParsedAttr::getParsedKind(&LA.AttrName, /*ScopeName=*/ScopeName,
/*SyntaxUsed=*/ParsedForm.getSyntax());
switch (AttrKind) {
case ParsedAttr::Kind::AT_CountedBy:
ParseBoundsAttribute(LA.AttrName, LA.AttrNameLoc, Attrs,
/*ScopeName=*/ScopeName, SourceLocation(),
/*Form=*/ParsedForm);
break;
default:
llvm_unreachable("Unhandled late parsed attribute");
}

for (auto *D : LA.Decls)
Actions.ActOnFinishDelayedAttribute(getCurScope(), D, Attrs);

// Due to a parsing error, we either went over the cached tokens or
// there are still cached tokens left, so we skip the leftover tokens.
while (Tok.isNot(tok::eof))
ConsumeAnyToken();

// Consume the fake EOF token if it's there
if (Tok.is(tok::eof) && Tok.getEofData() == AttrEnd.getEofData())
ConsumeAnyToken();

if (OutAttrs) {
OutAttrs->takeAllFrom(Attrs);
}
}

/// ParseStructUnionBody
/// struct-contents:
/// struct-declaration-list
Expand All @@ -4923,6 +5002,11 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
ParseScope StructScope(this, Scope::ClassScope|Scope::DeclScope);
Actions.ActOnTagStartDefinition(getCurScope(), TagDecl);

// `LateAttrParseExperimentalExtOnly=true` requests that only attributes
// marked with `LateAttrParseExperimentalExt` are late parsed.
LateParsedAttrList LateFieldAttrs(/*PSoon=*/false,
/*LateAttrParseExperimentalExtOnly=*/true);

// While we still have something to read, read the declarations in the struct.
while (!tryParseMisplacedModuleImport() && Tok.isNot(tok::r_brace) &&
Tok.isNot(tok::eof)) {
Expand Down Expand Up @@ -4973,18 +5057,19 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,
}

if (!Tok.is(tok::at)) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) {
auto CFieldCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
// Install the declarator into the current TagDecl.
Decl *Field =
Actions.ActOnField(getCurScope(), TagDecl,
FD.D.getDeclSpec().getSourceRange().getBegin(),
FD.D, FD.BitfieldSize);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
ParsingDeclSpec DS(*this);
ParseStructDeclaration(DS, CFieldCallback);
ParseStructDeclaration(DS, CFieldCallback, &LateFieldAttrs);
} else { // Handle @defs
ConsumeToken();
if (!Tok.isObjCAtKeyword(tok::objc_defs)) {
Expand Down Expand Up @@ -5025,7 +5110,12 @@ void Parser::ParseStructUnionBody(SourceLocation RecordLoc,

ParsedAttributes attrs(AttrFactory);
// If attributes exist after struct contents, parse them.
MaybeParseGNUAttributes(attrs);
MaybeParseGNUAttributes(attrs, &LateFieldAttrs);

// Late parse field attributes if necessary.
assert(!getLangOpts().CPlusPlus);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the assert? Are we restricting late parsing to C only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@delcypher C++ already has late parsing support for C++ classes and structs and I believe it's handled separately in ParseDeclCXX.cpp. I added the assertion to ensure that the code path isn't taken in C++. We should run clang tests if that breaks anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to combine C++'s late parsing with your implementation? Or is there little overlap?

for (auto *LateAttr : LateFieldAttrs)
ParseLexedCAttribute(*LateAttr);

SmallVector<Decl *, 32> FieldDecls(TagDecl->fields());

Expand Down
10 changes: 6 additions & 4 deletions clang/lib/Parse/ParseObjc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -778,16 +778,16 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
}

bool addedToDeclSpec = false;
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCPropertyCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
if (FD.D.getIdentifier() == nullptr) {
Diag(AtLoc, diag::err_objc_property_requires_field_name)
<< FD.D.getSourceRange();
return;
return nullptr;
}
if (FD.BitfieldSize) {
Diag(AtLoc, diag::err_objc_property_bitfield)
<< FD.D.getSourceRange();
return;
return nullptr;
}

// Map a nullability property attribute to a context-sensitive keyword
Expand Down Expand Up @@ -816,6 +816,7 @@ void Parser::ParseObjCInterfaceDeclList(tok::ObjCKeywordKind contextKey,
MethodImplKind);

FD.complete(Property);
return Property;
};

// Parse all the comma separated declarators.
Expand Down Expand Up @@ -2024,7 +2025,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
continue;
}

auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) {
auto ObjCIvarCallback = [&](ParsingFieldDeclarator &FD) -> Decl * {
assert(getObjCDeclContext() == interfaceDecl &&
"Ivar should have interfaceDecl as its decl context");
// Install the declarator into the interface decl.
Expand All @@ -2035,6 +2036,7 @@ void Parser::ParseObjCClassInstanceVariables(ObjCContainerDecl *interfaceDecl,
if (Field)
AllIvarDecls.push_back(Field);
FD.complete(Field);
return Field;
};

// Parse all the comma separated declarators.
Expand Down
Loading
Loading