Skip to content

[clang] Migration to PointerIntPair #69835

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
1 of 16 tasks
Endilll opened this issue Oct 21, 2023 · 6 comments
Open
1 of 16 tasks

[clang] Migration to PointerIntPair #69835

Endilll opened this issue Oct 21, 2023 · 6 comments
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" code-quality

Comments

@Endilll
Copy link
Contributor

Endilll commented Oct 21, 2023

Here is an incomplete list of ad-hoc tagged pointers that should be refactored to use llvm::PointerIntPair:

  • clang::ActionResult::Value
  • clang::ASTWriter::DeclOrType::Stored
  • clang::CGBlockInfo::Capture::Data
  • clang::DeclAccessPair::Ptr
  • clang::DeclarationName::Ptr
  • clang::DesignatedInitExpr::Designator::NameOrField
  • clang::IdentifierResolver::Iterator::Ptr
  • clang::LazyOffsetPtr::Ptr
  • clang::NodeGroup::P
  • clang::ObjCMessageExpr::SelectorOrMethod
  • clang::OffsetOfNode::Data
  • clang::Selector::InfoPtr
  • clang::StmtIteratorBase::RawVAPtr
  • clang::Token::PtrData
  • clang::VTableComponent::Value
  • clang::api_notes::FunctionInfo::NullabilityPayload

Apart from inconsistency with newer code that uses PointerIntPair, ad-hoc tagged pointers require boilerplate on the side of debug formatters.

@Endilll Endilll added clang Clang issues not falling into any other category clang:static analyzer code-quality labels Oct 21, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 21, 2023

@llvm/issue-subscribers-clang-static-analyzer

Author: Vlad Serebrennikov (Endilll)

Here is an incomplete list of ad-hoc tagged pointers that should be refactored to use `llvm::PointerIntPair`: - [ ] clang::ActionResult::Value - [ ] clang::ASTWriter::DeclOrType::Stored - [ ] clang::CGBlockInfo::Capture::Data - [ ] clang::DeclAccessPair::Ptr - [ ] clang::DeclarationName::Ptr - [ ] clang::DesignatedInitExpr::Designator::NameOrField - [ ] clang::IdentifierResolver::Iterator::Ptr - [ ] clang::LazyOffsetPtr::Ptr - [ ] clang::NodeGroup::P - [ ] clang::ObjCMessageExpr::SelectorOrMethod - [ ] clang::OffsetOfNode::Data - [ ] clang::Selector::InfoPtr - [ ] clang::StmtIteratorBase::RawVAPtr - [ ] clang::Token::PtrData - [ ] clang::VTableComponent::Value - [ ] clang::api_notes::FunctionInfo::NullabilityPayload - [ ] clang::ento::SVal::Data

Apart from inconsistency with newer code that uses PointerIntPair, ad-hoc tagged pointers require boilerplate on the side of debug formatters.

@Endilll Endilll changed the title [clang] Migration to PointerIntPair [clang] Migration to PointerIntPair Oct 21, 2023
@Endilll Endilll added clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. and removed clang Clang issues not falling into any other category labels Oct 23, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2023

@llvm/issue-subscribers-clang-codegen

Author: Vlad Serebrennikov (Endilll)

Here is an incomplete list of ad-hoc tagged pointers that should be refactored to use `llvm::PointerIntPair`: - [ ] `clang::ActionResult::Value` - [ ] `clang::ASTWriter::DeclOrType::Stored` - [ ] `clang::CGBlockInfo::Capture::Data` - [ ] `clang::DeclAccessPair::Ptr` - [ ] `clang::DeclarationName::Ptr` - [ ] `clang::DesignatedInitExpr::Designator::NameOrField` - [ ] `clang::IdentifierResolver::Iterator::Ptr` - [ ] `clang::LazyOffsetPtr::Ptr` - [ ] `clang::NodeGroup::P` - [ ] `clang::ObjCMessageExpr::SelectorOrMethod` - [ ] `clang::OffsetOfNode::Data` - [ ] `clang::Selector::InfoPtr` - [ ] `clang::StmtIteratorBase::RawVAPtr` - [ ] `clang::Token::PtrData` - [ ] `clang::VTableComponent::Value` - [ ] `clang::api_notes::FunctionInfo::NullabilityPayload` - [ ] `clang::ento::SVal::Data`

Apart from inconsistency with newer code that uses PointerIntPair, ad-hoc tagged pointers require boilerplate on the side of debug formatters.

@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2023

@llvm/issue-subscribers-clang-frontend

Author: Vlad Serebrennikov (Endilll)

Here is an incomplete list of ad-hoc tagged pointers that should be refactored to use `llvm::PointerIntPair`: - [ ] `clang::ActionResult::Value` - [ ] `clang::ASTWriter::DeclOrType::Stored` - [ ] `clang::CGBlockInfo::Capture::Data` - [ ] `clang::DeclAccessPair::Ptr` - [ ] `clang::DeclarationName::Ptr` - [ ] `clang::DesignatedInitExpr::Designator::NameOrField` - [ ] `clang::IdentifierResolver::Iterator::Ptr` - [ ] `clang::LazyOffsetPtr::Ptr` - [ ] `clang::NodeGroup::P` - [ ] `clang::ObjCMessageExpr::SelectorOrMethod` - [ ] `clang::OffsetOfNode::Data` - [ ] `clang::Selector::InfoPtr` - [ ] `clang::StmtIteratorBase::RawVAPtr` - [ ] `clang::Token::PtrData` - [ ] `clang::VTableComponent::Value` - [ ] `clang::api_notes::FunctionInfo::NullabilityPayload` - [ ] `clang::ento::SVal::Data`

Apart from inconsistency with newer code that uses PointerIntPair, ad-hoc tagged pointers require boilerplate on the side of debug formatters.

@steakhal
Copy link
Contributor

AFAIK clang::ento::SVal::Data is not a bitpacked pointer, I think you can drop it from the list.
We have a bitpacked unsigned Kind value, which I don't think we can/should convert to PointerIntPair.
I believe, there is room for improvement though, but I'd need to think about how to untangle how our unsigned value is used actually.

@Endilll
Copy link
Contributor Author

Endilll commented Oct 23, 2023

AFAIK clang::ento::SVal::Data is not a bitpacked pointer, I think you can drop it from the list.

Upon closer inspection, I agree with you. I saw void * and enum { BaseBits = 2, BaseMask = 0b11 };, so I made a quick conclusion that they are related, but pointer is indeed not packed.

We have a bitpacked unsigned Kind value, which I don't think we can/should convert to PointerIntPair.
I believe, there is room for improvement though, but I'd need to think about how to untangle how our unsigned value is used actually.

Look at the following bit of implementation, it appears that you can break it down into two bit-fields: 2 bits and 30 bits (or less):

BaseKind getBaseKind() const { return (BaseKind) (Kind & BaseMask); }
unsigned getSubKind() const { return Kind >> BaseBits; }

Even if you need to combine them into single number for whatever reason, it should be trivial. But every time we do ad-hoc bit-packing without a type information attached that tells how to unpack it, debugging experience suffers.

#69104 should help with bit-field cases where it's impossible to let compiler know the type, because of Microsoft ABI implications.

@Endilll
Copy link
Contributor Author

Endilll commented Oct 24, 2023

clang::DeclarationName::Ptr turned out to be a hard nut to crack. It stores enumeration with 8 values, and pointer to 4 types. All said types are aligned on 8-byte boundary, so only 3 lower bits are available. It would be easy to crack into a PointerUnion and a single-bit integer if at most two enumerators were associated with any given type (2 bits for pointer discriminator between 4 types + 1 bit for two possible enumerators). Unfortunately, mapping is 3-3-1-1 (enumerators per each type). PointerIntPair and PointerUnion do not combine in a way clever enough to replace that.

There are not-so-good solutions like refactoring DeclarationName into storing 8 types (one per each enumerator), or aligning all 4 types on 16-byte boundary to make enough room for 2-bit PointerUnion discriminator and 2-bit integer for 3 possible enumeration values. I'm not going to push for any of those myself, providing a custom LLDB formatter for DeclarationName.

Endilll added a commit that referenced this issue Oct 25, 2023
Refactor `uintptr_t` inside of `clang::Selector` that holds a pointer to `IdentifierInfo` or `MultiKeywordSelector` and `IdentifierInfoFlag` enum into `PointerIntPair`. This is a part of `PointerIntPair` migration outlined in #69835.

Unlike `uintpt_t`, `PointerIntPair` required pointee types to be complete, so I had to shuffle definitions of `MultiKeywordSelector` and `detail::DeclarationNameExtra` around to make them complete at `Selector`. Also, there were outdated specializations of `PointerLikeTypeTraits` for `IdentifierInfo *`, which are no longer needed, because `alignof` that primary template use works just fine. Not just that, but they declared that `IdentifierInfo *` has only 1 spare lower bit, but today they are 8-byte aligned.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" code-quality
Projects
None yet
Development

No branches or pull requests

3 participants