-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[clang-doc] Improve clang-doc performance through memoization #96809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
15 commits
Select commit
Hold shift + click to select a range
761d1b6
remove USR
PeterChou1 d3bf9cf
test
PeterChou1 8a2cb8f
[clang-doc] add short circuit in mapper
PeterChou1 acb593b
[clang-doc] add short circuit to improve performance
PeterChou1 5970a81
[clang-doc] distinguish between declaration and definition
PeterChou1 963f0a3
[clang-doc] address pr comments
PeterChou1 e4391b2
[clang-doc] remove debug code
PeterChou1 851f560
[clang-doc] switch to string ref
PeterChou1 5998283
[clang-doc] account anonymous typedef
PeterChou1 59fb340
[clang-doc] clang-format + addressing pr comments
PeterChou1 2486604
[clang-doc] fix compiler warning
PeterChou1 d88c1e7
[clang-doc] status
PeterChou1 2197e88
[clang-doc] remove useless try catch
PeterChou1 3fd1d92
[clang-doc] clang-format
PeterChou1 0bd5a3e
[clang-doc] remove debugging line
PeterChou1 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this logic makes sense. Why is it unreasonable to expect a USR to be valid only when its a definition? Is there some logic that prevents things that aren't definitions from being able to hold documentation? What I'm worried about, is the case where we have several USRs that contain different documentation bits that would have been merged in the past, but now wont.
If there is a concrete reason, please document that here in the comments, and in the commit message.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My logic was that definition of the USR is parsed last, so when the ASTVisitor visits the definition it would have already parsed every other USR that points to the same declaration. So we can safely short circuit, since every other fragments of USR would've been parsed already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really parsed last in all cases? Isn't it possible to have multiple of these definitions, depending on the scope of the AST construct and the target options. For instance, what if some code is compiled w/ a particular
#define
enabled and that provides different documentation than was found previously, but is compiled elsewhere without that define? I can easily imagine that affecting header code, where documentation is likely to be.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I hadn't considered that this was my clumsy attempt at trying to remove the redundant work done by the ASTVisitor. Since we are undeniably doing some redundant work. if you take a look at the Shape class from the e2e test you'll see that we visit the declaration 3 times once for parsing the initial file and the twice more for each subclass. Is there any other mechanism that prevents this type of behaviour? This essentially what this patch is trying to accomplish
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is why we need more tests that exercise a diverse set of behavior. Some of the different targets should control the documentation bits w/ defines, and we should also make sure that we're covering all the ways a USR could be brought in. For instance, if there was only one base class that brought in
Shape
, I'm not sure how much better you could do, since I'd assume it would only be included once, and then parsed in full the one time.For more complicated arrangements, we'd need to be sure the definitions, and the documentation would be resolved the same way in all cases. I'm not aware of a mechanism that does that off the top of my head, but I would expect the
clangd
indexing mechanic probably has a way to handle this on some level. I'd also expect that the logic inscan-deps
must have some mechanism for not processing files multiple times.I'm not confident enough to say what you're proposing is outright wrong, but I'm also not confident that its correct, either. What I'm saying is that we need to be sure, that when we see a definition in
clang-doc
that it won't change from out beneath us. That said, given the merge logic in the reduction phase, perhaps if the USR is complete(i.e. has no missing fields), the memoization is sufficient.