Skip to content

memoize the common fragments calculation for diffing overloads #992

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 6, 2024

Conversation

QuietMisdreavus
Copy link
Contributor

Bug/issue #, if applicable: None

Summary

This PR refactors the overload declaration diffing code to memoize the results so that they don't need to be recalculated. The original diffing code recalculates the "common fragments" every time it needs to render a page. However, this wastes work for overloads in the same group, since each symbol's page needs to be calculated individually. By saving those results, we can load the common fragments without having to perform an expensive calculation.

This benchmark was run on a large framework with many overloaded methods:

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Metric                                   │ Change          │ main                 │ current              │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Duration for 'bundle-registration'       │ no change¹,²    │ 9.523 sec            │ 9.475 sec            │
│ Duration for 'convert-total-time'        │ -2.796%³        │ 14.724 sec           │ 14.312 sec           │
│ Duration for 'documentation-processing'  │ -7.142%⁴        │ 5.091 sec            │ 4.727 sec            │
│ Duration for 'finalize-navigation-index' │ no change⁵      │ 0.04 sec             │ 0.04 sec             │
│ Peak memory footprint                    │ no change⁶      │ 698.7 MB             │ 704 MB               │
│ Data subdirectory size                   │ no change⁷      │ 129.8 MB             │ 129.8 MB             │
│ Index subdirectory size                  │ no change⁸      │ 1.1 MB               │ 1.1 MB               │
│ Total DocC archive size                  │ no change⁹      │ 232.1 MB             │ 232.1 MB             │
│ Topic Anchor Checksum                    │ no change       │ 53074b40863239b7a00d │ 53074b40863239b7a00d │
│ Topic Graph Checksum                     │ no change       │ f4af046ddef9eab49ba8 │ f4af046ddef9eab49ba8 │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Dependencies

None

Testing

Ensure that no functionality has changed.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • [ n/a ] Added tests
  • Ran the ./bin/test script and it succeeded
  • [ n/a ] Updated documentation if necessary

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test Linux

Copy link
Contributor

@patshaughnessy patshaughnessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice speedup 👍 - just some questions before I approve this.

/// Translates a symbol's declaration into a render node's Declarations section.
struct DeclarationsSectionTranslator: RenderSectionTranslator {
/// A mapping from a symbol reference to the "common fragments" in its overload group.
static let commonFragmentsMap: Synchronized<[ResolvedTopicReference: [SymbolGraph.Symbol.DeclarationFragments.Fragment]]> = .init([:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why define this as static? It will work fine, I think, but would fail someday if we ever compile more than one documentation project in the same DocC process, because we would have values left over from previous runs. As written, we assume the map will only ever be used once. Might be safer to just make this a normal attribute of DeclarationsSectionTranslator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeclarationsSectionTranslator instances are not shared between different symbols' renderings. If i made this an instance property, the saved results would not be available the next time around.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, ok makes sense.

fragments: [SymbolGraph.Symbol.DeclarationFragments.Fragment])
{
Self.commonFragmentsMap.sync({ map in
for reference in references {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why bother saving the fragments for all of the overload declarations? Reading this, I'm surprised that setCommonFragments takes an array of references, but commonFragments only queries one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire trick being relied upon here is that the list of common fragments is going to be the same for a group of overloaded declarations. So the first one in a group that gets rendered can save the calculation for the whole group. So when we check for a cached result, we only need to check for one of the declarations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the mainDeclaration will change depending on when we call commonFragments below? Thanks for the explanation.

for mainDeclaration: OverloadDeclaration,
overloadDeclarations: [OverloadDeclaration]
) -> [SymbolGraph.Symbol.DeclarationFragments.Fragment] {
if let fragments = commonFragments(for: mainDeclaration.reference) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the previous comment. If you only ever query the fragments for the main declaration here, then why save the same fragments for each of the overloads in the call to setCommonFragments below?


// Collect the "common fragments" so we can highlight the ones that are different
// in each declaration
let commonFragments = longestCommonSubsequence(preProcessedDeclarations)
let commonFragments = commonFragments(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, it appears you only ever need the common fragments for the main declaration, and not for the individual overloads.

@patshaughnessy patshaughnessy self-requested a review July 25, 2024 14:34
@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus QuietMisdreavus merged commit afa6752 into swiftlang:main Aug 6, 2024
2 checks passed
@QuietMisdreavus QuietMisdreavus deleted the memoize-diff branch August 6, 2024 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants