x/tools/gopls: incremental gopls scaling #57987

Closed
@findleyr

Incremental gopls

This is an umbrella issue to track the work we've been doing to change the way gopls scales (and significantly reduce memory and CPU usage on large codebases). We've been working off an internal design, but I wanted to share some of that here and have a public issue to track the work.

The main goal of this project is to use on-disk export data and indexing to allow gopls to persist data across sessions, reloading on-demand, thereby reducing its overall memory footprint and startup time.

As a result of this change, gopls memory usage (at least related to package information) will be O(open packages), rather than O(workspace). Furthermore, we will break global relationships between elements of gopls' cache, simplifying gopls' execution model and eliminating multiple categories of bugs.

Background

Gopls, as it very recently existed, was a monolithic build system. Certain algorithms relied on a global package graph containing the full set of workspace packages. For example, to find references to a given identifier, gopls simply walked all packages in the reverse transitive cone of the declaring package(s) and checked types.Info.Uses for references to the declared object. For this algorithm to be correct and sufficiently fast, gopls had to hold all type-checked packages in memory (including many redundant intermediate test variants!).

We can't solve gopls' scaling problems until we rewrite these algorithms and fix all the places where gopls makes assumptions of global identity. This is what we've been working on.

High level plan

  1. Design a shallow export data format that does not bundle its transitive closure.
  2. Add a mechanism for on-disk caching.
  3. Implement package analysis using export data.
  4. Rewrite workspace-wide queries to use indexes that are independent of types.Package or types.Object identity.
  5. Separate the concept of a "syntax package" (something containing AST and types.Info) from an "export package" (a types.Package with type information for exported symbols). Syntax packages are used to fully understand the syntax a user is working on. Export packages are used for type checking. Currently, gopls has a concept of "exported parse mode", which produces a syntax package on a truncated AST. This exists to reduce memory, but means that syntax packages may be partial, a source of many historical and current bugs. All current uses of partial packages in the package graph can (and must) be eliminated or replaced with a judicious use of parsing or type-checking on-demand.
  6. Create a control plane to manage package information that must be preserved in memory vs re-computed on demand or transiently cached.
  7. When importing during type-checking, use export packages for packages outside the workspace. For now, continue to produce syntax packages for all packages inside the workspace.
  8. Persist and load export packages from disk.
  9. Load xrefs and methodset indexes from disk, rather than hanging them off of syntax packages.
  10. Drop all syntax packages, except those with open files.
  11. Implement precise pruning.
  12. Investigate holding on to packages imported by open packages, to reduce re-type-checking latency.
  13. Revisit diagnostic storage and retrieval: diagnostics are re-accessed in multiple places on the assumption that they are free to retrieve. Fix TestBadlyVersionedModule.
  14. Improve the UX around indexing (better progress notifications, partial results, etc).
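As an illustration of steps 4 and 9, here is a rough sketch of a cross-reference index that does not depend on types.Package or types.Object identity. Everything in it is an assumption made for the example: the `objectKey` string format, the `location` and `xrefIndex` types, and the use of encoding/gob are not the format gopls uses. The point is only that once objects are named by strings, each package's index can be serialized, stored, and queried on its own, without a shared type-checked graph.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// location is a reference site recorded as plain data.
type location struct {
	File string
	Line int
}

// xrefIndex maps an object key to the places one package refers to it.
type xrefIndex map[string][]location

// objectKey names a package-level object by strings instead of by a
// types.Object pointer, so it stays meaningful across sessions.
func objectKey(pkgPath, name string) string {
	return pkgPath + "#" + name
}

// encode serializes an index so it could be written to disk.
func (x xrefIndex) encode() ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(x); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeIndex restores an index from its serialized form.
func decodeIndex(data []byte) (xrefIndex, error) {
	var x xrefIndex
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&x); err != nil {
		return nil, err
	}
	return x, nil
}

// references merges the hits for key across independently loaded indexes.
func references(indexes []xrefIndex, key string) []location {
	var out []location
	for _, idx := range indexes {
		out = append(out, idx[key]...)
	}
	return out
}

func main() {
	key := objectKey("example.com/lib", "Helper")

	// Each importing package contributes its own index.
	pkgA := xrefIndex{key: {{File: "a/a.go", Line: 10}}}
	pkgB := xrefIndex{key: {{File: "b/b.go", Line: 3}, {File: "b/b.go", Line: 8}}}

	// Round-trip through bytes to stand in for storing and reloading from disk.
	var loaded []xrefIndex
	for _, idx := range []xrefIndex{pkgA, pkgB} {
		data, err := idx.encode()
		if err != nil {
			panic(err)
		}
		restored, err := decodeIndex(data)
		if err != nil {
			panic(err)
		}
		loaded = append(loaded, restored)
	}

	for _, loc := range references(loaded, key) {
		fmt.Printf("%s:%d\n", loc.File, loc.Line)
	}
}
```

A real key would also need to address methods, fields, and other non-package-level objects, which this sketch does not attempt.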

Metadata

Labels

FrozenDueToAge
Tools: This label describes issues relating to any tools in the x/tools repository.
gopls: Issues related to the Go language server, gopls.
gopls/performance: Issues related to gopls performance (CPU, memory, etc).
