-
Notifications
You must be signed in to change notification settings - Fork 27
Optimization: cache results between TypeScript projects #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Towards #175 Previously, scip-typescript didn't cache anything at all between TypeScript projects. This commit implements an optimization so that we now cache the results of loading source files and parsing options. Benchmarks against the sourcegraph/sourcegraph repo indicate this optimization consistently speeds up the `index` command in all three multi-project repositories that I tested it with. - sourcegraph/sourcegraph: ~30% from ~100s to ~70s - nextautjs/next-auth: ~40% from 6.5s to 3.9 - xtermjs/xterm.js: ~45% from 7.3s to 4.1s For every repo, I additionally validated that the resulting index.scip has identical checksum before and after applying this optimization. Given these promising results, this new optimization is enabled by default, but can be disabled with the option `--no-global-cache`. *Test plan* Manually tested by running `scip-typescript index tsconfig.all.json` in the sourcegraph/sourcegraph repository. To benchmark the difference for this PR: - Checkout the code - Run `yarn tsc -b` - Go to the directory of your project - Run `node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js` - Copy the "optimized" index.scip with `cp index.scip index-withcache.scip` - Run `node PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js --no-global-caches` - Validate the checksum is identical from the optimized output `shasum -a 256 *.scip`
| if (!hostCopy.getParsedCommandLine) { | ||
| return undefined | ||
| } | ||
| if (cache.parsedCommandLines.has(fileName)) { | ||
| return cache.parsedCommandLines.get(fileName) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand all the cases here; is it possible that the cache has a hit for this even though hostCopy.getParsedCommandLine is missing? (when is getParsedCommandLine missing?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding your comment, but we guard against hostCopy.getParsedCommandLine at the top, so even if the cache has a hit we return undefined when hostCopy.getParsedCommandLine is undefined.
| const fromCache = cache.sources.get(fileName) | ||
| if (fromCache !== undefined) { | ||
| return fromCache | ||
| } | ||
| const result = hostCopy.getSourceFile(fileName, languageVersion) | ||
| if (result !== undefined) { | ||
| cache.sources.set(fileName, result) | ||
| } | ||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Maybe add a comment here or near the cache definition which says "It is safe use the file name as a key here because we assume that a source file only ever belongs to a single project, and is only ever type-checked with a single language version".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I added an additional check to additionally only cache when languageVersion is unchanged (excluding a setExternalModuleIndicator field that is a function that is never the same. I also added a comment explaining in more details. I also discovered that getSourceFile has two optional parameters that we're now properly forwarding.
|
Very nice! |
|
|
||
| // defaults | ||
| checkIndexParser([], { | ||
| progressBar: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this get removed? Shouldn't there be an extra globalCaches: true here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention it in the description but this PR also disables the fancy progress bar by default. I didn't measure any performance difference with it enabled/disabled, but I found it distracting while testing projects. scip-typescript runs >99% of the time in CI where you should disable it by default anyways and I'm guessing most people are not doing it.
olafurpg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review! 🙏🏻
|
|
||
| // defaults | ||
| checkIndexParser([], { | ||
| progressBar: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention it in the description but this PR also disables the fancy progress bar by default. I didn't measure any performance difference with it enabled/disabled, but I found it distracting while testing projects. scip-typescript runs >99% of the time in CI where you should disable it by default anyways and I'm guessing most people are not doing it.
| if (!hostCopy.getParsedCommandLine) { | ||
| return undefined | ||
| } | ||
| if (cache.parsedCommandLines.has(fileName)) { | ||
| return cache.parsedCommandLines.get(fileName) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be misunderstanding your comment, but we guard against hostCopy.getParsedCommandLine at the top, so even if the cache has a hit we return undefined when hostCopy.getParsedCommandLine is undefined.
| const fromCache = cache.sources.get(fileName) | ||
| if (fromCache !== undefined) { | ||
| return fromCache | ||
| } | ||
| const result = hostCopy.getSourceFile(fileName, languageVersion) | ||
| if (result !== undefined) { | ||
| cache.sources.set(fileName, result) | ||
| } | ||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! I added an additional check to additionally only cache when languageVersion is unchanged (excluding a setExternalModuleIndicator field that is a function that is never the same. I also added a comment explaining in more details. I also discovered that getSourceFile has two optional parameters that we're now properly forwarding.
Towards #175
Previously, scip-typescript didn't cache anything at all between TypeScript projects. This commit implements an optimization so that we now cache the results of loading source files and parsing options. Benchmarks against the sourcegraph/sourcegraph repo indicate this optimization consistently speeds up the
indexcommand in all three multi-project repositories that I tested it with.For every repo, I additionally validated that the resulting index.scip has identical checksum before and after applying this optimization. Given these promising results, this new optimization is enabled by default, but can be disabled with the option
--no-global-cache.Test plan
Manually tested by running
scip-typescript index tsconfig.all.jsonin the sourcegraph/sourcegraph repository. To benchmark the difference for this PR:yarn tsc -bnode PATH_TO_SCIP_TYPESCRIPT/dist/src/main.jscp index.scip index-withcache.scipnode PATH_TO_SCIP_TYPESCRIPT/dist/src/main.js --no-global-cachesshasum -a 256 *.scip