Skip to content

change to analyzer snapshots bumps dart binary size #50487

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
srawlins opened this issue Nov 16, 2022 · 16 comments
Closed

change to analyzer snapshots bumps dart binary size #50487

srawlins opened this issue Nov 16, 2022 · 16 comments
Labels
area-build Use area-build for SDK build issues. legacy-area-analyzer Use area-devexp instead. P2 A bug or feature request we're likely to work on

Comments

@srawlins
Copy link
Member

@jensjoha landed a change to how analyzer snapshots are trained.

This resulted in a sizeable bump to the size of the dart binary. This benchmark shows a bump from ~24MB to ~51MB. Is it acceptable and worth it? I have no idea who can/should answer this...

@jensjoha, the change "makes it faster when first loaded". Given the case of a user that bumps their Dart SDK from 2.18 to 2.19, or a similar bump which maybe happens 2-4x a year, is the performance literally only improved on the first time one of these snapshots is invoked? Is the performance improvement seen only 2-4 runs a year?

I would think that better training would mean that performance is improved on every run...

CC @bwilkerson @athomas @mraleph @sortie

@srawlins srawlins added legacy-area-analyzer Use area-devexp instead. area-build Use area-build for SDK build issues. labels Nov 16, 2022
@mraleph
Copy link
Member

mraleph commented Nov 16, 2022

@jensjoha Could you try compiling analyzer as an AOT binary and see how big that is and how well it performs against the JIT version?

@mnordine
Copy link
Contributor

This resulted in a sizeable bump to the size of the dart binary

You mean the analyzer snapshot, not the dart binary, right?

@athomas
Copy link
Member

athomas commented Nov 16, 2022

/cc @mit-mit

@mnordine
Copy link
Contributor

mnordine commented Nov 16, 2022

FWIW, on Windows, with an empty project, dart analyze
2.18.4: 807ms
master: 367ms

@mnordine
Copy link
Contributor

mnordine commented Nov 16, 2022

I've tried it on several projects, and, as expected with this change, it reduces total time by about 400-600ms, due to reduced startup time.

@mnordine
Copy link
Contributor

Just my 2 cents, but I'd be happy to pay the cost of 25MB on the SDK download to not pay that startup cost every time the analyzer starts.

@srawlins
Copy link
Member Author

You mean the analyzer snapshot, not the dart binary, right?

I'm honestly not sure what is being tracked in that benchmark. The title made me assume the dart binary, but maybe it's a snapshot file or the zipped up SDK or something.

@mnordine
Copy link
Contributor

mnordine commented Nov 16, 2022

Just on my machine, it's the analyzer snapshot file that increases in file size from ~25MB to ~50MB after that commit, nothing to do with the dart executable.

@jensjoha
Copy link
Contributor

@jensjoha, the change "makes it faster when first loaded". Given the case of a user that bumps their Dart SDK from 2.18 to 2.19, or a similar bump which maybe happens 2-4x a year, is the performance literally only improved on the first time one of these snapshots is invoked? Is the performance improvement seen only 2-4 runs a year?

There seems to be some misconception here. Whenever you load the analyzer (say open an IDE), before the VM started basically without any code compiled and would spend time doing that, making optimizations once the it had that information etc. Now it's ~just there from the start.
Even if the analyzer has all base code cached it will still be faster --- though I haven't benchmarked it, this certainly should include completions, invalidations etc although it naturally won't be going from 20 seconds to 10 seconds (given that a completion shouldn't take 20 seconds to begin with).
This is every time the analyzer starts. I will try to do some benchmarks though.
(Whenever using new code for the first time, though, (say a new package or a new version of a package, an updated flutter, or a change in own code that invalidates big chunks) it will - with big enough code - still be a 20 -> 10 seconds thing for the case where the analyzer before hadn't warmed up).

@jensjoha
Copy link
Contributor

jensjoha commented Nov 17, 2022

Let me try to take everything in order:

This resulted in a sizeable bump to the size of the dart binary. This benchmark shows a bump from ~24MB to ~51MB. Is it acceptable and worth it?

The dart binary doesn't change. The file analysis_server.dart.snapshot goes from 25MB to 51MB. Worth noticing when looking at these numbers is that the total sdk size (according to golem) goes from ~590 MB to ~620 MB, which is only a ~5% bump.
I knew all of this when landing and I sort of assumed the reviewers did as well (though I could be wrong). I guess I don't have any authority here, but I think it's worth it (which is also why I landed it).
(One could also argue that the space consumption was just previously too low as the training was only "temporarily" disabled by https://dart-review.git.corp.google.com/c/sdk/+/77228 --- though that was 2018).

@jensjoha, the change "makes it faster when first loaded". Given the case of a user that bumps their Dart SDK from 2.18 to 2.19, or a similar bump which maybe happens 2-4x a year, is the performance literally only improved on the first time one of these snapshots is invoked? Is the performance improvement seen only 2-4 runs a year?

I would think that better training would mean that performance is improved on every run...

I adressed this already but will do it again for completion.
I suspect the confusion comes from me stating that "[...] but will certainly - as seen above - make it faster when first loaded".
By that I didn't mean the first time the snapshot was ever run, but rather whenever the snapshot was started. It sort of starts at peak performance and thus, yes, performance is improved on every run.

The golem benchmarks also says this: "analysis-server-cold-analysis: -21.7 % RunTimeRaw" and "analysis-server-warm-analysis: -24.1 % RunTimeRaw".

The numbers will naturally vary depending on what's analyzed, but when I try on "pkg/compiler/lib" before and after with the cache already filled (via dart analyze pkg/compiler/lib) I get 1.813, 1.813, 1.850, 1.824, 1.955 before and 1.026, 1.018, 1.027, 1.041, 1.100 now which is -43.6845% +/- 3.82519%. So whenever I am to open an editor with dart2js in it the startup of the analyzer will be ~40% faster.

@jensjoha Could you try compiling analyzer as an AOT binary and see how big that is and how well it performs against the JIT version?

I don't think an AOT snapshot can function the way the app-jit one does as there is some weird replacement thing happening when it's started from pkg/dartdev (e.g. language-server via VmInteropHandler.run).
The exe created is 24MB and it takes ~40 seconds.
Comparing speed using the --train-using command line (the way dart analyze works involves sending a command to the process once it has started and is thus a little more involved) I get 11.2s, 11.1s, 11.0s with the snapshot and 8.7s, 8.6s, 8.4s with the exe, so it's actually quite significantly faster (-22.8228% +/- 2.63618%). As such it seems like a good way to go, but as said, because if the integration I don't think it's an easy thing.

You mean the analyzer snapshot, not the dart binary, right?

Yes, you are right.

FWIW, on Windows, with an empty project, dart analyze
2.18.4: 807ms
master: 367ms

Super. It's not just on my machine :D

Just my 2 cents, but I'd be happy to pay the cost of 25MB on the SDK download to not pay that startup cost every time the analyzer starts.

I agree. Though to be fair it's ~25MB of storage. The sdk download is zipped and the difference is going to be less. Trying to just zip the new snapshot the output is 20MB, zipping the old snapshot the output is 11MB, so the change in download size is likely more like 9MB.

@srawlins
Copy link
Member Author

@jensjoha thanks for clarifying and for the additional benchmarks! I really only filed this because the golem benchmark showed the jump which seemed to warrant the discussion.

A 5% bump to total SDK size seems totally fine to me. 🤷

@mraleph
Copy link
Member

mraleph commented Nov 17, 2022

As such it seems like a good way to go, but as said, because if the integration I don't think it's an easy thing.

Could you file a bug to track this. I think we should figure a way to go this route also it would align with what we are running on Golem - all those benchmarks are AOT compiled.

The exe created is 24MB and it takes ~40 seconds.

So it is not that much different from JIT snapshot. (24Mb exe means around 20Mb of AOT snapshot).

@jensjoha
Copy link
Contributor

So it is not that much different from JIT snapshot. (24Mb exe means around 20Mb of AOT snapshot).

It's smaller than the trained jit snapshot :)

@jensjoha
Copy link
Contributor

Could you file a bug to track this

#50498

@pq pq added the P2 A bug or feature request we're likely to work on label Nov 17, 2022
@mit-mit
Copy link
Member

mit-mit commented Nov 21, 2022

The mentioned SDK size increase SGTM

@srawlins
Copy link
Member Author

I'm satisfied; thanks for everyone's patience and responses!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-build Use area-build for SDK build issues. legacy-area-analyzer Use area-devexp instead. P2 A bug or feature request we're likely to work on
Projects
None yet
Development

No branches or pull requests

7 participants