-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Don't modify the on disk cache in fine-grained mode #4664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
They'll still merge conflict, but the resolution will be trivial now (I'm trying avoid making one depend on the other)
They'll still merge conflict, but the resolution will be trivial now (I'm trying avoid making one depend on the other)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few notes below. Please test this change manually with and without a remote cache before merging.
It would be useful to have the motivation for this change spelled out in the commit message. Is it more about performance of incremental updates or about correctness?
Longer-term, we may want to write cache files in fine-grained incremental mode, at least when not using a remote cache. This could improve performance if users frequently restart the daemon and don't use a remote cache (or if we decide to shut down the daemon after some time of inactivity). Can you create an issue about this?
if not self.options.use_fine_grained_cache: | ||
# Stores the initial state of sources as a side effect. | ||
self.fswatcher.find_changed() | ||
# Stores the initial state of sources as a side effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed to be executed unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that we have accurate fswatcher
cache information for files that we didn't read from the on-disk cache, now that we don't generate CacheMeta
s
mypy/dmypy_server.py
Outdated
# Run a fine-grained update starting from the cached data | ||
if self.options.use_fine_grained_cache: | ||
# Pull times and hashes out of the saved_cache and stick them into | ||
# the fswatcher, so we pick up the changes. | ||
for state in self.fine_grained_manager.graph.values(): | ||
meta = state.meta | ||
# If there isn't a meta, that means the current | ||
# version got checked in the initial build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on this logic here seems a bit fragile. Maybe move the meta is None
check to a helper method in State
and use the helper method here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing it to use is_cache_skeleton
instead.
The main rationale is closer to correctness than performance. Currently we write and delete cache files during the initial load of fine-grained mode, but never during fine-grained updates. This means that fine-grained can mess the cache up but won't fix it without a restart, which seems not great. |
This is a little subtle, because interface_hash still needs to be computed, as it is a major driver of the coarse-grained build process. Since metas are no longer computed for files that get rechecked during build, to avoid spuriously reprocessing them we need to find initial file state in cache mdoe as well.
f9076ed
to
8b3702d
Compare
Withdrawing this. Some changes I am making as part of my deletion optimization path is going to make this fall out much more simply. |
This is a little subtle, because interface_hash still needs to be
computed, as it is a major driver of the coarse-grained build process.
Since metas are no longer computed for files that get rechecked during
build, to avoid spuriously reprocessing them we need to find initial
file state in cache mode as well.