Don't modify the on-disk cache in fine-grained mode #4664

Closed
msullivan wants to merge 2 commits from no_fg_cache_write

Conversation

msullivan
Collaborator

This is a little subtle, because interface_hash still needs to be
computed, as it is a major driver of the coarse-grained build process.

Since metas are no longer computed for files that get rechecked during
build, to avoid spuriously reprocessing them we need to find initial
file state in cache mode as well.
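
As a rough illustration of the first point above, the sketch below computes an interface hash on every build and only skips the on-disk write in fine-grained mode. It is a toy model, not mypy's code: `compute_interface_hash`, `finish_module`, the dict standing in for the disk cache, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def compute_interface_hash(interface: Dict[str, str]) -> str:
    """Hash a module's public interface in a stable, order-independent way."""
    serialized = json.dumps(interface, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def finish_module(
    interface: Dict[str, str],
    old_interface_hash: Optional[str],
    disk_cache: Dict[str, str],  # hypothetical stand-in for the on-disk cache
    module_id: str,
    fine_grained: bool,
) -> Tuple[str, bool]:
    """Return (new_hash, interface_changed).

    The hash is always computed, because dependents need it to decide
    whether they must be reprocessed. Only the cache write is skipped
    when running in fine-grained mode.
    """
    new_hash = compute_interface_hash(interface)
    interface_changed = new_hash != old_interface_hash
    if not fine_grained:
        disk_cache[module_id] = new_hash
    return new_hash, interface_changed
```

The point of the split is that the hash comparison and the cache write are independent decisions, so disabling the write does not have to disable the comparison.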

@msullivan msullivan requested a review from JukkaL March 2, 2018 02:39
msullivan added a commit that referenced this pull request Mar 2, 2018
They'll still merge conflict, but the resolution will be trivial now
(I'm trying to avoid making one depend on the other)
Collaborator

@JukkaL JukkaL left a comment

Looks good, just a few notes below. Please test this change manually with and without a remote cache before merging.

It would be useful to have the motivation for this change spelled out in the commit message. Is it more about performance of incremental updates or about correctness?

Longer-term, we may want to write cache files in fine-grained incremental mode, at least when not using a remote cache. This could improve performance if users frequently restart the daemon and don't use a remote cache (or if we decide to shut down the daemon after some time of inactivity). Can you create an issue about this?

if not self.options.use_fine_grained_cache:
    # Stores the initial state of sources as a side effect.
    self.fswatcher.find_changed()
# Stores the initial state of sources as a side effect.
Collaborator

Why was this changed to be executed unconditionally?

Collaborator Author

So that we have accurate fswatcher cache information for files that we didn't read from the on-disk cache, now that we don't generate CacheMetas.
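
To make the "side effect" concrete, here is a minimal sketch of a watcher whose `find_changed` call both reports changes and records the state it saw. `ToyWatcher` is a hypothetical stand-in written for this example and is not mypy's file-system watcher; only the method name `find_changed` is taken from the snippet under review.

```python
import os
from typing import Dict, Iterable, Optional, Set, Tuple

FileState = Optional[Tuple[float, int]]  # (mtime, size), or None if missing


class ToyWatcher:
    """Hypothetical watcher that remembers the last state seen per path."""

    def __init__(self) -> None:
        self._paths: Set[str] = set()
        self._state: Dict[str, FileState] = {}

    def add_watched_paths(self, paths: Iterable[str]) -> None:
        self._paths.update(paths)

    def find_changed(self) -> Set[str]:
        """Return paths whose state differs from the recorded one.

        As a side effect, the recorded state is updated, so calling this
        once up front establishes the baseline for later calls.
        """
        changed: Set[str] = set()
        for path in self._paths:
            try:
                st = os.stat(path)
                current: FileState = (st.st_mtime, st.st_size)
            except FileNotFoundError:
                current = None
            if self._state.get(path) != current:
                changed.add(path)
                self._state[path] = current
        return changed
```

Under this model, skipping the initial call leaves no baseline, so every existing file looks changed on the first real update. That is the kind of spurious reprocessing the unconditional call is meant to avoid.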

# Run a fine-grained update starting from the cached data
if self.options.use_fine_grained_cache:
    # Pull times and hashes out of the saved_cache and stick them into
    # the fswatcher, so we pick up the changes.
    for state in self.fine_grained_manager.graph.values():
        meta = state.meta
        # If there isn't a meta, that means the current
        # version got checked in the initial build.
Collaborator

Relying on this logic here seems a bit fragile. Maybe move the meta is None check to a helper method in State and use the helper method here instead?

Collaborator Author

Changing it to use is_cache_skeleton instead.
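
For readers following the review suggestion, the sketch below shows what moving the `meta is None` check behind a helper method could look like. `ToyState`, `ToyMeta`, `was_loaded_from_cache`, and `cached_file_info` are hypothetical names invented for this example; the actual change used `is_cache_skeleton`, whose exact semantics are not shown in this thread.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyMeta:
    mtime: int
    hash: str


@dataclass
class ToyState:
    path: str
    meta: Optional[ToyMeta] = None

    def was_loaded_from_cache(self) -> bool:
        """True if this module came from the cache rather than being rechecked."""
        return self.meta is not None


def cached_file_info(graph: Dict[str, ToyState]) -> Dict[str, ToyMeta]:
    """Collect times and hashes only for modules that were loaded from the cache."""
    info: Dict[str, ToyMeta] = {}
    for state in graph.values():
        if not state.was_loaded_from_cache():
            # Checked during the initial build, so there is no cached meta to use.
            continue
        assert state.meta is not None
        info[state.path] = state.meta
    return info
```

Callers then depend on a named predicate instead of on the convention that a missing meta implies a recheck, which keeps the convention in one place.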

@msullivan
Collaborator Author

The main rationale is closer to correctness than performance. Currently we write and delete cache files during the initial load of fine-grained mode, but never during fine-grained updates. This means that fine-grained can mess the cache up but won't fix it without a restart, which seems not great.

This is a little subtle, because interface_hash still needs to be
computed, as it is a major driver of the coarse-grained build process.

Since metas are no longer computed for files that get rechecked during
build, to avoid spuriously reprocessing them we need to find initial
file state in cache mode as well.
@msullivan msullivan force-pushed the no_fg_cache_write branch from f9076ed to 8b3702d Compare March 5, 2018 17:52
@msullivan
Collaborator Author

This has a bad interaction with #4669 and needs changes. My plan is to land #4669 and revise this after.

@msullivan
Collaborator Author

Withdrawing this. Some changes I am making as part of my deletion optimization path are going to make this fall out much more simply.

@msullivan msullivan closed this Mar 8, 2018
@msullivan msullivan deleted the no_fg_cache_write branch March 8, 2018 00:23