Run git gc periodically on the crates.io index #778
Comments
@pietroalbini @jyn514 the most relevant place I found to insert it is in one of the daemons such as: If you think it's a fitting location, I'll start working on a PR.
I think a better spot would be in the registry watcher: https://github.com/rust-lang/docs.rs/blob/master/src/utils/daemon.rs#L18, but it should run much less often than every 30 seconds. Maybe you can add a counter and every time it rolls over to 1000 run
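The counter idea in this comment could be sketched roughly as follows. `GcCounter`, `tick`, and the period value are hypothetical names chosen for illustration; they are not taken from the docs.rs codebase.

```rust
/// Hypothetical helper (not part of docs.rs): reports `true` once every
/// `period` calls to `tick`, so a loop that wakes up every 30 seconds
/// can trigger an expensive action only occasionally.
pub struct GcCounter {
    period: u32,
    count: u32,
}

impl GcCounter {
    pub fn new(period: u32) -> Self {
        assert!(period > 0, "period must be non-zero");
        Self { period, count: 0 }
    }

    /// Advance by one loop iteration. Returns `true` when the counter
    /// reaches `period`, then starts counting again from zero.
    pub fn tick(&mut self) -> bool {
        self.count += 1;
        if self.count >= self.period {
            self.count = 0;
            true
        } else {
            false
        }
    }
}
```

With a period of 1000 and a 30-second loop, `tick` would return `true` roughly every 8 hours and 20 minutes.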
@jyn514 thanks for the quick reply. Line 101 in 5c7f288
It runs every hour, and the action before it is an update of the repos, which probably adds garbage for Do you think it's a good spot?
I think a more reliable approach is to count the files in
@pietroalbini
Why would you want to redo the same logic it contains? Anything that I'm missing?
Mostly to reduce the number of times we shell out to
Yeah, obviously fewer forks, but it comes at the cost of extra code to maintain, which isn't really in (I don't mind implementing some watcher over the directory as you suggested, I need the practice 😄 )
Let's not put it there: the origin of the extra files is in the registry watcher (update_all_crates just queries the GitHub API), and I'd really prefer all the relevant code to be in one place instead of being scattered around the codebase. Yes, that thread loops every 30 seconds, but you can do something like this:

    let mut last_gc = Instant::now();
    loop {
        // Existing code
        if last_gc.elapsed() > Duration::from_secs(config.registry_gc_interval) {
            // Run `git gc`
            last_gc = Instant::now();
        }
    }
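A slightly fuller sketch of the timer approach above, including the shell-out to `git gc`, might look like this. `GcTimer`, `run_git_gc`, and `maybe_gc` are hypothetical names, and how the repository path and interval are obtained is an assumption; this is not the code that was eventually merged.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};
use std::time::{Duration, Instant};

/// Hypothetical helper: remembers when `git gc` last ran.
pub struct GcTimer {
    interval: Duration,
    last_gc: Instant,
}

impl GcTimer {
    pub fn new(interval: Duration) -> Self {
        Self { interval, last_gc: Instant::now() }
    }

    /// True once at least `interval` has passed since the last reset.
    pub fn is_due(&self) -> bool {
        self.last_gc.elapsed() >= self.interval
    }

    pub fn reset(&mut self) {
        self.last_gc = Instant::now();
    }
}

/// Shell out to `git gc` inside `repo` (assumes `git` is on the PATH).
pub fn run_git_gc(repo: &Path) -> io::Result<ExitStatus> {
    Command::new("git").arg("gc").current_dir(repo).status()
}

/// Run `git gc` only when the timer says it is due.
/// Returns Ok(true) if gc ran and exited successfully.
pub fn maybe_gc(timer: &mut GcTimer, repo: &Path) -> io::Result<bool> {
    if !timer.is_due() {
        return Ok(false);
    }
    let status = run_git_gc(repo)?;
    // Reset even when gc exits with an error status, so a persistently
    // failing gc is not retried on every 30-second iteration.
    timer.reset();
    Ok(status.success())
}
```

Calling `maybe_gc` once per iteration of the existing loop keeps the gc logic next to the code that produces the extra files, which is the point made in the comment above.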
@pietroalbini @jyn514
I don't think there is any test for that at all. Until around a year ago docs.rs had no tests at all (one of the focuses of the past year was building a test suite), so it's not that surprising.
Yeah using
@pietroalbini thank you for the details.
This was implemented in #975.
We discovered that the clone on the index managed by docs.rs reached 2k+ packfiles, which caused spikes of thousands of open FDs and 20% CPU usage every time the repository was queried. We should add code that performs `git gc` periodically.
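To check whether a clone is approaching the packfile count described above, one option is to count the `*.pack` files directly. The sketch below assumes git's standard layout, where packfiles live under `objects/pack` inside the git directory. `count_packfiles`, `needs_gc`, and the threshold are hypothetical and not part of docs.rs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Count `*.pack` files in `<git_dir>/objects/pack`.
/// `git_dir` is assumed to be the repository's git directory
/// (for a non-bare clone, the `.git` folder).
pub fn count_packfiles(git_dir: &Path) -> io::Result<usize> {
    let pack_dir = git_dir.join("objects").join("pack");
    let mut count = 0;
    for entry in fs::read_dir(pack_dir)? {
        let path = entry?.path();
        // Only `.pack` files are counted; `.idx` and other companions are skipped.
        if path.extension().and_then(|ext| ext.to_str()) == Some("pack") {
            count += 1;
        }
    }
    Ok(count)
}

/// Hypothetical policy: gc is worthwhile once the count exceeds `threshold`.
pub fn needs_gc(git_dir: &Path, threshold: usize) -> io::Result<bool> {
    Ok(count_packfiles(git_dir)? > threshold)
}
```

This avoids spawning a process just to decide whether gc is needed, at the cost of duplicating a check that git can already perform itself.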