Skip to content

One configurable cache per repository pool #464

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 18, 2018
Merged

One configurable cache per repository pool #464

merged 5 commits into from
Sep 18, 2018

Conversation

kuba--
Copy link
Contributor

@kuba-- kuba-- commented Sep 10, 2018

This PR closes #440

Changes:

  • Updated go-git
  • repository interface requires Cache() cache.Object
  • one cache object per RepositoryPool

kuba-- added 2 commits September 10, 2018 10:10
Signed-off-by: kuba-- <[email protected]>
@kuba-- kuba-- added proposal proposal for new additions or changes performance Performance improvements labels Sep 10, 2018
@kuba-- kuba-- requested review from jfontan and a team September 10, 2018 11:25
Password string `short:"P" long:"password" default:"" description:"Password used for connection."`
PilosaURL string `long:"pilosa" default:"http://localhost:10101" description:"URL to your pilosa server." env:"PILOSA_ENDPOINT"`
IndexDir string `short:"i" long:"index" default:"/var/lib/gitbase/index" description:"Directory where the gitbase indexes information will be persisted." env:"GITBASE_INDEX_DIR"`
CacheSize cache.FileSize `long:"cache" default:"536870912" description:"Object cache size" env:"GITBASE_CACHE_SIZE"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you change the default to something more human-friendly? like 100 and allows only MB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah sure, how about 512MB?

return &RepositoryPool{
repositories: make(map[string]repository),
cache: cache.NewObjectLRU(maxCacheSize),
Copy link
Contributor

@jfontan jfontan Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly we are using the same cache for all repositories. While this may be OK now we may have problems when partitions are used:

  • I'm not really sure of the cache implementation thread safety
  • Recent objects from one repo may be evicted by others and make it run slower
  • Concurrent use of the same cache may be slower (locking?)

I would have one cache per repo, a pool of caches that can be evicted or one cache per partition. I think we can have one per repo for now and get back to it when partitions are in use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem of having one cache per repo is that we will not able to know the amount of memory gitbase will use. Maybe we can use as a default value 96 MiB * number of repositories. Or apart from that, maybe we can implement an LRU with two key layers, to evict keys from a specific repository. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe we can keep one cache in a pool but add mutex and prefix keys by repo id?

@@ -178,17 +188,17 @@ func (p *RepositoryPool) AddGit(path string) error {

// AddGitWithID adds a git repository to the pool. ID should be specified.
func (p *RepositoryPool) AddGitWithID(id, path string) error {
return p.Add(gitRepo(id, path))
return p.Add(gitRepo(id, path, p.cache))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't store the cache in the git/sivaRepo. If they are different instances (one per repo) it can use lots of memory.

@kuba--
Copy link
Contributor Author

kuba-- commented Sep 14, 2018

@jfontan - rebased. Lets go with the simplest approach, so far.

Copy link
Contributor

@jfontan jfontan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with this approach and iterate over it. I agree.

@ajnavarro ajnavarro merged commit 1bb9a66 into src-d:master Sep 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance improvements proposal proposal for new additions or changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Increase go-git cache size
4 participants