-
Notifications
You must be signed in to change notification settings - Fork 140
CBG-5032 Add revcache value lock for synchronized delta generation #7903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a shared mutex (deltaLock) to the revision cache value structure to synchronize delta generation between multiple clients requesting the same fromRev. This prevents redundant delta computation when multiple clients concurrently request deltas for the same source revision.
Key changes:
- Added
sync.RWMutextorevCacheValuestruct to coordinate delta generation across concurrent requests - Modified
GetDeltafunction to acquire read lock first (checking for cached delta), then upgrade to write lock if delta generation is needed - Added comprehensive test and benchmark suite demonstrating up to 99.99% performance improvement in high-concurrency scenarios
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
db/revision_cache_lru.go |
Added deltaLock field to revCacheValue struct and exposed it through DocumentRevision |
db/revision_cache_interface.go |
Added RevCacheValueDeltaLock field to DocumentRevision struct |
db/crud.go |
Refactored GetDelta to use read/write locking pattern for synchronized delta generation |
db/database_test.go |
Added test and benchmark cases for concurrent delta generation scenarios |
db/crud.go
Outdated
| if fromRevision.BodyBytes == nil && fromRevision.Delta == nil { | ||
| return nil, nil, err | ||
| // locking ensures clients requesting deltas sharing the same from revision memoize the work to generate the delta and share the cached result | ||
| initialFromRevision.RevCacheValueDeltaLock.RLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have separate synchronization for actually updating the delta in the rev cache (in UpdateDelta) - I believe the intention is that DocumentRevision.Delta points to an immutable delta, and in UpdateDelta a new delta is stored and referenced. Given that, do we actually need to acquire a read lock on RevCacheValueDeltaLock, if the only intention is to prevent concurrent delta computation? Is it sufficient to just acquire the write lock before starting the work to fetch/compute the delta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this analysis, thanks. Dropped the lock to a regular mutex and dropped the read lock before we move on to write lock when needed.
db/crud.go
Outdated
| var fromRevisionForDiff DocumentRevision | ||
|
|
||
| isAuthorized, redactedBody := db.authorizeUserForChannels(docID, toRev, fromRevision.CV, fromRevision.Delta.ToChannels, fromRevision.Delta.ToDeleted, encodeRevisions(ctx, docID, fromRevision.Delta.RevisionHistory)) | ||
| // check the revcache again for a delta - it's possible another writer generated it while we were waiting for the write lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the only performance overhead of this change is this additional rev cache Get the first time the delta is computed. In a cases where only one reader is using the delta this will be a small amount of additional overhead, but I expect the cache fetch (which should always hit) will be trivial compared to delta computation in the first place. Agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, agreed.
I did write BenchmarkDeltaSyncSingleClientCachePopulation just to check, which just runs GetDelta over 1000 documents and there was no difference between main and with the double cache fetch.
~4ms per GetDelta computation in both cases and no difference in heap allocs.
$ benchstat main-1000docs1client.bench 5032-1000docs1client.bench
goos: darwin
goarch: arm64
pkg: github.com/couchbase/sync_gateway/db
cpu: Apple M1 Pro
│ main-1000docs1client.bench │ 5032-1000docs1client.bench │
│ sec/op │ sec/op vs base │
DeltaSyncSingleClientCachePopulation/100KBDoc-10 4.173 ± ∞ ¹ 4.176 ± ∞ ¹ ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
│ main-1000docs1client.bench │ 5032-1000docs1client.bench │
│ B/op │ B/op vs base │
DeltaSyncSingleClientCachePopulation/100KBDoc-10 7.199Gi ± ∞ ¹ 7.200Gi ± ∞ ¹ ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
│ main-1000docs1client.bench │ 5032-1000docs1client.bench │
│ allocs/op │ allocs/op vs base │
DeltaSyncSingleClientCachePopulation/100KBDoc-10 127.4k ± ∞ ¹ 127.4k ± ∞ ¹ ~ (p=1.000 n=1) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
CBG-5032
Adds a new shared mutex on RevCache Value that is shared by concurrent GetDelta callers to syncronise delta generation between multiple clients on the same fromRev.
Benchmark:
Integration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGatewayIntegration/181/