-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Description
What version of Go are you using (go version
)?
$ go version go version go1.11.4 linux/amd64
What version of Badger are you using?
v1.6
Does this issue reproduce with the latest master?
Yes
Recently I saw a increasing memory usage as well as data accumulation of WAL in dgraph raft progress. Then I export KVs and find most of which are deleted records. As dgraph raft record all entities and delete older ones during snapshot, most of these should be removed during SST table compaction.
After a view of Badger compaction strategy, I realize keys can only be removed when compact to a non-overlay next level.
Review func compactBuildTables in file github.com/dgraph-io/badger/levels.go
func (s *levelsController) compactBuildTables(
lev int, cd compactDef) ([]*table.Table, func() error, error) {
topTables := cd.top
botTables := cd.bot
var hasOverlap bool
{
kr := getKeyRange(cd.top...)
for i, lh := range s.levels {
if i <= lev { // Skip upper levels.
continue
}
lh.RLock()
left, right := lh.overlappingTables(levelHandlerRLocked{}, kr)
lh.RUnlock()
if right-left > 0 {
hasOverlap = true
break
}
}
}
...
There's hasOverlap check that works.
As preserved keys like !badger!head write to every single level0 table and inherited by all lower levels, there tends to be hasOverlap, even for neighborhoods that are being compacted with empty lower levels.
For example:
L0: sst5 sst6 sst7
L1:sst4
L2:empty
...
We want to compact sst5, sst6 and sst7 in level0 to level1, as sst4 in L1 overlap with tables in L0, the result will be:
L0: empty
L1:sst8, sst9
L2:empty
...
Assuming that there are some deleting records in sst5/sst6/sst7, those records will write back to sst8/sst9 and it's hard to remove these records.
I think it could be improved by check like codes below.
func (s *levelsController) checkOverlap(lev int, tables []*table.Table) bool {
var hasOverlap bool
kr := getKeyRange(tables)
for i, lh := range s.levels {
if i <= lev { // Skip upper levels.
continue
}
lh.RLock()
left, right := lh.overlappingTables(levelHandlerRLocked{}, kr)
lh.RUnlock()
if right-left > 0 {
hasOverlap = true
break
}
}
return hasOverlap
}
func (s *levelsController) compactBuildTables(
lev int, cd compactDef) ([]*table.Table, func() error, error) {
topTables := cd.top
botTables := cd.bot
var hasOverlap bool
hasOverlap = s.checkOverlap(lev+1, cd.top) || s.checkOverlap(lev+1, cd.bot)
...
Here we check if L0 and L1 overlap with L2 etc. while omitting overlap between L0 and L1, as they will compact later.
Please let me know if it works or any unknown faults.
Thank you!