Create guide doc for partition compaction #6512
Merged
merged 35 commits into from
Mar 27, 2025
Commits
df10150
Purge expired postings cache items due inactivity (#6502)
alanprot Jan 10, 2025
f53d0bc
Update thanos to 4ba0ba403896 (#6503)
dsabsay Jan 12, 2025
d6de77c
Bump the actions-dependencies group across 1 directory with 2 updates…
dependabot[bot] Jan 14, 2025
b95024f
calculate # of concurrency only once at the runner (#6506)
SungJin1212 Jan 14, 2025
0d17b32
Implement partition compaction planner (#6469)
alexqyle Jan 14, 2025
5ebdb83
Add max tenant config to tenant federation (#6493)
SungJin1212 Jan 15, 2025
963d7bd
Add cleaner logic to clean partition compaction blocks and related fi…
alexqyle Jan 15, 2025
0aa9048
Update RELEASE.md (#6511)
CharlieTLe Jan 16, 2025
51772d4
update thanos version to 236777732278c64ca01c1c09d726f0f712c87164 (#6…
yeya24 Jan 16, 2025
3666e70
Fix race that can cause nil reference when using expanded postings (#…
alanprot Jan 16, 2025
22d231c
Add more op label values to cortex_query_frontend_queries_total metri…
SungJin1212 Jan 18, 2025
c0e64d5
Allow use of non-dualstack endpoints for S3 blocks storage (#6522)
sam-mcbr Jan 19, 2025
a64382d
Expose grpc client connect timeout config and default to 5s (#6523)
yeya24 Jan 20, 2025
12e8808
Hook up partition compaction end to end implementation (#6510)
alexqyle Jan 21, 2025
79adc33
Test for nil on expire expanded postings (#6521)
alanprot Jan 22, 2025
8203621
log when a request starts running in querier (#6525)
afhassan Jan 22, 2025
44032df
Update build image according to https://github.com/cortexproject/cort…
friedrichg Jan 22, 2025
6c5ce8e
Deprecate -blocks-storage.tsdb.wal-compression-enabled flag
SungJin1212 Jan 22, 2025
47af5e5
Fix test (#6537)
danielblando Jan 23, 2025
e3cc297
Mark 1.19 release in progress
CharlieTLe Jan 22, 2025
e9584c0
Prepare 1.19.0-rc.0
CharlieTLe Jan 23, 2025
83ddfb8
Revert "Prepare 1.19.0-rc.0"
CharlieTLe Jan 24, 2025
a9d69a6
Fixed blocksGroupWithPartition unable to reuse functions from blocksG…
alexqyle Jan 24, 2025
1035fa1
Remove TransferChunks gRPC method (#6543)
SungJin1212 Jan 24, 2025
cdc6781
Update Promqlsmith (#6557)
alanprot Jan 28, 2025
550a559
Query Partial Data (#6526)
justinjung04 Jan 28, 2025
1c7157c
Add timeout for dynamodb ring kv (#6544)
yeya24 Jan 28, 2025
4ec7588
Bump the actions-dependencies group across 1 directory with 2 updates…
dependabot[bot] Jan 29, 2025
d97f610
Fix: expanded postings can cache wrong data when queries are issued "…
alanprot Jan 29, 2025
0350068
Extend ShuffleSharding on READONLY ingesters (#6517)
danielblando Jan 30, 2025
71fa878
Create guide doc for partition compaction
alexqyle Jan 15, 2025
117373a
Update docs/guides/partitioning-compactor.md
alexqyle Jan 23, 2025
a7466d1
updated doc
alexqyle Jan 30, 2025
c83f552
Merge commit 'b48f93b12ee3deb5ae251cc52cda55754f109efb' into partitio…
alexqyle Jan 31, 2025
6b8066b
clean white space
alexqyle Feb 4, 2025
50 changes: 50 additions & 0 deletions docs/guides/partitioning-compactor.md
@@ -0,0 +1,50 @@
---
title: "Use Partition Compaction in Cortex"
linkTitle: "Partition Compaction"
weight: 10
slug: partition-compaction
---

## Context

The compactor is bounded by a maximum index file size of 64GB. If compaction fails because the index file size limit is exceeded, partition compaction can be enabled so that the compactor writes multiple blocks, each with an index file that stays within the limit.

## Enable Partition Compaction

To enable partition compaction, the following flags need to be set:

```
-compactor.sharding-enabled=true # Enable sharding tenants across multiple compactor instances. This is required to enable partition compaction
-compactor.sharding-strategy=shuffle-sharding # Use Shuffle Sharding as sharding strategy. This is required to enable partition compaction
-compactor.compaction-strategy=partitioning # Use Partition Compaction as compaction strategy. To turn it off, set it to `default`
```
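The three flags above only take effect together, so a deployment script may want to check them before rollout. The following is a minimal sketch of such a check, not part of Cortex: the helper name `missing_partition_flags` is hypothetical, and it assumes every flag is passed in the `-name=value` form shown above.

```python
# Flags and values taken from the block above.
REQUIRED_FLAGS = {
    "-compactor.sharding-enabled": "true",
    "-compactor.sharding-strategy": "shuffle-sharding",
    "-compactor.compaction-strategy": "partitioning",
}


def missing_partition_flags(args):
    """Return the required flags that are absent or set to another value.

    Hypothetical helper: only the `-name=value` form is recognised.
    """
    given = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if sep:  # skip arguments that carry no "=value" part
            given[name] = value
    return sorted(
        name for name, want in REQUIRED_FLAGS.items() if given.get(name) != want
    )


if __name__ == "__main__":
    compactor_args = [
        "-compactor.sharding-enabled=true",
        "-compactor.compaction-strategy=partitioning",
    ]
    print(missing_partition_flags(compactor_args))
```

An empty result means all three flags are present with the values this guide requires.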

### Migration

No special migration process is needed to enable partition compaction. End users can enable it by setting all of the above configurations at once.

Enabling partition compaction groups previously compacted blocks (only those whose time range is smaller than the largest configured compaction time range) with uncompacted blocks and generates new compaction plans. Blocks that contain duplicated series are thereby grouped together, and those series are deduplicated during compaction.

Disabling partition compaction after it has been enabled does not require migration either. Once it is disabled, the compactor groups the partitioned result blocks together and compacts them into one block.

## Configure Partition Compaction

By default, partition compaction uses the following configurations and values:

```
-compactor.partition-index-size-bytes=68719476736 # 64GB
-compactor.partition-series-count=0 # no limit
```

With the default values, result blocks are partitioned when the sum of the index file sizes of the parent blocks exceeds 64GB. End users can change both configurations. Partition compaction always calculates a partition count from each configuration and uses the higher of the two.

Both configurations can be set per tenant.

Note: `compactor.partition-series-count` is compared against the sum of the series counts of all parent blocks. If the parent blocks were not deduplicated, the result block could have fewer series than the configured value.
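
The selection rule described in this section can be illustrated with a small calculation. This is a simplified sketch, not the compactor's code: the function name `partition_count` is hypothetical, a limit of `0` is treated as "no limit", and the count for each limit is assumed to be a plain ceiling division. The actual implementation may round the count differently.

```python
DEFAULT_INDEX_LIMIT_BYTES = 68719476736  # -compactor.partition-index-size-bytes
DEFAULT_SERIES_LIMIT = 0                 # -compactor.partition-series-count (no limit)


def _partitions_needed(total, limit):
    """Partitions needed so that total / partitions stays within limit."""
    if limit <= 0 or total <= limit:
        return 1  # no limit configured, or the parents already fit
    return -(-total // limit)  # integer ceiling division


def partition_count(total_index_bytes, total_series,
                    index_limit_bytes=DEFAULT_INDEX_LIMIT_BYTES,
                    series_limit=DEFAULT_SERIES_LIMIT):
    """Pick the higher of the two per-limit partition counts.

    Both totals are sums over all parent blocks, so duplicated series
    in the parents are counted more than once.
    """
    by_index = _partitions_needed(total_index_bytes, index_limit_bytes)
    by_series = _partitions_needed(total_series, series_limit)
    return max(by_index, by_series)


if __name__ == "__main__":
    total_index = 100 * 2**30  # parent index files sum to 100 GiB
    print(partition_count(total_index, 10_000_000))
    print(partition_count(total_index, 10_000_000, series_limit=3_000_000))
```

With the defaults, only the index size limit applies in the first call. In the second call the series limit asks for more partitions than the index limit does, so it determines the result.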

## Useful Metrics

- `cortex_compactor_group_partition_count`: can be used to track how many partitions are being compacted for each time range.
- `cortex_compactor_group_compactions_not_planned_total`: can be used to alert when a compaction could not be planned because of an error.
- `cortex_compact_group_compaction_duration_seconds`: can be used to monitor how long the compactions for each time range take.
- `cortex_compactor_oldest_partition_offset`: can be used to monitor when the oldest compaction that has not yet completed was.