schemeshard: fix stats processing op1, part 3 #29813
Pull request overview
This PR fixes disk space aggregation for column tables by refactoring how disk space usage deltas are processed. The key issue addressed is that column table statistics were not being properly aggregated at the subdomain level after a schemeshard reboot.
Key changes:
- Removed unused `diskSpaceUsageDelta` parameter from `UpdateTableStats` methods for column tables
- Changed `AggrDiskSpaceUsage` to treat disk space usage delta as an ordered vector instead of a map, with total space at index 0 and per-pool deltas at subsequent indices
- Added comprehensive test coverage for disk space usage with various table configurations (regular tables, column tables in stores, standalone column tables)
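
The ordered-vector layout described above can be illustrated with a minimal sketch. It is not the actual `AggrDiskSpaceUsage` code: the `DiskSpaceAggregate` type and the `ApplyDiskSpaceDelta` function are hypothetical names introduced only for illustration. The one assumption taken from the overview is that index 0 carries the total and the following indices carry per-pool deltas.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical aggregate: a subdomain-level total plus one counter
// per storage pool kind (not the real schemeshard type).
struct DiskSpaceAggregate {
    std::int64_t total = 0;
    std::vector<std::int64_t> perPool;
};

// Applies an ordered delta to the aggregate.
// Assumed layout: delta[0] is the change in total space,
// delta[i] for i >= 1 is the change for pool kind i - 1.
void ApplyDiskSpaceDelta(DiskSpaceAggregate& agg,
                         const std::vector<std::int64_t>& delta) {
    assert(!delta.empty());  // index 0 (total) must always be present
    agg.total += delta[0];

    // Grow the per-pool counters if this delta mentions more pools.
    const std::size_t pools = delta.size() - 1;
    if (agg.perPool.size() < pools) {
        agg.perPool.resize(pools, 0);
    }
    for (std::size_t i = 1; i < delta.size(); ++i) {
        agg.perPool[i - 1] += delta[i];
    }
}
```

With positional indices, the subdomain-level total and the per-pool values cannot be confused through a key lookup, which is one plausible reading of why the change makes the separation of the two levels more robust.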
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| ydb/core/tx/schemeshard/ut_subdomain/ya.make | Added dependencies for columnshard testing utilities |
| ydb/core/tx/schemeshard/ut_subdomain/ut_subdomain.cpp | Added extensive test coverage for disk space usage tracking across different table types and configurations |
| ydb/core/tx/schemeshard/schemeshard_info_types.h | Removed diskSpaceUsageDelta parameter from UpdateTableStats signature |
| ydb/core/tx/schemeshard/schemeshard_info_types.cpp | Refactored AggrDiskSpaceUsage to treat delta as ordered vector; simplified UpdateTableStats implementation |
| ydb/core/tx/schemeshard/schemeshard__table_stats.cpp | Updated call site to match new UpdateTableStats signature |
| ydb/core/tx/schemeshard/olap/table/table.h | Updated UpdateTableStats signature and implementation for column tables |
cd0003b to 16f719b
- fix disk space aggregation for column tables
- make subdomain and storage pool kind aggregation levels separation more robust
- add tests on subdomain level aggregation
Fix disk space aggregation for column tables. Make subdomain and storage pool kind aggregation levels separation more robust. Add tests on subdomain level aggregation.
Cherry-pick from `main`:
- 626410b, #27165
- fe49740, #28834
- 3950ae8, #29664
- 585a708, #29813

Streamline and optimize datashard statistics processing.

Profile guided optimizations of PersistSingleStats() (in synthetic test). Total gain is around 30%.
- remove unreasonable iteration over entire ShardInfos
- single Now() timestamp for entire stats batch
- optimize number of lookups
- stop building now unnecessary storage pool kind mappings
- remove table/store aggregated stats copying
- collect ExternalBlobsEnabled only on PartitionConfig change
- replace ETxType->CounterId map with absl::flat_hash_map
- remove extra OpType->TxType lookup
- remove call to GetMainTableForIndex for not-index-table shards
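
As a rough illustration of the "single Now() timestamp for entire stats batch" item, the sketch below reads the clock once and reuses the value for every entry. It does not reproduce the schemeshard code: `ShardStats`, `StampBatch`, and `ProcessBatch` are hypothetical names, and `std::chrono::steady_clock` stands in for whatever clock the real code uses.

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

using TimePoint = std::chrono::steady_clock::time_point;

// Hypothetical per-shard stats record (illustration only).
struct ShardStats {
    std::uint64_t shardId = 0;
    TimePoint updatedAt{};
};

// Stamps every entry with the same, already captured timestamp.
void StampBatch(std::vector<ShardStats>& batch, TimePoint now) {
    for (auto& stats : batch) {
        stats.updatedAt = now;
    }
}

// Reads the clock once per batch instead of once per shard.
void ProcessBatch(std::vector<ShardStats>& batch) {
    const TimePoint now = std::chrono::steady_clock::now();
    StampBatch(batch, now);
}
```

Besides saving clock reads, this gives every entry in a batch an identical timestamp, so comparisons within the batch are consistent.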
Fix-up to:
Fix disk space aggregation for column tables.
Make the separation of subdomain and storage pool kind aggregation levels more robust.
Add tests on subdomain level aggregation.