Rendezvous hashing filesystem cache #709

ianton-ru · 2025-04-02T12:31:10Z

Improves object storage cache locality (#708) with rendezvous hashing.

https://en.wikipedia.org/wiki/Rendezvous_hashing

The main change is in the StorageObjectStorageStableTaskDistributor::getReplicaForFile method.
With the original code, the distribution is not stable when a host at the beginning of the cluster node list is gone, or when the cluster changes its node order for some reason.
With rendezvous hashing, the best node is chosen based on the node address (host:port) instead of the node number. It is a little heavier, though: the hash has to be calculated N times, once for each node.
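For readers unfamiliar with the technique, the selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `weightOf` and `pickReplica` are hypothetical names, `std::hash` stands in for the `sipHash64` call quoted later in the review, and the address-based tie-break is an addition of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for sipHash64; std::hash is used only to keep
// the sketch self-contained.
inline std::uint64_t weightOf(const std::string & node_id, const std::string & file_path)
{
    return std::hash<std::string>{}(node_id + file_path);
}

// Returns the index of the node with the highest weight for file_path.
// One hash is computed per node, so the cost is O(N) per file.
// Ties are broken by the smaller address so that the result does not depend
// on the order of the list (an addition of this sketch, not taken from the PR).
inline std::size_t pickReplica(const std::vector<std::string> & ids_of_nodes, const std::string & file_path)
{
    if (ids_of_nodes.size() < 2)
        return 0; // trivial case: zero or one node

    std::size_t best_id = 0;
    std::uint64_t best_weight = weightOf(ids_of_nodes[0], file_path);
    for (std::size_t id = 1; id < ids_of_nodes.size(); ++id)
    {
        const std::uint64_t weight = weightOf(ids_of_nodes[id], file_path);
        if (weight > best_weight
            || (weight == best_weight && ids_of_nodes[id] < ids_of_nodes[best_id]))
        {
            best_id = id;
            best_weight = weight;
        }
    }
    return best_id;
}
```

Because each weight depends only on the node address and the file path, removing a node that was not the winner for a file, or reordering the list, leaves that file's winner unchanged. An index-based scheme does not have this property.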

altinity-robot · 2025-04-02T13:05:44Z

This is an automated comment for commit 6092c08 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check name	Description	Status
Builds	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	⏳ pending
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	❌ failure
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	❌ failure

Successful checks

Check name	Description	Status
Stateful tests	Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success

arthurpassos

The Rendezvous part is mostly ok, just some small questions.

I am just a bit concerned because ClickHouse#77326 hasn't been merged yet and I believe some commits are missing. Do you plan to cherry-pick the most recent ones as well?

arthurpassos · 2025-04-06T13:17:03Z

src/Storages/IStorageCluster.cpp

+    std::vector<std::string> ids_of_hosts;
+    for (const auto & shard : cluster->getShardsInfo())
+    {
+        if (shard.per_replica_pools.size() < 1)


per_replica_pools.empty()

arthurpassos · 2025-04-06T13:46:33Z

src/Storages/ObjectStorage/StorageObjectStorageStableTaskDistributor.cpp

+
+size_t StorageObjectStorageStableTaskDistributor::getReplicaForFile(const String & file_path)
+{
+    if (!ids_of_nodes.has_value())


In which case would it not have a value? And why is empty != nullopt?

It must always have a value; in ReadFromCluster::createExtension it is always called with ids_of_nodes.

About nullopt: these are all the same:

if (var)
if (var.has_value())
if (var != std::nullopt)

The first and second are synonyms: https://en.cppreference.com/w/cpp/utility/optional/operator_bool
The third is equivalent because comparison with nullopt calls the bool operator:

/usr/include/c++/11$ grep -B 5 -A 5 'operator==' optional
...
  // Comparisons with nullopt.
  template<typename _Tp>
    constexpr bool
    operator==(const optional<_Tp>& __lhs, nullopt_t) noexcept
    { return !__lhs; }
...
  template<typename _Tp>
    constexpr bool
    operator==(nullopt_t, const optional<_Tp>& __rhs) noexcept
    { return !__rhs; }
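The equivalence of the three checks, and the difference between an absent optional and an engaged optional holding an empty vector, can be illustrated with a small sketch. The alias `NodeIds` and the three helper names are hypothetical and exist only for this example.

```cpp
#include <optional>
#include <string>
#include <vector>

using NodeIds = std::optional<std::vector<std::string>>;

// The three spellings discussed above; each returns true when the optional holds a value.
inline bool viaBool(const NodeIds & ids) { return static_cast<bool>(ids); }
inline bool viaHasValue(const NodeIds & ids) { return ids.has_value(); }
inline bool viaNullopt(const NodeIds & ids) { return ids != std::nullopt; }
```

All three return false for a default-constructed `NodeIds` and true for a `NodeIds` that holds an empty vector, so "has a value" and "is non-empty" are separate questions.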

Sorry, I didn't explain my question correctly. I was asking why not just have a vector and check whether it is empty, instead of an optional of a vector.

What I mean is that instead of doing:

std::optional<std::vector<std::string>> ids_of_nodes;
...
if (!ids_of_nodes.has_value())

you simply do:

std::vector<std::string> ids_of_nodes;
...
if (ids_of_nodes.empty())

I mean, it is ok to keep it as optional, just thought it could be simpler

arthurpassos · 2025-04-06T13:47:30Z

src/Storages/ObjectStorage/StorageObjectStorageStableTaskDistributor.cpp

+        return 0;
+
+    /// Trivial case
+    if (ids_of_nodes.value().size() < 2)


Nitpick: extract ids_of_nodes into a non-optional to avoid dereferencing it all the time

It is optional in IStorageCluster because none of the implementations except StorageObjectStorageCluster use it.
Making it non-optional would require keeping a copy of the list inside the class, or using a smart pointer, which would then need a check for whether the pointer holds a value.
Removed some of the dereferences.

can't you just do the below inside this method?

auto ids_of_nodes = ids_of_nodes_optional.value_or_die()

and use ids_of_nodes instead of repeating ids_of_nodes.value()?

Ah, you updated the PR with that here: 46fe767#diff-68aa420a604c13765878c1fb39270f50ae0757b9b2f1b6609743632d2c7d0770R44.

that's ok then
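One way to read the suggestion in this thread, sketched under assumptions: check the optional once, bind its contents to a const reference, and use the reference afterwards. The function `totalIdLength` is a hypothetical helper invented for this illustration and does not claim to reproduce what commit 46fe767 does.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: sums the lengths of all node ids,
// or returns 0 when the list is absent.
inline std::size_t totalIdLength(const std::optional<std::vector<std::string>> & ids_of_nodes_optional)
{
    if (!ids_of_nodes_optional.has_value())
        return 0;

    // Bind once. A const reference avoids copying the vector, which
    // `auto ids = ids_of_nodes_optional.value();` would do.
    const auto & ids_of_nodes = *ids_of_nodes_optional;

    std::size_t total = 0;
    for (const auto & id : ids_of_nodes)
        total += id.size();
    return total;
}
```

Using `operator*` after the `has_value()` check skips the second check that `value()` would perform; either spelling works here.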

arthurpassos · 2025-04-06T13:49:08Z

src/Storages/ObjectStorage/StorageObjectStorageStableTaskDistributor.cpp

+    /// Rendezvous hashing
+    size_t best_id = 0;
+    UInt64 best_weight = sipHash64(ids_of_nodes.value()[0] + file_path);
+    for (size_t id = ids_of_nodes.value().size() - 1; id > 0; --id)


Why reverse?

Changed to forward order to avoid confusion.

arthurpassos

LGTM

It might be a good idea to add a setting to turn this behavior on/off, just to be on the safe side?

But it looks ok as it is.

Enmk mentioned this pull request Apr 2, 2025

Change object storage cluster table functions to prefer specific repl… #708

Closed

ianton-ru changed the title ~~Feature/rendezvous hashing filesystem cache~~ Rendezvous hashing filesystem cache Apr 3, 2025

arthurpassos reviewed Apr 6, 2025

arthurpassos previously approved these changes Apr 6, 2025

adikus and others added 4 commits April 7, 2025 11:18

Change object storage cluster table functions to prefer specific replicas to improve cache locality

b5b7126

Improve object storage cache locality with rendezvous hashing

7d96b64

Properly initialize replica info wherever we use task iterator

0faded9

Fixes after review

763c860

ianton-ru dismissed arthurpassos’s stale review via 763c860 April 7, 2025 09:18

ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 6092c08 to 763c860 Compare April 7, 2025 09:18

ianton-ru changed the base branch from antalya to antalya-25.2 April 7, 2025 09:19

Fix after rebase

cdfdd26

Enmk changed the base branch from antalya-25.2 to antalya April 7, 2025 11:51

MyroTk added the antalya-25.2.2 Planned for 25.2.2 release label Apr 7, 2025

Merge branch 'antalya' into feature/rendezvous-hashing-filesystem-cache

420b584

ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 95a407b to 1c76962 Compare April 8, 2025 12:32

Add test for cache locality

d6198a4

ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 1c76962 to d6198a4 Compare April 8, 2025 12:48

Merge branch 'antalya' into feature/rendezvous-hashing-filesystem-cache

e4b51d1

arthurpassos approved these changes Apr 8, 2025

Enmk merged commit 21f3cbd into antalya Apr 8, 2025
220 of 316 checks passed

vzakaznikov added antalya antalya-25.2 labels May 22, 2025

ianton-ru pushed a commit that referenced this pull request May 23, 2025

Merge pull request #709 from Altinity/feature/rendezvous-hashing-filesystem-cache Rendezvous hashing filesystem cache

76c1123

ianton-ru mentioned this pull request May 23, 2025

25.3 Antalya port of #709, #760 - Rendezvous hashing #797

Merged

svb-alt mentioned this pull request May 23, 2025

Project Antalya Roadmap 2025 - Real-Time Data Lakes #804

Open

36 tasks

Enmk added a commit that referenced this pull request May 29, 2025

Merge pull request #797 from Altinity/feature/antalya-25.3/rendezvous_hashing 25.3 Antalya port of #709, #760 - Rendezvous hashing

fefd2c0

ianton-ru pushed a commit that referenced this pull request Jul 17, 2025

Merge pull request #797 from Altinity/feature/antalya-25.3/rendezvous_hashing 25.3 Antalya port of #709, #760 - Rendezvous hashing

672bc2d

ianton-ru mentioned this pull request Jul 17, 2025

25.6 Antalya port of #709, #760 - Rendezvous hashing #923

Closed

13 tasks

ianton-ru pushed a commit that referenced this pull request Aug 6, 2025

Merge pull request #797 from Altinity/feature/antalya-25.3/rendezvous_hashing 25.3 Antalya port of #709, #760 - Rendezvous hashing

15148db

ianton-ru mentioned this pull request Aug 6, 2025

25.6.5 Antalya port of #709, #760, #866 - Rendezvous hashing #952

Merged

13 tasks

Enmk added a commit that referenced this pull request Sep 9, 2025

Merge pull request #952 from Altinity/feature/antalya-25.6.5/rendezvous_hashing 25.6.5 Antalya port of #709, #760, #866 - Rendezvous hashing

a379172

ianton-ru pushed a commit that referenced this pull request Oct 2, 2025

Merge pull request #952 from Altinity/feature/antalya-25.6.5/rendezvous_hashing 25.6.5 Antalya port of #709, #760, #866 - Rendezvous hashing

0693e71

ianton-ru mentioned this pull request Oct 2, 2025

25.8 Antalya ports, improvements for cluster requests #1059

Open

13 tasks
