@ianton-ru ianton-ru commented Apr 2, 2025

Improve object storage cache locality (#708) with rendezvous hashing.

https://en.wikipedia.org/wiki/Rendezvous_hashing

The main change is in the StorageObjectStorageStableTaskDistributor::getReplicaForFile method.
With the original code, the distribution is not stable when a host at the beginning of the cluster node list is gone, or when the cluster changes the node order for some reason.
With rendezvous hashing, the best node is chosen based on the node address (host:port) instead of the node number. It is a little heavier, though: the hash has to be calculated N times, once for each node.
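
A minimal sketch of the idea, not the code from this PR: every (node address, file path) pair is hashed, and the node with the largest weight wins. `std::hash` stands in for `sipHash64` only to keep the example self-contained, and the names `weightOf` and `pickNode` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Assumption: std::hash replaces the sipHash64 call used in the PR.
static std::uint64_t weightOf(const std::string & node, const std::string & file_path)
{
    return std::hash<std::string>{}(node + file_path);
}

// Rendezvous (highest-random-weight) hashing: compute one weight per node
// and return the index of the node with the largest weight.
std::size_t pickNode(const std::vector<std::string> & nodes, const std::string & file_path)
{
    assert(!nodes.empty()); // precondition of this sketch

    std::size_t best_id = 0;
    std::uint64_t best_weight = weightOf(nodes[0], file_path);
    for (std::size_t id = 1; id < nodes.size(); ++id)
    {
        const std::uint64_t weight = weightOf(nodes[id], file_path);
        if (weight > best_weight)
        {
            best_weight = weight;
            best_id = id;
        }
    }
    return best_id;
}
```

Because the weight depends on the address and not on the position in the list, removing a node only moves the files that were assigned to that node; every other file keeps its node.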

@altinity-robot
Collaborator

altinity-robot commented Apr 2, 2025

This is an automated comment for commit 6092c08 with description of existing statuses. It's updated for the latest CI running

| Check name | Description | Status |
|---|---|---|
| Builds | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ⏳ pending |
| Integration tests | The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests | ❌ failure |
| Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc | ❌ failure |
| Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | ❌ failure |

Successful checks

| Check name | Description | Status |
|---|---|---|
| Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc | ✅ success |

@ianton-ru ianton-ru changed the title from "Feature/rendezvous hashing filesystem cache" to "Rendezvous hashing filesystem cache" Apr 3, 2025
Collaborator

@arthurpassos arthurpassos left a comment

The Rendezvous part is mostly ok, just some small questions.

I am just a bit concerned because ClickHouse#77326 hasn't been merged yet and I believe some commits are missing. Do you plan to cherry-pick the most recent ones as well?

std::vector<std::string> ids_of_hosts;
for (const auto & shard : cluster->getShardsInfo())
{
if (shard.per_replica_pools.size() < 1)
Collaborator

per_replica_pools.empty()


size_t StorageObjectStorageStableTaskDistributor::getReplicaForFile(const String & file_path)
{
if (!ids_of_nodes.has_value())
Collaborator

In which case would it not have a value? And why is empty != nullopt?

Author

It must always have a value: ReadFromCluster::createExtension always calls it with ids_of_nodes.

As for nullopt, the following checks are all the same:

if (var)
if (var.has_value())
if (var != std::nullopt)

The first and second are synonyms: https://en.cppreference.com/w/cpp/utility/optional/operator_bool
The third is equivalent because the comparison with nullopt calls operator bool:

/usr/include/c++/11$ grep -B 5 -A 5 'operator==' optional
...
  // Comparisons with nullopt.
  template<typename _Tp>
    constexpr bool
    operator==(const optional<_Tp>& __lhs, nullopt_t) noexcept
    { return !__lhs; }
...
  template<typename _Tp>
    constexpr bool
    operator==(nullopt_t, const optional<_Tp>& __rhs) noexcept
    { return !__rhs; }
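
The equivalence can also be checked directly. This is a small sketch with a hypothetical helper, `isEngaged`, which evaluates all three spellings on the same optional and asserts that they agree:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

using NodeIds = std::optional<std::vector<std::string>>;

// Hypothetical helper: evaluates the three checks from the comment above,
// asserts that they give the same answer, and returns that answer.
bool isEngaged(const NodeIds & var)
{
    const bool via_bool = static_cast<bool>(var);   // if (var)
    const bool via_has_value = var.has_value();     // if (var.has_value())
    const bool via_nullopt = (var != std::nullopt); // if (var != std::nullopt)
    assert(via_bool == via_has_value && via_has_value == via_nullopt);
    return via_bool;
}
```

Note that an optional holding an empty vector still counts as having a value; only `std::nullopt` does not.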

Collaborator

Sorry, I didn't explain my question correctly. I was asking why not just have a vector and check whether it is empty, instead of an optional of a vector.

Collaborator

What I mean is that instead of doing:

std::optional<std::vector<std::string>> ids_of_nodes;
...
    if (!ids_of_nodes.has_value())

you simply do:

std::vector<std::string> ids_of_nodes;
...
    if (ids_of_nodes.empty())

I mean, it is ok to keep it as optional, just thought it could be simpler
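
For comparison, here is a sketch of the one thing the optional can express that a plain vector cannot: the difference between "no list was provided" and "a list was provided, but it is empty". Whether that distinction matters here is a design choice; the enum and function names are hypothetical.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical names, used only for this illustration.
enum class NodeListState
{
    NotProvided,      // optional is disengaged (std::nullopt)
    ProvidedButEmpty, // optional holds a vector with no elements
    HasNodes          // optional holds at least one node id
};

NodeListState classify(const std::optional<std::vector<std::string>> & ids_of_nodes)
{
    if (!ids_of_nodes.has_value())
        return NodeListState::NotProvided;
    if (ids_of_nodes->empty())
        return NodeListState::ProvidedButEmpty;
    return NodeListState::HasNodes;
}
```

With a plain `std::vector`, the first two states collapse into a single `empty()` check.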

return 0;

/// Trivial case
if (ids_of_nodes.value().size() < 2)
Collaborator

Nitpick: extract ids_of_nodes into a non-optional to avoid dereferencing all the time

Author

This is optional in IStorageCluster because nothing except StorageObjectStorageCluster uses it.
Making it non-optional would require a copy of the list inside the class, or a smart pointer, and then we would need to check whether the pointer has a value.
I removed some of the dereferences.

Collaborator

can't you just do the below inside this method?

auto ids_of_nodes = ids_of_nodes_optional.value_or_die()

and use ids_of_nodes instead of repeating ids_of_nodes.value()?
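
A sketch of that suggestion using only standard calls: `std::optional` has no `value_or_die()` member, but `value()` throws `std::bad_optional_access` when the optional is empty, and binding the result to a const reference avoids copying the list. The `Distributor` struct and `nodeCount` method are hypothetical stand-ins, not the class from this PR.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the distributor; only the member name mirrors the thread.
struct Distributor
{
    std::optional<std::vector<std::string>> ids_of_nodes_optional;

    std::size_t nodeCount() const
    {
        // value() throws std::bad_optional_access if no list was set.
        // The const reference refers to the stored vector, so nothing is copied.
        const auto & ids_of_nodes = ids_of_nodes_optional.value();
        return ids_of_nodes.size();
    }
};
```

After the single `value()` call, the rest of the method can use `ids_of_nodes` as if it were a plain vector.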

Collaborator

Ah, you updated the PR with that here: 46fe767#diff-68aa420a604c13765878c1fb39270f50ae0757b9b2f1b6609743632d2c7d0770R44.

that's ok then

/// Rendezvous hashing
size_t best_id = 0;
UInt64 best_weight = sipHash64(ids_of_nodes.value()[0] + file_path);
for (size_t id = ids_of_nodes.value().size() - 1; id > 0; --id)
Collaborator

Why reverse?

Author

Changed to direct order to avoid confusion.
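
As a side note, the scan direction only affects the result when two nodes get exactly the same weight, which should be rare with a 64-bit hash. A sketch over precomputed weights, with hypothetical function names, where the reverse variant mirrors the loop shape from the diff:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Forward scan: start from index 0 and move up, keeping the first maximum.
std::size_t argmaxForward(const std::vector<std::uint64_t> & weights)
{
    assert(!weights.empty());
    std::size_t best_id = 0;
    for (std::size_t id = 1; id < weights.size(); ++id)
        if (weights[id] > weights[best_id])
            best_id = id;
    return best_id;
}

// Reverse scan: start from index 0, then visit size-1 down to 1.
std::size_t argmaxReverse(const std::vector<std::uint64_t> & weights)
{
    assert(!weights.empty());
    std::size_t best_id = 0;
    for (std::size_t id = weights.size() - 1; id > 0; --id)
        if (weights[id] > weights[best_id])
            best_id = id;
    return best_id;
}
```

For distinct weights such as `{5, 9, 2, 7}` both functions return index 1. For `{1, 4, 4}` the forward scan returns 1 and the reverse scan returns 2, so the two orders differ only in how ties are broken.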

arthurpassos previously approved these changes Apr 6, 2025
Collaborator

@arthurpassos arthurpassos left a comment

LGTM

It might be a good idea to add a setting to turn this behavior on/off, just to be on the safe side?

But it looks ok as it is.

@ianton-ru ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 6092c08 to 763c860 April 7, 2025 09:18
@ianton-ru ianton-ru changed the base branch from antalya to antalya-25.2 April 7, 2025 09:19
@Enmk Enmk changed the base branch from antalya-25.2 to antalya April 7, 2025 11:51
@MyroTk MyroTk added the antalya-25.2.2 Planned for 25.2.2 release label Apr 7, 2025
@ianton-ru ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 95a407b to 1c76962 April 8, 2025 12:32
@ianton-ru ianton-ru force-pushed the feature/rendezvous-hashing-filesystem-cache branch from 1c76962 to d6198a4 April 8, 2025 12:48
@Enmk Enmk merged commit 21f3cbd into antalya Apr 8, 2025
220 of 316 checks passed
ianton-ru pushed a commit that referenced this pull request May 23, 2025
…system-cache

Rendezvous hashing filesystem cache
Enmk added a commit that referenced this pull request May 29, 2025
…_hashing

25.3 Antalya port of #709, #760 - Rendezvous hashing
ianton-ru pushed a commit that referenced this pull request Jul 17, 2025
…_hashing

25.3 Antalya port of #709, #760 - Rendezvous hashing
ianton-ru pushed a commit that referenced this pull request Aug 6, 2025
…_hashing

25.3 Antalya port of #709, #760 - Rendezvous hashing
Enmk added a commit that referenced this pull request Sep 9, 2025
…us_hashing

25.6.5 Antalya port of #709, #760, #866 - Rendezvous hashing
ianton-ru pushed a commit that referenced this pull request Oct 2, 2025
…us_hashing

25.6.5 Antalya port of #709, #760, #866 - Rendezvous hashing
7 participants