zhyass
Member

@zhyass zhyass commented Feb 7, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR introduces an improved implementation of Hilbert clustering, replacing the original global sorting approach with a more efficient range partition strategy. The new approach significantly reduces computational overhead and improves performance, especially for large-scale datasets.
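To illustrate the idea in the summary, here is a minimal sketch of range partitioning on a Hilbert key. It is not the code from this PR: the function names (`hilbert_index`, `sample_boundaries`, `partition_id`, `range_partition`), the 2-D grid, and the fixed sampling step are all assumptions made for the example. The point it shows is that once rows are routed to key ranges using sampled boundaries, each range can be sorted on its own, and the ranges are already ordered relative to each other, so no single global sort is needed.

```rust
/// Hypothetical sampling step: every 4th key is used to estimate boundaries.
const SAMPLE_STEP: usize = 4;

/// Position of (x, y) along a Hilbert curve covering an n x n grid.
/// `n` must be a power of two and x, y must be < n.
fn hilbert_index(n: u32, mut x: u32, mut y: u32) -> u64 {
    let mut d: u64 = 0;
    let mut s = n / 2;
    while s > 0 {
        let rx = u32::from((x & s) > 0);
        let ry = u32::from((y & s) > 0);
        d += u64::from(s) * u64::from(s) * u64::from((3 * rx) ^ ry);
        // Rotate or flip the quadrant so the next level is oriented correctly.
        if ry == 0 {
            if rx == 1 {
                x = n - 1 - x;
                y = n - 1 - y;
            }
            std::mem::swap(&mut x, &mut y);
        }
        s /= 2;
    }
    d
}

/// Estimate up to `parts - 1` range boundaries from a sorted sample of the keys.
fn sample_boundaries(keys: &[u64], parts: usize) -> Vec<u64> {
    let mut sample: Vec<u64> = keys.iter().copied().step_by(SAMPLE_STEP).collect();
    if sample.is_empty() {
        return Vec::new();
    }
    sample.sort_unstable();
    (1..parts).map(|i| sample[i * sample.len() / parts]).collect()
}

/// Partition id = number of boundaries that are <= key (binary search).
fn partition_id(boundaries: &[u64], key: u64) -> usize {
    boundaries.partition_point(|b| *b <= key)
}

/// Route every point to a key range, then sort each range independently.
/// Concatenating the returned partitions in order yields keys in sorted order.
fn range_partition(
    points: &[(u32, u32)],
    n: u32,
    parts: usize,
) -> Vec<Vec<(u64, (u32, u32))>> {
    let parts = parts.max(1);
    let keys: Vec<u64> = points.iter().map(|&(x, y)| hilbert_index(n, x, y)).collect();
    let boundaries = sample_boundaries(&keys, parts);
    let mut out = vec![Vec::new(); parts];
    for (&key, &point) in keys.iter().zip(points) {
        out[partition_id(&boundaries, key)].push((key, point));
    }
    for part in &mut out {
        // Each partition is small and could be sorted on a separate worker.
        part.sort_unstable_by_key(|&(key, _)| key);
    }
    out
}
```

Because the boundaries come from a sample, partition sizes are only approximately balanced; a skewed sample gives uneven partitions but never an incorrect ordering.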

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-refactor this PR changes the code base without new features or bugfix label Feb 7, 2025
@zhyass zhyass marked this pull request as draft February 7, 2025 14:31
@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Feb 7, 2025
@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Feb 10, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Feb 10, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Feb 10, 2025
@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Feb 10, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Feb 11, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Feb 11, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Feb 11, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Mar 10, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Mar 10, 2025
@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Mar 13, 2025
Contributor

Docker Image for PR

  • tag: pr-17424-a8b9159-1741831967

note: this image tag is only available for internal use,
please check the internal doc for more details.

@zhyass zhyass changed the title from "refactor: [DO NOT MERGE] hilbert clustering" to "refactor: hilbert clustering" Mar 17, 2025
@zhyass zhyass changed the title from "refactor: hilbert clustering" to "refactor: Improved Hilbert Clustering with Range Partition" Mar 17, 2025
@zhyass zhyass marked this pull request as ready for review March 17, 2025 03:04
@dantengsky
Member

dantengsky commented Mar 20, 2025

@zhyass Pushed a commit with some extra code comments, hoping they’re helpful. Please help check if there are any incorrect descriptions and revise them.

@dantengsky dantengsky merged commit 08c4f54 into databendlabs:main Mar 20, 2025
211 of 221 checks passed
loloxwg pushed a commit to loloxwg/databend that referenced this pull request Apr 3, 2025
…abs#17424)

* fix

* fix

* fix

* fix

* fix

* fix

* update

* for test

* for test

* for test

* for test

* fix

* fix

* fix

* remove m_cte

* fix

* fix

* fix

* fix

* fix

* restore m cte

* fix

* fix

* fix

* remove m_cte

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* for test

* fix

* fix

* fix

* fix

* for test

* fix

* fix memory size

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* recover

* fix

* fix

* fix

* fix

* fix

fix

fix

* fix

* fix test

* fix test

* fix test

* fix test

* add hilbert_range_index

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* chore: add some extra code comments

---------

Co-authored-by: dantengsky <[email protected]>

Labels

ci-cloud: Build docker image for cloud test
pr-refactor: this PR changes the code base without new features or bugfix

2 participants