Skip to content

Conversation

Avid29
Copy link
Contributor

@Avid29 Avid29 commented Oct 10, 2025

The DBScan step runs after KMeans to merge clusters that are too similar.

KMeans cannot be refined on its own to not have this issue, since it simply clusters to K points. Determining the right number for K is no better than just merging the results with another clustering method.

@Arlodotexe Arlodotexe requested a review from Copilot October 13, 2025 17:01
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Enhances the ColorPaletteSampler by integrating DBScan clustering to merge similar KMeans results, addressing the limitation where KMeans clusters to K points regardless of color similarity.

  • Added DBScan clustering post-processing to merge KMeans clusters that are too similar
  • Introduced a configurable merge distance parameter for similarity threshold
  • Modified weight calculation to properly handle cluster merging

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
ColorPaletteSampler.cs Added DBScan post-processing step with merge distance parameter and updated weight handling
ColorPaletteSampler.KMeans.cs Changed method visibility from private to internal to support DBScan integration
ColorPaletteSampler.DBScan.cs Implemented complete DBScan clustering algorithm as a ref struct with weighted centroid calculation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

public int[] PointClusterIds { get; }

/// <summary>
/// Gets epsilon squared. Where epslion is the max distance to consider two points connected.
Copy link

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'epslion' to 'epsilon'.

Suggested change
/// Gets epsilon squared. Where epslion is the max distance to consider two points connected.
/// Gets epsilon squared. Where epsilon is the max distance to consider two points connected.

Copilot uses AI. Check for mistakes.

public double Epsilon2 { get; }

/// <summary>
/// Gets the miniumum number of points required to make a core point.
Copy link

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'miniumum' to 'minimum'.

Suggested change
/// Gets the miniumum number of points required to make a core point.
/// Gets the minimum number of points required to make a core point.

Copilot uses AI. Check for mistakes.

return seeds;
}

private DBScan(Span<Vector3> points, Span<float> weights, double epsilon, int minPoints)
Copy link

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The epsilon parameter is declared as double but used in a context where float precision would be sufficient and more consistent with the mergeDistance parameter (0.12f). Consider using float for consistency.

Copilot uses AI. Check for mistakes.

var seeds = new Queue<int>();
for (int i = 0; i < Points.Length; i++)
{
if (Vector3.DistanceSquared(origin, Points[i]) <= Epsilon2)
Copy link

Copilot AI Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates an O(n²) algorithm for seed finding. For large datasets, consider using spatial data structures like KD-trees or grid-based approaches to improve performance.

Copilot uses AI. Check for mistakes.

Copy link
Contributor Author

@Avid29 Avid29 Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is run after KMeans, there are only 8 points. Building a KD-Tree is far more effort than it's worth, and may introduce overhead with a net loss

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant