Implementing KMedoids in scikit-learn-extra #12
Conversation
Hmm. I could've sworn that I'd written tests to cover …

This is my first time working with a code coverage tool. I thought that my most recent commit should've fixed the coverage issues involving the …
Thanks @zdog234 ! I'll try to review this in detail soon.
Could you also please add the section from the user guide and the example from the original PR?
`sklearn_extra/cluster/k_medoids_.py` (outdated)

```python
        return medoids

    def _kpp_init(self, D, n_clusters, random_state_, n_local_trials=None):
```
It's strange indeed that this function is reported as not being run in coverage while it is explicitly exercised in `test_kmedoids_pp`. I'll try to have a closer look soon.
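For context, here is a minimal pure-Python sketch of the k-means++-style seeding that `_kpp_init` performs on a precomputed distance matrix (a toy illustration, not the library code; `kpp_seed` and the example points are made up): each new medoid is drawn with probability proportional to its squared distance to the nearest medoid chosen so far.

```python
import random

def kpp_seed(D, n_clusters, seed=0):
    """Toy k-means++-style seeding on a precomputed distance matrix D.

    Each new medoid index is sampled with probability proportional to
    the squared distance to the closest already-chosen medoid, so points
    far from existing medoids are favoured.
    """
    rng = random.Random(seed)
    n = len(D)
    medoids = [rng.randrange(n)]
    while len(medoids) < n_clusters:
        weights = [min(D[i][m] for m in medoids) ** 2 for i in range(n)]
        # Already-chosen medoids have weight 0 and are never re-drawn.
        medoids.append(rng.choices(range(n), weights=weights, k=1)[0])
    return medoids

# Two well-separated groups of points on a line: {0, 1} and {10, 11}.
pts = [0.0, 1.0, 10.0, 11.0]
D = [[abs(a - b) for b in pts] for a in pts]
seeds = kpp_seed(D, 2)
```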
I tried to add some of the documentation, but I don't understand how to add MathJax/mathjs, so all of the math is broken when I try to build. Has a decision been made for this project as to which one to use? And how do I go about adding it? Is there something I can just …

EDIT: I added the documentation; I just don't know if the math will build properly. This is my first time working with …
See scikit-learn's `doc/conf.py`, where `sphinx.ext.mathjax` is included and `mathjax_path` is specified.
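A minimal sketch of the relevant `conf.py` settings, assuming a setup like the one rth points to (the extension list and CDN URL here are illustrative assumptions, not copied from scikit-learn's actual file):

```python
# doc/conf.py (sketch) -- enable MathJax rendering for :math: roles
# and math directives in the Sphinx docs.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.mathjax",  # renders LaTeX math in HTML output
]

# Serve MathJax from a CDN instead of bundling it with the built docs.
# This URL is an example; pick whichever CDN/version the project prefers.
mathjax_path = (
    "https://cdn.jsdelivr.net/npm/mathjax@2/MathJax.js?config=TeX-AMS_HTML"
)
```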
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. A few minor comments:

- could you please add the `KMedoids` estimator to https://github.com/scikit-learn-contrib/scikit-learn-extra/blob/master/sklearn_extra/tests/test_common.py, to the list of estimators on which `check_estimator` is run
- please also add the example from the original PR under `examples/`
- if you could rename the file `k_medoid_.py` to `_k_medoids.py`, so that it becomes a private export when not used as `from sklearn_extra.cluster import KMedoids`, that would be great.
The docs are rendered here: https://25-173284824-gh.circle-artifacts.com/0/doc/user_guide.html. No need to worry about the formatting too much; we can fix that later.
Testing this code fails when the data has more than two dimensions.

Result: …
This behaviour is expected. KMedoids represents each cluster by the training sample that is closest to its middle, not by an arbitrary coordinate in its middle.
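To make that concrete, a toy sketch (not the library code; `medoid` and the sample points are made up for illustration): the medoid is the cluster member minimizing the summed distance to all other members, so it is always an actual data point, unlike a KMeans centroid.

```python
import math

def medoid(points):
    """Return the member of `points` with the smallest summed distance
    to all other members -- the sample that represents the cluster."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
center = medoid(cluster)
# The KMeans-style centroid, for comparison: an arbitrary coordinate.
mean = tuple(sum(c) / len(cluster) for c in zip(*cluster))
```

Here `center` is one of the four input points, while `mean` is `(1.5, 1.5)`, a coordinate that appears nowhere in the data.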
OK, got it.
Just got back from a vacation where I intentionally didn't bring my computer :) I'll try to get to @rth's comments this week/weekend.
@zdog234 Great!
@zdog234 Because I want to know the mapping between the centers and the labels.
It doesn't work on large data; the labels all print as '0': …
Thanks @rth. When you get a chance, could you let me know if the changes I've made address your concerns? Best 😊
A few more comments @zdog234, otherwise (after a light review) LGTM.

We adopted black for code style recently. Please run `black sklearn_extra/ examples/` to fix the linter CI.
I would rather we merged this and opened follow up issues than keep this PR open until everything is perfect.
Maybe @jeremiedbb who worked on KMeans lately would also have some comments.
Later it would be nice to add an example on some dataset where KMedoids does better than the existing scikit-learn clustering algorithms, as discussed in scikit-learn/scikit-learn#11099 (comment).
"than the number of samples %d." | ||
% (self.n_clusters, X.shape[0])) | ||
|
||
D = pairwise_distances(X, metric=self.metric) |
So the scaling is `O(N**2)` because of this distance calculation, right? For the medoid assignment, wouldn't it be more efficient to re-compute the nearest neighbours between the medoids and the samples at each iteration? That would only be `O(n_clusters*N)` at each iteration, and assuming `n_clusters*max_iter < n_samples` it should still be faster?

I suppose because there are typically few clusters, constructing a `BallTree` instead of brute-force nearest neighbours is not worth it?

Not asking to make this change now, just wondering if we should open an issue about this once it's merged.
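The trade-off above can be sketched numerically (hypothetical sizes and medoid indices, purely for illustration): a full pairwise matrix costs `N*N` distance evaluations once, while recomputing only medoid-to-sample distances costs `k*N` per iteration.

```python
import math

# Hypothetical problem sizes for the back-of-the-envelope comparison.
N, k, max_iter = 200, 3, 10
samples = [(float(i % 17), float(i % 7)) for i in range(N)]
medoids = [0, 70, 140]  # indices of the current (hypothetical) medoids

# Strategy 1: precompute the full pairwise matrix once.
full_evals = N * N

# Strategy 2: per iteration, compute only medoid-to-sample distances
# and assign each sample to its nearest medoid.
labels = [
    min(range(k), key=lambda j: math.dist(s, samples[medoids[j]]))
    for s in samples
]
per_iter_evals = k * N
```

With these numbers, strategy 2 costs `k * N * max_iter = 6000` evaluations over all iterations versus `40000` for the full matrix, matching the `n_clusters*max_iter < n_samples` condition in the comment.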
In the above comment I forgot that the distance matrix is also used in `_update_medoid_idxs_in_place`: there, recomputing distances would be `O(n_clusters*(n_samples/n_clusters)**2)`, so it would indeed probably be slower, but it might still be interesting since it wouldn't require storing the full distance matrix.

Anyway, we might want to replace `pairwise_distances` with `pairwise_distances_chunked` to reduce memory usage in the current implementation.
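A quick sketch of the chunked approach (toy data; the row-sum reduction is just an example of reducing each chunk as it arrives): `pairwise_distances_chunked` yields row-slices of the distance matrix as a generator, so the full `n_samples x n_samples` array never has to exist in memory at once.

```python
import numpy as np
from sklearn.metrics import pairwise_distances_chunked

X = np.random.RandomState(0).rand(500, 3)

# Reduce each chunk of rows as soon as it is produced (here: per-row
# distance sums), instead of materialising the whole 500 x 500 matrix.
row_sums = np.concatenate(
    [chunk.sum(axis=1) for chunk in pairwise_distances_chunked(X)]
)
```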
That's a really interesting idea. I can try to take a look at `pairwise_distances_chunked` for a future PR 🙂
```python
        Parameters
        ----------
        distances : {array-like, sparse matrix}, shape=(n_samples, n_clusters)
```
I guess this is always a dense matrix?
I'm not sure if there are tests for this functionality, but there is a line that makes me think that sparse arrays are accepted (`X = check_array(X, accept_sparse=["csr", "csc"])`).

mentioning `k_means_._k_init` copypasta
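A small sketch of why that line suggests sparse support (toy matrix, for illustration): `check_array` with `accept_sparse` validates CSR/CSC input and passes it through without densifying it.

```python
import numpy as np
from scipy import sparse
from sklearn.utils import check_array

X = sparse.csr_matrix(np.array([[0.0, 1.0], [2.0, 0.0]]))

# With accept_sparse listing "csr"/"csc", the input is validated
# but stays sparse rather than being converted to a dense array.
X_checked = check_array(X, accept_sparse=["csr", "csc"])
```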
@zdog234 yes, it runs successfully when I set 'k-medoids++' and the 'manhattan' distance, but the order of the centers is not the same as in my input data. I wish the output order matched my input order. Original input centers: …

Output centers with 'manhattan': …

Output centers with 'euclidean': …
Thanks @zdog234 ! LGTM.
This PR has been open since April and I don't think there is a point in waiting for further feedback. Merging. We should rather open follow-up issues for remaining problems or improvements.
@zdog234 Would you mind opening a separate issue with a reproducible example and the expected/obtained result? Thanks!
BTW, I added an empty …
Seeing this finally merged is very comforting!! Thanks everyone for your work.
Based on the recommendation of a few people, I'm porting the KMedoids implementation from scikit-learn/scikit-learn#11099.
I think I'm missing the documentation atm, but I'll have to take a look at that later (I don't really have any experience with reStructuredText).