Implementing KMedoids in scikit-learn-extra #12
Merged
Changes from 19 commits
Commits
26 commits
cd19b57 Added kmedoids code (znd4)
3e18444 changed k_medoids_ imports to absolute (znd4)
936919d Merge branch 'master' of https://github.com/scikit-learn-contrib/scik… (znd4)
d4c086c Added .vscode to .gitignore (znd4)
bacc931 Add venv to .gitignore (znd4)
0cb8e43 Added cluster tests (znd4)
96f3a2e Fix KMedoids docstring (znd4)
8d9d9d6 Reconfigure _kpp_init tests (znd4)
8e534e8 added documentation (znd4)
4d61529 Rename k_medoids_.py -> _k_medoids.py (znd4)
03f9e54 Update conf.py to include mathjax (znd4)
2e95287 Add KMedoids to test_common.py (znd4)
0e1ee5b add plot_kmedoids_digits.py (znd4)
ee1688b Add Examples line to KMedoids docstring (znd4)
e96e2b0 Remove duplicate examples section in _k_medoids.py docstring (znd4)
07f6e3c ACTUALLY remove duplicate examples section (znd4)
9910804 Add sphinx gallery of plot_kmedoids_digits.py (znd4)
0c8d032 Added k-medoids++ to help message (znd4)
0368daa Merge branch 'master' into kmedoids (znd4)
3d71001 Run `black` on code (znd4)
182d505 Remove commented out math code (znd4)
88d9630 Remove unnecessary plot_kmedoids_digits.py (znd4)
9405d98 Remove `x_squared_norms` from _kpp_init (copied over from kmeans) (znd4)
0989f88 Add comment for _kpp_init (znd4)
d76d6b8 update n_samples -> n_query, where appropriate (znd4)
c060b0e Add sklearn_extra/cluster/tests/__init__.py (rth)
@@ -0,0 +1,97 @@
# -*- coding: utf-8 -*-
"""
=============================================================
A demo of K-Medoids clustering on the handwritten digits data
=============================================================
In this example we compare different pairwise distance
metrics for K-Medoids.
"""
import numpy as np
import matplotlib.pyplot as plt

from collections import namedtuple
from sklearn.cluster import KMeans
from sklearn_extra.cluster import KMedoids
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale

print(__doc__)

# Authors: Timo Erkkilä <[email protected]>
#          Antti Lehmussola <[email protected]>
#          Kornel Kiełczewski <[email protected]>
# License: BSD 3 clause

np.random.seed(42)

digits = load_digits()
data = scale(digits.data)
n_digits = len(np.unique(digits.target))

reduced_data = PCA(n_components=2).fit_transform(data)

# Step size of the mesh. Decrease to increase the quality of the VQ.
h = .02

# Plot the decision boundary by assigning a color to each mesh point.
x_min, x_max = reduced_data[:, 0].min() - 1, reduced_data[:, 0].max() + 1
y_min, y_max = reduced_data[:, 1].min() - 1, reduced_data[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

plt.figure()
plt.clf()

plt.suptitle("Comparing multiple K-Medoids metrics to K-Means and each other",
             fontsize=14)

Algorithm = namedtuple('ClusterAlgorithm', ['model', 'description'])

selected_models = [
    Algorithm(KMedoids(metric='manhattan',
                       n_clusters=n_digits),
              'KMedoids (manhattan)'),
    Algorithm(KMedoids(metric='euclidean',
                       n_clusters=n_digits),
              'KMedoids (euclidean)'),
    Algorithm(KMedoids(metric='cosine',
                       n_clusters=n_digits),
              'KMedoids (cosine)'),
    Algorithm(KMeans(n_clusters=n_digits),
              'KMeans')
]

plot_rows = int(np.ceil(len(selected_models) / 2.0))
plot_cols = 2

for i, (model, description) in enumerate(selected_models):

    # Fit the model, then obtain labels for each point in the mesh.
    model.fit(reduced_data)
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.subplot(plot_rows, plot_cols, i + 1)
    plt.imshow(Z, interpolation='nearest',
               extent=(xx.min(), xx.max(), yy.min(), yy.max()),
               cmap=plt.cm.Paired,
               aspect='auto', origin='lower')

    plt.plot(reduced_data[:, 0],
             reduced_data[:, 1],
             'k.', markersize=2,
             alpha=0.3,
             )
    # Plot the cluster centers (medoids or centroids) as a white X
    centroids = model.cluster_centers_
    plt.scatter(centroids[:, 0], centroids[:, 1],
                marker='x', s=169, linewidths=3,
                color='w', zorder=10)
    plt.title(description)
    plt.xlim(x_min, x_max)
    plt.ylim(y_min, y_max)
    plt.xticks(())
    plt.yticks(())

plt.show()
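For readers unfamiliar with what the example above compares, here is a minimal pure-Python sketch of the alternating k-medoids idea with a pluggable distance function. It is not the implementation in `_k_medoids.py` from this pull request: the names `k_medoids_sketch`, `manhattan`, `euclidean`, and `cosine` are hypothetical, and it starts from the first `k` points rather than the k-medoids++ initialization mentioned in the commits.

```python
import math


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    # Cosine distance; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def k_medoids_sketch(points, k, metric, max_iter=100):
    """Illustrative alternating k-medoids; returns (medoid indices, labels)."""

    def assign(medoids):
        # Each point joins the cluster of its nearest medoid.
        return [
            min(range(k), key=lambda c: metric(p, points[medoids[c]]))
            for p in points
        ]

    # Assumption: naive initialization with the first k points.
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if the cluster ended up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest summed distance
            # to every other member of its cluster.
            best = min(
                members,
                key=lambda i: sum(metric(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids_sketch(points, k=2, metric=euclidean)
print(medoids, labels)
```

Unlike K-Means, the cluster centers here are always actual data points, which is why any pairwise distance can be plugged in. Like other alternating schemes, this sketch can stop in a local optimum that depends on the starting medoids.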
@@ -0,0 +1,5 @@
from ._k_medoids import KMedoids

__all__ = [
    'KMedoids',
]