Commit 14683c5

updated benchmarks
1 parent b87204a commit 14683c5

3 files changed: +319 -228 lines changed

docs/src/index.md

Lines changed: 13 additions & 5 deletions
@@ -14,7 +14,14 @@ This package aims to utilize the speed of Julia and parallelization (both CPU &
 
 
 ## K-Means Algorithm Implementation Notes
-Explain main algos and some few lines about the input dimension as well as
+Since Julia is a column-major language, the package expects the input (design matrix) in the following format:
+
+- Design matrix X of size n×m, where the i-th column of X `(X[:, i])` is a single data point in n-dimensional space.
+- Thus, the rows of the design matrix represent the features, and the columns represent the training examples in this feature space.
+
+One of the pitfalls of the K-Means algorithm is that it can converge to a local minimum.
+This implementation inherits this problem, as every implementation does.
+As a result, it is useful in practice to restart it several times and keep the run with the lowest cost.
 
 
 ## Installation
 You can grab the latest stable version of this package by simply running in Julia.
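
Review note: the input layout described in the hunk above can be illustrated with a small sketch. The toy matrix `samples` and all variable names are hypothetical and exist only for illustration; only `X[:, i]` and the n×m convention come from the diff. It assumes the data starts out with one observation per row, as tabular sources usually provide it.

```julia
# Hypothetical toy data: 5 observations (rows) with 3 features (columns).
samples = [ 1.0  2.0  3.0;
            4.0  5.0  6.0;
            7.0  8.0  9.0;
           10.0 11.0 12.0;
           13.0 14.0 15.0]

# Transpose into the n×m layout: n = 3 features (rows), m = 5 data points
# (columns). `permutedims` returns a plain Matrix rather than a lazy wrapper.
X = permutedims(samples)

n_features, n_points = size(X)

# The i-th column is a single data point in n-dimensional space.
second_point = X[:, 2]

# Julia stores arrays column by column, so the coordinates of one data point
# are contiguous in memory in this layout. `view` avoids copying the column.
for i in 1:n_points
    point = view(X, :, i)
    @assert length(point) == n_features
end
```

The usage example later in this diff reaches the same layout with `collect(Matrix(iris[:, 1:4])')`; `permutedims` is used here only because it yields a materialized matrix in one call.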
@@ -51,6 +58,8 @@ K-Means"](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf).
 - [ ] GPU support.
 - [ ] Even faster Kmeans implementation based on current literature.
 - [ ] Optimization of code base.
+- [ ] Improved documentation.
+- [ ] More benchmark tests beyond `Scikit-Learn` and `Clustering.jl`.
 
 
 ## How To Use
@@ -80,14 +89,13 @@ iris = dataset("datasets", "iris");
 # features to use for clustering
 features = collect(Matrix(iris[:, 1:4])');
 
+# various artifacts can be accessed from the result, such as the assigned labels and the cost value
 result = kmeans(features, 3);
 
 # plot with the point color mapped to the assigned cluster index
 scatter(iris.PetalLength, iris.PetalWidth, marker_z=result.assignments,
         color=:lightrainbow, legend=false)
 
-# TODO: Add scatter plot image
-
 ```
 
 ![Image description](iris_example.jpg)
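
Review note: the implementation notes in this commit recommend restarting K-Means several times because a single run can end in a local minimum. A minimal sketch of that idea follows. The helper `best_of_restarts` is hypothetical and not part of the package; it only assumes that each run returns an object with a `totalcost` field, as the results in this diff do, and that separate runs start from different random initializations.

```julia
# Hypothetical helper: call `run_once` `n_restarts` times and keep the
# result with the lowest `totalcost`.
function best_of_restarts(run_once, n_restarts::Integer)
    n_restarts >= 1 || throw(ArgumentError("n_restarts must be at least 1"))
    best = run_once()
    for _ in 2:n_restarts
        candidate = run_once()
        if candidate.totalcost < best.totalcost
            best = candidate
        end
    end
    return best
end

# Assumed usage with the call shown in the hunk above (requires the package
# to be loaded, so it is left as a comment):
# best = best_of_restarts(() -> kmeans(features, 3), 10)
# best.assignments   # labels from the lowest-cost run
```

Passing the run as a zero-argument function keeps the helper independent of any particular `kmeans` signature, so the same loop works for the `LightElkan()` variants as well.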
@@ -105,12 +113,12 @@ c = [ParallelKMeans.kmeans(X, i; tol=1e-6, max_iters=300, verbose=false).totalco
 
 # Single Thread Implementation plus a modified version of Elkan's triangle inequality
 # to boost speed
-e = [ParallelKMeans.kmeans(LightElkan(), X, i;
+d = [ParallelKMeans.kmeans(LightElkan(), X, i;
     n_threads=1, tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]
 
 # Multi Thread Implementation plus a modified version of Elkan's triangle inequality
 # to boost speed
-d = [ParallelKMeans.kmeans(LightElkan(), X, i;
+e = [ParallelKMeans.kmeans(LightElkan(), X, i;
     tol=1e-6, max_iters=300, verbose=false).totalcost for i = 2:10]
 ```
 