K-Means Clustering

The K-Means clustering algorithm is a simple unsupervised learning algorithm for solving clustering problems. Given a fixed number of clusters \(k\), it iteratively assigns each point to its nearest cluster centroid and recomputes the centroids, minimizing the sum of distances between the points and their assigned centroids.
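As a sketch of the quantity being minimized, assuming the standard formulation with squared Euclidean distances (this page does not state the exact distance measure), the objective for one-dimensional data can be written as follows. The symbols \(y_i\) (elements of \(Y\)), \(\hat{y}_j\) (elements of \(\hat{Y}\)) and \(C_j\) (the set of points assigned to centroid \(j\)) are notation introduced here for illustration only.

\[
J = \sum_{j=1}^{k} \sum_{y_i \in C_j} \left( y_i - \hat{y}_j \right)^2
\]

Each iteration alternates between updating the sets \(C_j\) for fixed centroids and updating each \(\hat{y}_j\) to the mean of \(C_j\); neither step can increase \(J\).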

For details, refer to the online tutorial at http://www-2.cs.cmu.edu/~awm/tutorials/kmeans.html.

Input Parameters

| Parameter | Type | Constraint | Description | Remarks |
|---|---|---|---|---|
| \(Y\) | \(Y \in \mathbb R^{N}\) | \(N \in \mathbb{N}\) | Input data of size \(N\) | None |
| \(k\) | \(k \in \mathbb{N}\) | \(k \lt N\) | Specified number of clusters | None |

Output Parameters

| Parameter | Type | Constraint | Description | Remarks |
|---|---|---|---|---|
| \(\hat{Y}\) | \(\hat{Y} \in \mathbb R^{k}\) | None | A vector of \(k\) cluster centroid locations | None |
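To make the input and output parameters concrete, the following is a minimal Python sketch of a Lloyd-style K-Means iteration on one-dimensional data. It is not the implementation behind this page: the function name `kmeans_1d`, the parameters `max_iter` and `seed`, the random initialization from input points, and the handling of empty clusters are all assumptions made for illustration.

```python
import random


def kmeans_1d(y, k, max_iter=100, seed=0):
    """Illustrative Lloyd-style k-means on a 1-D sequence of floats.

    Returns a sorted list of k centroid locations (hypothetical helper).
    """
    if not 0 < k < len(y):
        raise ValueError("k must satisfy 0 < k < N")
    rng = random.Random(seed)
    # Assumption: initialise the centroids with k randomly chosen input points.
    centroids = rng.sample(list(y), k)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in y:
            nearest = min(range(k), key=lambda j: abs(value - centroids[j]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        # Assumption: an empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so stop iterating
            break
        centroids = updated
    return sorted(centroids)


data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]   # Y with N = 6
centers = kmeans_1d(data, k=2)          # close to [1.0, 5.0]
```

Here `data` plays the role of \(Y\) and `centers` the role of \(\hat{Y}\). In general the result of K-Means depends on the initialization, so a different `seed` may lead to different centroids on less clearly separated data.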

Single Steps using the Algorithm

References

  • J. B. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, vol. 1, pp. 281-297, 1967.