# K-Means Clustering

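K-means clustering is a simple unsupervised learning algorithm for partitioning data into clusters. Given a fixed number of clusters $$k$$, it iteratively assigns each point to its nearest cluster centroid and then recomputes the centroids, reducing the sum of squared distances between the points and their assigned centroids until the assignments stop changing.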
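Stated as a formula, and assuming the standard squared-distance formulation with scalar data points $$y_1, \dots, y_N$$ (the entries of $$Y$$) and centroids $$\hat{y}_1, \dots, \hat{y}_k$$ (the entries of $$\hat{Y}$$), the quantity being minimized can be written as shown here. The symbol $$J$$ and the index names $$i$$ and $$j$$ are notation introduced for illustration only.

$$
J(\hat{Y}) = \sum_{i=1}^{N} \min_{1 \le j \le k} \left( y_i - \hat{y}_j \right)^2
$$

No iteration increases $$J$$, so the procedure converges, though in general only to a local minimum that depends on the initial centroids.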

For details, refer to the online tutorial at http://www-2.cs.cmu.edu/~awm/tutorials/kmeans.html.

## Input Parameters

| Parameter | Type | Constraint | Description | Remarks |
|---|---|---|---|---|
| $$Y$$ | $$Y \in \mathbb R^{N}$$ | $$N \in \mathbb{N}$$ | Input data of size $$N$$ | |
| $$k$$ | $$k \in \mathbb{N}$$ | $$k \lt N$$ | Specified number of clusters | |

## Output Parameters

| Parameter | Type | Constraint | Description | Remarks |
|---|---|---|---|---|
| $$\hat{Y}$$ | $$\hat{Y} \in \mathbb R^{k}$$ | | A vector of $$k$$ cluster centroid locations | |
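The input and output parameters above can be illustrated with a minimal sketch in plain Python. It assumes scalar data (matching $$Y \in \mathbb R^{N}$$), squared distance as the dissimilarity measure, and random selection of $$k$$ input values as starting centroids. The function name `kmeans_1d` and the parameters `max_iter` and `seed` are hypothetical and not part of any particular tool.

```python
import random


def kmeans_1d(Y, k, max_iter=100, seed=0):
    """Illustrative Lloyd-style k-means on scalar data.

    Y is a sequence of N numbers and k the number of clusters (k < N).
    Returns the k centroid locations, sorted in ascending order.
    """
    if not (1 <= k < len(Y)):
        raise ValueError("k must satisfy 1 <= k < N")

    # Assumption: initialise the centroids with k input values chosen at random.
    rng = random.Random(seed)
    centroids = rng.sample(list(Y), k)

    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for y in Y:
            nearest = min(range(k), key=lambda j: (y - centroids[j]) ** 2)
            clusters[nearest].append(y)

        # Update step: move each centroid to the mean of its points.
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(points) / len(points) if points else centroids[j]
            for j, points in enumerate(clusters)
        ]

        # Stop once the centroids no longer move.
        if updated == centroids:
            break
        centroids = updated

    return sorted(centroids)


if __name__ == "__main__":
    # Two well-separated groups; the centroids end up close to 1.0 and 5.0.
    print(kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8], k=2))
```

Because the result depends on the starting centroids, a common practice is to run such a routine with several seeds and keep the solution with the smallest sum of squared distances.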

## Tool Support

## Single Steps using the Algorithm

## References

• J.B. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, vol. 1, pp. 281-297, 1967.