==================
K-Means Clustering
==================

:doc:`/WorkProcessClassifiers/GlobalAlgorithm/index` - :doc:`/WorkProcessClassifiers/OneDimensionalAlgorithm/index`

The *K-Means Clustering* algorithm is a simple unsupervised learning algorithm for clustering problems. Given a specified number of clusters :math:`k`, it iteratively minimizes the sum of squared distances between each point and the centroid of its assigned cluster. For details refer to the online tutorial `http://www-2.cs.cmu.edu/~awm/tutorials/kmeans.html <http://www-2.cs.cmu.edu/~awm/tutorials/kmeans.html>`__.

.. rubric:: Input Parameters

+-----------+------------------------------+--------------------------+------------------------------+---------+
| Parameter | Type                         | Constraint               | Description                  | Remarks |
+===========+==============================+==========================+==============================+=========+
| :math:`Y` | :math:`Y \in \mathbb{R}^{N}` | :math:`N \in \mathbb{N}` | Input data of size :math:`N` |         |
+-----------+------------------------------+--------------------------+------------------------------+---------+
| :math:`k` | :math:`k \in \mathbb{N}`     | :math:`k \lt N`          | Specified number of clusters |         |
+-----------+------------------------------+--------------------------+------------------------------+---------+

.. rubric:: Output Parameters

+-----------------+------------------------------------+------------+--------------------------------------------------+---------+
| Parameter       | Type                               | Constraint | Description                                      | Remarks |
+=================+====================================+============+==================================================+=========+
| :math:`\hat{Y}` | :math:`\hat{Y} \in \mathbb{R}^{k}` |            | A vector of :math:`k` cluster centroid locations |         |
+-----------------+------------------------------------+------------+--------------------------------------------------+---------+

.. rubric:: Tool Support

* :doc:`/Tools/MatlabTool/index`

  For details refer to the online documentation of the function ``kmeans``.

.. rubric:: Single Steps using the Algorithm

* :doc:`/DataPreprocessing/DataDiscretization/DataDiscretizationWithKMeansClustering/index`
* :doc:`/DataPreprocessing/DataReduction/DimensionalityReduction/DataReductionWithKMeansClustering/index`
* :doc:`/DataPreprocessing/DataCleaning/OutlierDetection/OutlierDetectionWithKMeansClustering/index`

.. rubric:: References

- J. B. MacQueen, Some Methods for Classification and Analysis of Multivariate Observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, vol. 1, pp. 281-297, 1967.
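
.. rubric:: Illustrative Sketch

The iteration described on this page can be sketched for one-dimensional input as follows. This is a minimal illustration, not the implementation behind the MATLAB ``kmeans`` function. The names ``kmeans_1d``, ``max_iter`` and ``seed`` are hypothetical, and drawing the initial centroids at random from the distinct input values is an assumption made for the example. The sketch alternates an assignment step with a centroid update step until the centroids stop changing.

.. code-block:: python

   import random


   def kmeans_1d(y, k, max_iter=100, seed=0):
       """Cluster scalar data y into k groups; return the k centroids, sorted."""
       distinct = sorted(set(y))
       # Constraint from the input table: k < N. The sketch also needs at
       # least k distinct values to pick k different starting centroids.
       if not 0 < k < len(y) or k > len(distinct):
           raise ValueError("need 0 < k < N and at least k distinct values")

       # Assumption: initial centroids are k distinct input values chosen at random.
       rng = random.Random(seed)
       centroids = rng.sample(distinct, k)

       for _ in range(max_iter):
           # Assignment step: each point joins the cluster of its nearest centroid.
           clusters = [[] for _ in range(k)]
           for value in y:
               nearest = min(range(k), key=lambda j: (value - centroids[j]) ** 2)
               clusters[nearest].append(value)

           # Update step: each centroid moves to the mean of its members.
           # An empty cluster keeps its previous centroid.
           updated = [
               sum(members) / len(members) if members else centroids[j]
               for j, members in enumerate(clusters)
           ]
           if updated == centroids:
               break  # no centroid moved, so the assignments are stable
           centroids = updated

       return sorted(centroids)


   # Two well-separated groups of three points each.
   print(kmeans_1d([1.0, 2.0, 3.0, 11.0, 12.0, 13.0], k=2))  # [2.0, 12.0]

The result generally depends on the initial centroids, so practical tools often repeat the procedure from several starting points and keep the best run.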