## K-Means Clustering

Finding an optimal *k*-means clustering is computationally difficult (NP-hard); however, efficient heuristic algorithms converge quickly to a local optimum. These heuristics are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions: both *k-means* and *Gaussian mixture modeling* rely on an iterative refinement approach. Both also use cluster centers to model the data; however, *k*-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
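To make the iterative refinement idea concrete, here is a minimal sketch of the standard heuristic (often called Lloyd's algorithm), written in plain Python with no external libraries. The function names (`kmeans`, `squared_distance`, `mean_point`), the random initialization, and the toy points are illustrative assumptions, not part of the tutorial's dataset or any library's API.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative k-means: alternate assignment and update steps.

    Returns (centers, labels). The result is a local optimum that
    depends on the random initialization.
    """
    rng = random.Random(seed)
    # Assumption: initialize centers by sampling k distinct data points.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_point(members)
    return centers, labels


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = kmeans(points, k=2)
print(centers)
print(labels)
```

Because the starting centers are chosen at random, a different `seed` can lead to a different local optimum on harder data; a common remedy is to run the procedure several times and keep the result with the lowest total squared distance.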

Knowledge of machine learning is not required, but the reader should be familiar with basic data analysis (e.g., descriptive analysis) and the programming language Python. To follow along, download the sample dataset here.