class: center, middle, inverse, title-slide

# K-means

---

# Why would we cluster data?

We think there is some unknown grouping driving the variation in the data.

We want to identify how many subgroups are in the data.

We want to then investigate the properties of these subgroups and infer their biological meaning.

---

# Clustering can be deceptive

<img src="data:image/png;base64,#/home/alan/Documents/github/carpentries/high-dimensional-stats-r/fig/rmd-08-fake-cluster-1.png" width="500px" />

---

# K-means in action

<img src="data:image/png;base64,#/home/alan/Documents/github/carpentries/high-dimensional-stats-r/fig/kmeans.gif" width="500px" />
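
---

# K-means: a minimal sketch

The two alternating steps of k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in a few lines. This is an illustrative sketch, not the implementation used in the lesson: it assumes squared Euclidean distance and initial centroids drawn at random from the data, and the names `kmeans`, `squared_distance`, `n_iter` and `seed` are hypothetical. The example data are simulated.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Sketch of k-means (Lloyd's iteration) on a list of tuples.

    Assumption: initial centroids are k distinct data points chosen at random.
    Returns (centroids, assignments).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Simulated data: two groups of 50 points around (0, 0) and (5, 5).
rng = random.Random(42)
group_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(50)]
group_b = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(50)]
points = group_a + group_b

centroids, labels = kmeans(points, k=2)
print(centroids)
```

Note that the function returns `k` clusters whatever the data look like, which is one reason clustering results can be deceptive: the number of clusters is an input, not something the algorithm discovers.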