Resources
Readings
Notation
- Notation Guide (PDF)
Textbooks
- [AIMA] Russell and Norvig. Artificial Intelligence: A Modern Approach
- Textbook
- No k-means coverage, only EM
- [ESL] Hastie, Tibshirani, and Friedman. The Elements of Statistical Learning (PDF)
- Textbook, free online
- Section 14.3.6 covers k-means, with mathematical context for the problem
- [ISL] James, Witten, Hastie, Tibshirani. An Introduction to Statistical Learning with Applications in R (PDF)
- Textbook, free online
- Section 10.3.1 is more explanatory than [ESL] but lacks discussion of how to choose k
- Section 10.5.1 includes an R lab for experimenting with k-means
- [ItML] Alpaydin. Introduction to Machine Learning (Lecture Slides PDF and PPT)
- Textbook
- Brief coverage of k-means
General References
- [OMP] http://www.onmyphd.com/?p=k-means.clustering
- Website
- Interactive visualization of the algorithm; helps in understanding the strengths and weaknesses of the method
- [WKMC] Wikipedia - K-Means Clustering
- Website
- Basic definitions, list of online resources, implementations, etc.
- [WUL] Wikipedia - Unsupervised Learning
- Website
- Succinct definitions
- [WDTNOC] Wikipedia - Determining the number of clusters in a data set
- Website
- Less detailed than [ESL] Section 14.3.11
- [WJM] Wikipedia - K-Medoids
- Website
- Concise and clear, with a visual example where k-means fails and k-medoids succeeds
Online Resources
- [WKMC], Section 8: links to software (some open-source), visualizations, demos, etc.
- Video illustrating Voronoi partitioning with a uniform distribution (YouTube)
- Teaching video (YouTube)
MOOC Resources
Here is a sampling of MOOCs that cover clustering. Many other courses may also contain material on clustering and k-means.
- Coursera, Machine Learning, Stanford University
Andrew Ng
Free; rolling enrollment throughout the year
https://www.coursera.org/learn/machine-learning
The upcoming course covers unsupervised learning and dimensionality reduction in Week 8.
- Coursera, Cluster Analysis in Data Mining, University of Illinois at Urbana-Champaign
Jiawei Han
Part of the Data Mining Specialization
Archive of the course from April-June 2015
https://www.coursera.org/specialization/datamining/20?utm_medium=courseDescripTop
https://www.coursera.org/course/clusteranalysis
- Coursera, Clustering & Retrieval, University of Washington
Emily Fox, Carlos Guestrin
Part of the Machine Learning Specialization; starts at $79, enrollment open for Feb. 2016
https://www.coursera.org/specializations/machine-learning
https://www.coursera.org/learn/ml-clustering-and-retrieval
Projects on document retrieval and clustering documents by topic
- edX, MITx: 15.071x The Analytics Edge, Massachusetts Institute of Technology
Multiple instructors
Free, course archive: https://courses.edx.org/courses/course-v1:MITx+15.071x_2a+2T2015/info
Project on clustering documents in R
- Udacity, Machine Learning, Georgia Tech
Michael Littman, Charles Isbell
Free, course archive: https://www.udacity.com/course/machine-learning--ud262