In addition to all the materials available to students, several further items are available to instructors upon request.
Access to Instructor Materials
# Student Materials
Exercises.docx
Exercises.pdf
*.dat          # data files for Exercises
gap.pdf        # reference paper for Gap statistic question
data2d.m       # Octave script for visualizing input data
cluster2d.m    # Octave script for visualizing output cluster data
doc/           # Javadoc documentation for sample solutions
iris/          # folder containing Weka tutorial, data, and Exercises
ppt/           # Powerpoint and PDF presentations for k-means
java/          # Java version of programming exercises with starter code
# Instructor Materials
Exercises - Instructor's Guide.docx
Exercises - Instructor's Guide.pdf    # Sample answers and discussion
KMeans.java              # Sample Java solution for first exercise
KMeansIterated.java      # Sample Java solution for second exercise
KMeansIteratedGap.java   # Sample Java solution for third exercise
ex1.sh                   # script to execute first exercise sample solution code
ex2.sh                   # script to execute second exercise sample solution code
ex3.sh                   # script to execute third exercise sample solution code
ex1/                     # folder of output clustering data from ex1.sh + summary.txt
ex2/                     # folder of output clustering data from ex2.sh + summary.txt
ex3/                     # folder of output clustering data from ex3.sh + summary.txt
instructor-soln/         # text files with info on original data generation
Clustering Iris Data with Weka (Instructor Copy).docx
Clustering Iris Data with Weka (Instructor Copy).pdf    # solutions for Weka tutorial
ppt/                     # Powerpoint and PDF presentations for k-means
java/                    # sample solution and JUnit test code for Java assignment
The tutorial and a copy of the iris data are available in the student materials zip file in the
Example Clustering Data Sets
The provided presentation describing the clustering method (k-Means Clustering.pptx and K-Means Clustering.pdf) shows several example data sets that illustrate where k-means fails. These images come from two main sources:
- the textbook Introduction to Data Mining by Tan, Steinbach, and Kumar, and its related slides (available at http://www-users.cs.umn.edu/~kumar/dmbook/index.php#item5; the images come specifically from Chapter 8)
- clustering data sets available at http://cs.joensuu.fi/sipu/datasets/ and a MATLAB script, cluster_examples.m (the Shape sets were the main focus)
World Country Data
A subset of indicators was collected and merged for the countries of the world. The data set is available without preprocessing (ex-country-data.csv) and in standardized form (ex-country-data-preproc.csv). A fuller description of the data and its use is given in the following example using R: clustering-example.
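The exact standardization applied to the preprocessed file is not stated here. One common choice before k-means is a per-column z-score, so that indicators measured on very different scales contribute comparably to Euclidean distances. The following is a minimal Java sketch under that assumption; the class name ColumnStandardizer and the method zScore are hypothetical and not part of the provided materials, and reading the CSV file is left out.

```java
/** Hypothetical helper (not part of the provided materials). */
final class ColumnStandardizer {

    /**
     * Returns a new table in which each column is centered on its mean and
     * divided by its sample standard deviation (n - 1 denominator).
     * Assumes a non-empty rectangular table; a constant column (or a table
     * with a single row) is mapped to zeros to avoid dividing by zero.
     */
    static double[][] zScore(double[][] data) {
        int rows = data.length;
        int cols = data[0].length;
        double[][] out = new double[rows][cols];

        for (int j = 0; j < cols; j++) {
            // Column mean.
            double sum = 0.0;
            for (int i = 0; i < rows; i++) {
                sum += data[i][j];
            }
            double mean = sum / rows;

            // Sum of squared deviations from the mean.
            double squares = 0.0;
            for (int i = 0; i < rows; i++) {
                double d = data[i][j] - mean;
                squares += d * d;
            }
            double sd = Math.sqrt(squares / (rows - 1));

            // Center and scale; fall back to 0 when sd is 0 or undefined.
            for (int i = 0; i < rows; i++) {
                out[i][j] = sd > 0.0 ? (data[i][j] - mean) / sd : 0.0;
            }
        }
        return out;
    }
}
```

Calling ColumnStandardizer.zScore on a table with one row per country and one column per indicator returns a table of the same shape that can be passed to a k-means implementation. If the preprocessed file was produced with a different method (for example min-max scaling), the results will differ from this sketch.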