Implementing a Recommender System Using MapReduce

Summary	This assignment combines two prominent machine learning / big data technologies: MapReduce and recommender systems. Nearly 20 years ago, as Google scaled, it created several new technologies. Foundational among them are the resilient Google File System (GFS) and a computing paradigm known as MapReduce. Google published this work in a few highly influential papers. Researchers, inspired by these descriptions, created open-source versions, which eventually became the big data platform now known as Hadoop. Yelp created mrjob, a Python-based library for writing MapReduce programs. Recommender systems produce an ordered set of recommendations based on user preferences. In this series of exercises, students gain hands-on experience with how user-based, item-based, and content-based recommender systems work. Modeling the core computation in a spreadsheet helps convey the essence of these algorithms. With this background, students then express the recommender algorithm in the MapReduce paradigm using mrjob. Experiments are done using the MovieLens data set (https://grouplens.org/datasets/movielens/). The assignment also provides context for discussing the Netflix Prize.
Topics	MapReduce, Recommender Systems, Big Data machine learning
Audience	Advanced students of AI; could also be used in a CS2 class as an extended assignment
Difficulty	Medium
Strengths	Explores a prominent AI application (recommendation)
Weaknesses	Requires sufficient time in the course schedule to discuss the MapReduce paradigm
Dependencies
Variants	Recommendation can be done without using MapReduce
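
As a rough illustration of how an item-based recommender might be expressed in the MapReduce paradigm, the sketch below simulates two map/shuffle/reduce steps using only the Python standard library. It is not the assignment's reference solution and does not use mrjob itself. The toy ratings, the run_step helper, and all function names are assumptions made for illustration, and cosine similarity over co-rated items is just one possible similarity measure.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: (user, item, rating) triples.
RATINGS = [
    ("alice", "A", 5), ("alice", "B", 3),
    ("bob", "A", 4), ("bob", "B", 2), ("bob", "C", 5),
    ("carol", "B", 4), ("carol", "C", 1),
]

def run_step(records, mapper, reducer):
    """Simulate one MapReduce step: map, group values by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

# Step 1: collect each user's ratings, then emit every pair of items
# that the same user has rated.
def map_by_user(record):
    user, item, rating = record
    yield user, (item, rating)

def reduce_item_pairs(user, item_ratings):
    for (item_a, r_a), (item_b, r_b) in combinations(sorted(item_ratings), 2):
        yield (item_a, item_b), (r_a, r_b)

# Step 2: for each item pair, compute cosine similarity over co-ratings.
def map_identity(record):
    yield record

def reduce_cosine(pair, rating_pairs):
    dot = sum(a * b for a, b in rating_pairs)
    norm_a = sqrt(sum(a * a for a, _ in rating_pairs))
    norm_b = sqrt(sum(b * b for _, b in rating_pairs))
    if norm_a and norm_b:  # skip pairs with an all-zero rating vector
        yield pair, dot / (norm_a * norm_b)

pairs = run_step(RATINGS, map_by_user, reduce_item_pairs)
similarities = dict(run_step(pairs, map_identity, reduce_cosine))

for pair, score in sorted(similarities.items()):
    print(pair, round(score, 3))
```

In a framework such as mrjob, each mapper/reducer pair above would roughly correspond to one step of a job, with the grouping by key handled by the framework rather than by run_step.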