RevMax: Revenue-Maximizing Recommendation System Competition

Fang Sun, Paul Zhang, Pranav Subbaraman and Yizhou Sun
UCLA Computer Science Dept.

Summary

RevMax is a comprehensive recommendation system assignment where students compete to design algorithms that maximize revenue through multi-iteration user-item interactions. Using the Sim4Rec simulation framework, students implement content-based, sequence-based, and graph-based recommenders in a realistic production environment where models continuously learn from user feedback.

Topics

Machine Learning, Recommender Systems, Content-Based Filtering, Collaborative Filtering, Sequential Pattern Mining, Graph Neural Networks, K-Nearest Neighbors, Logistic Regression, Decision Trees, Random Forest, Gradient Boosting, RNNs/LSTMs, Transformers, Graph Convolutional Networks, Link Prediction, Feature Engineering, Online Learning

Audience

Advanced undergraduate or graduate students in Data Mining, Machine Learning, or AI courses. Prerequisites include programming proficiency, basic machine learning concepts, and familiarity with Python/PySpark.

Difficulty

Medium to high difficulty. Students need 2-3 weeks for each of the three checkpoints (6-9 weeks in total for the full assignment). Each checkpoint requires implementing increasingly sophisticated algorithms, from basic content-based methods to advanced graph neural networks.

Strengths

• Real-world relevance: Models revenue optimization used in industry recommendation systems
• Comprehensive coverage: Integrates multiple data mining techniques in one cohesive project
• Interactive learning: Multi-iteration environment provides immediate feedback
• Competitive element: Leaderboard motivates students to improve algorithms
• Scalable difficulty: Three checkpoints allow progressive skill building
• Production ML exposure: Hands-on experience with train-test splits, online learning, and hyperparameter tuning

Weaknesses

• Computational requirements: Needs Java 17 and sufficient memory for Spark processing
• Setup complexity: Multiple dependencies and frameworks to install
• Time intensive: Full assignment requires significant time investment
• Limited to synthetic data: Real-world recommendation challenges may differ
• Narrow objective: Focusing on revenue alone may oversimplify the objectives of real recommendation systems

Dependencies

Prerequisites: Python programming, basic machine learning, linear algebra, probability/statistics
Software: Python 3.8+, Java 17 (OpenJDK), Apache Spark, uv package manager
Hardware: 8GB+ RAM recommended, multi-core processor for Spark
Libraries: PySpark, NumPy, Pandas, Scikit-learn, Matplotlib, Sim4Rec framework

Variants

Instructors can customize the assignment by: (1) adjusting the number of users/items and feature dimensions to control complexity, (2) modifying evaluation metrics to emphasize different objectives (e.g., diversity, fairness), (3) adding constraints like computational budgets or cold-start scenarios, (4) incorporating real datasets from MovieLens or Amazon reviews, (5) extending to multi-stakeholder scenarios with advertiser budgets, (6) adding explainability requirements for recommendations. The modular design allows focusing on specific techniques or expanding to semester-long projects.

Assignment Materials

📄 Assignment Overview and Instructions - Complete guide to the RevMax competition
📁 Checkpoint Directives - Detailed instructions for each of the three checkpoints
💻 Baseline Implementations - Reference implementations including Random, Popularity, and Content-Based recommenders
🔧 Simulation Framework - Core simulation engine for user-item interactions
📊 Evaluation Metrics - Implementation of revenue, precision, NDCG, and other metrics
🏆 Leaderboard System - Competition platform for tracking student submissions
📈 Analysis and Visualization Tools - Scripts for EDA and performance analysis
⚙️ Requirements File - All Python dependencies
🚀 Example Notebooks - Jupyter notebooks demonstrating data exploration and algorithm development
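
To make the metrics named under Evaluation Metrics concrete, here is a minimal sketch of revenue, precision@k, and NDCG@k for a single user's ranked list. It is not the course's implementation: the function names, the binary-relevance form of NDCG, and the definition of revenue as the summed price of purchased items in the top k are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], purchased: Set[str], k: int) -> float:
    """Fraction of the top-k recommended items that the user purchased."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in purchased)
    return hits / k


def ndcg_at_k(ranked: List[str], purchased: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so rank 1 gets log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in purchased
    )
    ideal_hits = min(len(purchased), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def revenue_at_k(
    ranked: List[str], purchased: Set[str], prices: Dict[str, float], k: int
) -> float:
    """Assumed revenue definition: total price of purchased top-k items."""
    return sum(prices[item] for item in ranked[:k] if item in purchased)


# Hypothetical toy data: four items, two of which the user bought.
ranked = ["a", "b", "c", "d"]
purchased = {"a", "c"}
prices = {"a": 10.0, "b": 5.0, "c": 2.5, "d": 8.0}

print(precision_at_k(ranked, purchased, 3))
print(ndcg_at_k(ranked, purchased, 3))
print(revenue_at_k(ranked, purchased, prices, 3))
```

Precision and NDCG reward finding items the user buys regardless of price, while the revenue figure weights each hit by its price, so the three can rank the same recommender differently.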

Getting Started

Students should begin by:

• Installing dependencies using the provided setup instructions
• Running the baseline analysis script to understand the data and metrics
• Implementing the MyRecommender class starting with Checkpoint 1
• Testing algorithms locally before submitting to the leaderboard
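
As a rough picture of what a first MyRecommender might look like, the sketch below ranks items by price times an estimated purchase rate and folds in new feedback after each iteration. Only the class name comes from the assignment; the `fit`/`predict` method names, the constructor arguments, and the feedback format are hypothetical, and the real interface is the one defined in the assignment materials and the Sim4Rec framework. The model is deliberately non-personalized and uses plain Python rather than PySpark.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed feedback format: (item_id, purchased) for each item that was shown.
Feedback = Tuple[str, bool]


class MyRecommender:
    """Hypothetical sketch: score = price * smoothed purchase rate."""

    def __init__(self, prices: Dict[str, float], prior_rate: float = 0.1,
                 prior_weight: float = 1.0) -> None:
        self.prices = prices
        self.prior_rate = prior_rate      # assumed rate for unseen items
        self.prior_weight = prior_weight  # pseudo-impressions behind the prior
        self.shown: Dict[str, int] = defaultdict(int)
        self.bought: Dict[str, int] = defaultdict(int)

    def fit(self, feedback: Iterable[Feedback]) -> None:
        """Fold one iteration's feedback into the running counts."""
        for item, purchased in feedback:
            self.shown[item] += 1
            self.bought[item] += int(purchased)

    def purchase_rate(self, item: str) -> float:
        """Purchase rate shrunk toward the prior when impressions are few."""
        num = self.bought[item] + self.prior_rate * self.prior_weight
        den = self.shown[item] + self.prior_weight
        return num / den

    def predict(self, k: int) -> List[str]:
        """Return the k items with the highest expected revenue."""
        def score(item: str) -> float:
            return self.prices[item] * self.purchase_rate(item)
        # Ties are broken by item id so the ranking is deterministic.
        return sorted(self.prices, key=lambda i: (-score(i), i))[:k]


# Hypothetical catalogue and one round of feedback.
rec = MyRecommender({"cheap": 1.0, "mid": 5.0, "pricey": 20.0})
before = rec.predict(2)  # no data yet, so price alone decides the order
rec.fit([("pricey", False)] * 9 + [("mid", True)] * 2 + [("mid", False)] * 2)
after = rec.predict(2)   # "mid" overtakes "pricey" once feedback arrives
print(before, after)
```

Calling `fit` once per iteration mirrors the multi-iteration setting, where the model keeps learning from user feedback; later checkpoints would replace the per-item rate with content-based, sequence-based, or graph-based predictions.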

Support Resources

📚 Recommended readings on recommendation systems and online learning
💬 Discussion forum for student questions and clarifications
🎥 Video tutorials on PySpark and the Sim4Rec framework
📝 Sample solution walkthrough (released after assignment completion)