RevMax: Revenue-Maximizing Recommendation System Competition

Fang Sun, Paul Zhang, Pranav Subbaraman and Yizhou Sun
UCLA Computer Science Dept.

Summary: RevMax is a comprehensive recommendation system assignment in which students compete to design algorithms that maximize revenue over multiple iterations of user-item interaction. Using the Sim4Rec simulation framework, students implement content-based, sequence-based, and graph-based recommenders in a simulated production setting where models continue to learn from user feedback.
Topics: Machine Learning, Recommender Systems, Content-Based Filtering, Collaborative Filtering, Sequential Pattern Mining, Graph Neural Networks, K-Nearest Neighbors, Logistic Regression, Decision Trees, Random Forest, Gradient Boosting, RNNs/LSTMs, Transformers, Graph Convolutional Networks, Link Prediction, Feature Engineering, Online Learning
Audience: Advanced undergraduate or graduate students in Data Mining, Machine Learning, or AI courses. Prerequisites include programming proficiency, basic machine learning concepts, and familiarity with Python/PySpark.
Difficulty: Medium to high. Students need 2-3 weeks per checkpoint (6-9 weeks in total for the three checkpoints). Each checkpoint requires increasingly sophisticated algorithms, from basic content-based methods to advanced graph neural networks.
Strengths:
• Real-world relevance: Models the revenue optimization used in industry recommendation systems
• Comprehensive coverage: Integrates multiple data mining techniques in one cohesive project
• Interactive learning: Multi-iteration environment provides immediate feedback
• Competitive element: Leaderboard motivates students to improve their algorithms
• Scalable difficulty: Three checkpoints allow progressive skill building
• Hands-on experience: Practice with production ML concepts (train-test splits, online learning, hyperparameter tuning)
Weaknesses:
• Computational requirements: Needs Java 17 and sufficient memory for Spark processing
• Setup complexity: Multiple dependencies and frameworks to install
• Time intensive: Full assignment requires a significant time investment
• Synthetic data only: Real-world recommendation challenges may differ from those in the simulation
• Narrow objective: Focusing on revenue may oversimplify the objectives of real recommendation systems
Dependencies:
Prerequisites: Python programming, basic machine learning, linear algebra, probability/statistics
Software: Python 3.8+, Java 17 (OpenJDK), Apache Spark, uv package manager
Hardware: 8 GB+ RAM recommended, multi-core processor for Spark
Libraries: PySpark, NumPy, Pandas, Scikit-learn, Matplotlib, Sim4Rec framework
Variants: Instructors can customize the assignment by: (1) adjusting the number of users/items and the feature dimensions to control complexity, (2) modifying evaluation metrics to emphasize other objectives (e.g., diversity, fairness), (3) adding constraints such as computational budgets or cold-start scenarios, (4) incorporating real datasets such as MovieLens or Amazon reviews, (5) extending to multi-stakeholder scenarios with advertiser budgets, (6) adding explainability requirements for recommendations. The modular design allows instructors to focus on specific techniques or to expand the assignment into a semester-long project.
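
The multi-iteration setting described in the Summary can be pictured as a recommend, observe, retrain loop. The following is a minimal NumPy sketch of such a loop, not the Sim4Rec pipeline itself: the function name `run_simulation`, its parameters, and the toy response model (independent purchases drawn from hidden per-item probabilities) are all assumptions made for illustration.

```python
import numpy as np


def run_simulation(true_prob, prices, n_users=200, k=2, n_iterations=5, seed=0):
    """Toy recommend/observe/retrain loop (hypothetical, not the Sim4Rec API).

    true_prob: hidden per-item purchase probabilities (assumed response model).
    prices:    per-item prices; revenue is the sum of prices of purchased items.
    Returns the revenue collected in each iteration.
    """
    rng = np.random.default_rng(seed)
    true_prob = np.asarray(true_prob, dtype=float)
    prices = np.asarray(prices, dtype=float)
    n_items = len(prices)
    shown = np.zeros(n_items)    # how often each item has been recommended
    bought = np.zeros(n_items)   # how often each item has been purchased
    revenue_per_iteration = []
    for _ in range(n_iterations):
        # "Retrain": smoothed purchase-rate estimate from all feedback so far.
        rate = (bought + 1.0) / (shown + 2.0)
        # "Recommend": top-k items by estimated expected revenue.
        top_k = np.argsort(-(rate * prices), kind="stable")[:k]
        # "Observe": every simulated user sees the same list in this toy setup.
        purchases = rng.random((n_users, k)) < true_prob[top_k]
        revenue_per_iteration.append(float((purchases * prices[top_k]).sum()))
        shown[top_k] += n_users
        bought[top_k] += purchases.sum(axis=0)
    return revenue_per_iteration


revenues = run_simulation(true_prob=[0.5, 0.1, 0.3], prices=[10.0, 50.0, 5.0])
```

This sketch is purely greedy: it never explores items it currently ranks low, which is one of the trade-offs students may want to examine in their own recommenders.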

Assignment Materials

Getting Started

Students should begin by:

  1. Installing dependencies using the provided setup instructions
  2. Running the baseline analysis script to understand the data and metrics
  3. Implementing the MyRecommender class starting with Checkpoint 1
  4. Testing algorithms locally before submitting to the leaderboard
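
As a starting point for step 3, a first recommender might rank items by estimated expected revenue (purchase rate times price). The sketch below is a simplified illustration in plain NumPy: the `fit` and `predict` signatures, the `smoothing` parameter, and the array-based inputs are assumptions, and the interface actually required by the assignment's starter code may differ.

```python
import numpy as np


class MyRecommender:
    """Hypothetical baseline: rank items by estimated expected revenue."""

    def __init__(self, n_items, smoothing=1.0):
        self.n_items = n_items
        self.smoothing = smoothing        # pseudo-counts for rarely shown items
        self.shown = np.zeros(n_items)    # times each item was recommended
        self.bought = np.zeros(n_items)   # times each item was purchased

    def fit(self, item_ids, responses):
        """Accumulate feedback; responses[i] is 1 if item_ids[i] was bought."""
        item_ids = np.asarray(item_ids)
        responses = np.asarray(responses, dtype=float)
        # np.add.at handles repeated item ids correctly.
        np.add.at(self.shown, item_ids, 1.0)
        np.add.at(self.bought, item_ids, responses)

    def predict(self, prices, k):
        """Return the indices of the k items with the highest expected revenue."""
        rate = (self.bought + self.smoothing) / (self.shown + 2.0 * self.smoothing)
        expected_revenue = rate * np.asarray(prices, dtype=float)
        return np.argsort(-expected_revenue, kind="stable")[:k]


# Usage: item 0 was bought twice; items 1 and 2 were shown once, never bought.
rec = MyRecommender(n_items=3)
rec.fit(item_ids=[0, 0, 1, 2], responses=[1, 1, 0, 0])
top = rec.predict(prices=[10.0, 50.0, 5.0], k=2)
```

In this example the expensive item 1 ranks ahead of the frequently purchased item 0, because a smoothed purchase rate of 1/3 at a price of 50 outweighs a rate of 3/4 at a price of 10. This baseline ignores user features entirely; the content-based, sequence-based, and graph-based methods of the three checkpoints would replace the per-item rate with a personalized purchase probability.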

Support Resources