Model AI Assignment AAAI — RL for Combinatorial Structures

Summary	The goal in this assignment is to learn how to use reinforcement learning to construct combinatorial structures satisfying certain constraints.
Topics	Reinforcement learning, Cross-entropy method, Combinatorial Optimization
Audience	Beginners in Reinforcement Learning (RL) and people who are particularly interested in learning how to use RL for combinatorics.
Difficulty	Conceptually challenging and implementation may take over 15 hours. This will vary depending on mathematical maturity and programming skills. The core idea of using a neural network to learn a bit vector that maps to an optimal permutation requires a deep understanding of the entire system.
Strengths	The entire assignment can be run in a Google Colab environment without significant resources. The assignment forces thinking about suitable encoding and scoring function that accurately reflects the particular combinatorial situation, rather than blindly using RL. Since the objectives in the tasks are intuitive and conceptually simple, the performance of the algorithm is easily visible and therefore the effects of different choices are very clear. The tasks include sufficient variety in the combinatorial optimization problems, each requiring a different encoding and scoring approach.
Weaknesses	We are not suggesting changing the reinforcement learning algorithm. While that is a possibility, empirically, the cross-entropy method seems to be the most suitable. The assignment is therefore only about using a specific RL technique to discover combinatorial structures.
Dependencies	Requires knowledge of Adam Wagner’s work ( paper, code). The code requires Keras + Tensorflow + Python.
Variants	We have given two examples from simple combinatorics and two from Geometry, but one can easily find similar examples from Graph theory such as the degree–diameter problem.