| Summary |
The goal in this assignment is to learn how to use reinforcement learning to construct combinatorial structures satisfying certain constraints.
|
| Topics |
Reinforcement learning, Cross-entropy method, Combinatorial Optimization |
| Audience |
Beginners in Reinforcement Learning (RL) and people who are particularly interested in learning how to use RL for combinatorics. |
| Difficulty |
Conceptually challenging and implementation may take over 15 hours. This will vary depending on mathematical maturity and programming skills.
The core idea of using a neural network to learn a bit vector that maps to an optimal permutation requires a deep understanding of the entire system.
|
| Strengths |
- The entire assignment can be run in a Google Colab environment without significant resources.
- The assignment forces thinking about suitable encoding and scoring function that accurately reflects the particular combinatorial situation, rather than blindly using RL.
- Since the objectives in the tasks are intuitive and conceptually simple, the performance of the algorithm is easily visible and therefore the effects of different choices are very clear.
- The tasks include sufficient variety in the combinatorial optimization problems, each requiring a different encoding and scoring approach.
|
| Weaknesses |
We are not suggesting changing the reinforcement learning algorithm. While that is a possibility, empirically, the cross-entropy method seems to be the most suitable. The assignment is therefore only about using a specific RL technique to discover combinatorial structures.
|
| Dependencies |
Requires knowledge of Adam Wagner’s work (
paper,
code).
The code requires Keras + Tensorflow + Python.
|
| Variants |
We have given two examples from simple combinatorics and two from Geometry, but one can easily find similar examples from Graph theory such as the
degree–diameter problem.
|