Reinforcement Learning for Combinatorial Optimization
1. By encoder architecture
Pointer Network
- Neural Combinatorial Optimization with Reinforcement Learning
Self-attention
- Learning Heuristics for the TSP by Policy Gradient
- Attention, Learn to Solve Routing Problems!
- Dynamic states during decoding
- Reinforcement Learning for Solving the Vehicle Routing Problem
Graph Neural Network
- Reinforcement Learning for Solving the Vehicle Routing Problem
- Off-policy
- Exploration
- MCTS