RL-CSL: A Combinatorial Optimization Method Using Reinforcement Learning and Contrastive Self-Supervised Learning


Reinforcement learning-based methods have shown great potential for solving combinatorial optimization problems. However, research in this area is not yet mature in terms of either models or training methods. This paper proposes a method based on reinforcement learning and contrastive self-supervised learning. Specifically, the proposed method uses an attention model to learn a policy for generating solutions and incorporates a contrastive self-supervised learning model to train the attention encoder node by node. Correspondingly, a two-phase learning method, comprising node-wise learning and solution-wise learning, is adopted to train the attention model and the contrastive self-supervised model jointly and collaboratively. The performance of the proposed method is verified by numerical experiments on various combinatorial optimization problems.
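
To make the node-wise contrastive phase more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the contrastive objective, so an InfoNCE-style loss is assumed here: two embeddings of the same node (for example, produced by the encoder under two different augmentations of an instance) are treated as a positive pair, and embeddings of all other nodes act as negatives. The function name `info_nce_loss`, the `temperature` value, and the two-view setup are hypothetical choices made for illustration.

```python
import numpy as np


def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE-style node-wise contrastive loss (assumed objective, for illustration).

    view_a, view_b: arrays of shape (n_nodes, dim) holding embeddings of the
    same nodes under two views. Row i of view_a and row i of view_b form the
    positive pair; every other row of view_b is a negative for node i.
    """
    # L2-normalise so that the dot product below is a cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarities between all nodes of the two views.
    logits = a @ b.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The diagonal holds the log-probability assigned to each positive pair.
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(20, 16))                        # stand-in encoder output
    noisy = nodes + 0.01 * rng.normal(size=nodes.shape)      # second view, same nodes
    print("matched views:   ", info_nce_loss(nodes, noisy))
    print("mismatched views:", info_nce_loss(nodes, np.roll(noisy, 1, axis=0)))
```

In a two-phase scheme of the kind the abstract describes, a loss like this would shape the encoder at the node level, while the solution-wise phase would update the policy with a reinforcement learning objective on complete solutions; how the two are scheduled and weighted is not stated in the abstract.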

IEEE Transactions on Emerging Topics in Computational Intelligence