Overview

Introduction

RLSolver: GPU-based Massively Parallel Environments for Large-Scale Combinatorial Optimization (CO) Problems Using Reinforcement Learning

RLSolver is an open-source RL-based solver for combinatorial optimization (CO) problems. It uses reinforcement learning (RL) or machine learning (ML) to automate the search for high-quality solutions. The policy network can be an auto-regressive neural network, an auto-regressive graph neural network (GNN), or a more powerful architecture such as a transformer. Running on GPUs with thousands of CUDA cores and tensor cores improves both sampling speed and solution quality.

RLSolver targets two major challenges:

Scalability: RL/ML algorithms need to scale to large problem instances.
Sampling speed: GPU-based simulation is the key to addressing the sampling bottleneck.

RLSolver consists of three key components:

Environments: GPU-based massively parallel environments for CO problems.
Agents: RL algorithms such as REINFORCE and DQN.
Problems: Graph maxcut, Ising Model, and more.

We design two tasks to promote GPU-powered RL optimization:

Graph Maxcut Using RL Agents
Ising Model Ground-State Estimation via MCMC-based RL

We welcome researchers, students, and practitioners from optimization, operations research (OR), RL/ML, or GPU computing communities to participate!

Tasks

Each team can participate in one or both tasks. Awards and recognition will be given for each task.

Task I: Graph Maxcut

Develop reinforcement learning agents to solve Max-Cut problems on large graphs. Agents must be trained distribution-wise, that is, on families of graphs drawn from a distribution rather than on a single instance, and must use GPU-based environments for sampling.

Dataset

Synthetic graphs generated from the following distributions:

BA (Barabási–Albert)
ER (Erdős–Rényi)
PL (Power-Law)

Each graph file has the following format: a header line, then one line per edge.

n m           # number of nodes and edges
u v w         # edge from node u to v with weight w
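
The format above can be read with a few lines of Python. This is a minimal sketch, not the loader shipped with RLSolver: the function name `parse_graph` is hypothetical, weights are parsed as floats (an assumption, since the format does not state the weight type), and any text after `#` is treated as an annotation and ignored. Node ids are kept exactly as written in the file, so no assumption is made about 0-based or 1-based indexing.

```python
def parse_graph(text):
    """Parse an 'n m' header followed by m lines of 'u v w'.

    Returns (n, edges) where edges is a list of (u, v, w) tuples.
    """
    rows = []
    for raw in text.splitlines():
        # Assumption: anything after '#' is an explanatory annotation.
        line = raw.split("#", 1)[0].strip()
        if line:
            rows.append(line.split())

    n, m = int(rows[0][0]), int(rows[0][1])
    edges = [(int(u), int(v), float(w)) for u, v, w in rows[1:]]

    # Sanity check: the header must agree with the number of edge lines.
    if len(edges) != m:
        raise ValueError(f"header declares {m} edges, found {len(edges)}")
    return n, edges
```

For example, `parse_graph("3 2\n1 2 1\n2 3 1\n")` returns the node count 3 and two weighted edges. Reading from disk is then `parse_graph(open(path).read())`, with `path` pointing at a dataset file.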

Goal

Maximize the cut value using RL agents with multiple training environments.
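
To make the objective concrete, here is a small sketch of the cut value and a REINFORCE loop that tries to maximize it. It rests on several assumptions that are not taken from the RLSolver code: the names `cut_value` and `reinforce_maxcut` and all hyperparameters are hypothetical, nodes are indexed from 0, the policy is an independent Bernoulli distribution with one logit per node (a deliberate simplification of the auto-regressive networks described earlier), and a batch of samples on the CPU stands in for the GPU-based parallel environments.

```python
import math
import random


def cut_value(edges, x):
    """Total weight of edges whose endpoints lie on different sides of the cut."""
    return sum(w for u, v, w in edges if x[u] != x[v])


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_maxcut(n, edges, iters=100, batch=16, lr=0.1, seed=0):
    """REINFORCE with an independent Bernoulli policy (one logit per node)."""
    rng = random.Random(seed)
    theta = [0.0] * n  # policy logits; sigmoid(theta[i]) = P(node i on side 1)
    best_x, best_val = None, float("-inf")

    for _ in range(iters):
        probs = [sigmoid(t) for t in theta]
        # Each sample plays the role of one parallel environment.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(batch)]
        rewards = [cut_value(edges, x) for x in samples]
        baseline = sum(rewards) / batch  # mean reward reduces gradient variance

        for x, r in zip(samples, rewards):
            if r > best_val:
                best_x, best_val = x, r
            # For a Bernoulli(sigmoid(theta_i)) policy,
            # d log pi(x) / d theta_i = x_i - p_i.
            for i in range(n):
                theta[i] += lr * (r - baseline) * (x[i] - probs[i]) / batch

    return best_x, best_val
```

On a 4-node cycle with unit weights, `cut_value(edges, [0, 1, 0, 1])` is 4.0, which is the maximum cut for that graph. A competitive agent would replace the per-node logits with a learned network shared across graphs and evaluate the batch on the GPU.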

Task II: Ising Model

Train RL agents to find low-energy states of 2D Ising models using GPU-accelerated MCMC or spin-flip environments. The environment simulates spin lattices with interaction energy, and agents learn policies to flip spins efficiently.
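
As a reference point for what a spin-flip environment does, the sketch below runs one Metropolis sweep on a square lattice. It is an illustration under stated assumptions, not the provided simulator: the names `delta_energy` and `metropolis_sweep` are hypothetical, the energy is taken to be E = -J * (sum of s_i * s_j over nearest-neighbor pairs) with a uniform coupling J and no external field, boundaries are periodic, and the lattice side is at least 3 so that the four neighbors of a site are distinct.

```python
import math
import random


def delta_energy(spins, i, j, J=1.0):
    """Energy change from flipping spin (i, j) on a periodic L x L lattice."""
    L = len(spins)
    neighbors = (spins[(i - 1) % L][j] + spins[(i + 1) % L][j]
                 + spins[i][(j - 1) % L] + spins[i][(j + 1) % L])
    # Flipping s changes each bond term -J*s*s_nb into +J*s*s_nb.
    return 2.0 * J * spins[i][j] * neighbors


def metropolis_sweep(spins, temperature, rng, J=1.0):
    """Propose L*L single-spin flips, accepting each by the Metropolis rule."""
    L = len(spins)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        dE = delta_energy(spins, i, j, J)
        # Always accept moves that do not raise the energy; otherwise
        # accept with probability exp(-dE / T).
        if dE <= 0 or rng.random() < math.exp(-dE / temperature):
            spins[i][j] = -spins[i][j]
    return spins
```

A learned policy would replace the uniform choice of `(i, j)` with a proposal that favors flips likely to lower the energy, and a GPU version would update many lattices in parallel rather than looping over sites in Python.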

Dataset

Generated 2D Ising model grids of various sizes (e.g., 16×16, 32×32). Participants may use or extend the provided PyTorch/CUDA simulator for custom training.

Goal

Minimize the total system energy across random initial configurations.
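
The objective can be written down as a short evaluation routine. The following is a sketch under assumptions that the task description does not fix: the names `ising_energy`, `random_configs`, and `mean_energy` are hypothetical, the energy is the nearest-neighbor form E = -J * (sum of s_i * s_j over neighboring pairs) with uniform coupling J and no external field, and boundaries are periodic with a lattice side of at least 3.

```python
import random


def ising_energy(spins, J=1.0):
    """Total energy of an L x L lattice of +1/-1 spins with periodic boundaries."""
    L = len(spins)
    total = 0.0
    for i in range(L):
        for j in range(L):
            # Count each bond once: the right and the down neighbor.
            right = spins[i][(j + 1) % L]
            down = spins[(i + 1) % L][j]
            total -= J * spins[i][j] * (right + down)
    return total


def random_configs(L, count, seed=0):
    """Draw `count` random initial configurations of size L x L."""
    rng = random.Random(seed)
    return [[[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
            for _ in range(count)]


def mean_energy(configs, J=1.0):
    """Average energy over a set of configurations."""
    return sum(ising_energy(c, J) for c in configs) / len(configs)
```

With J = 1, a 4×4 lattice of all-up spins has energy -32, the minimum for that size under these assumptions, since each of the 32 bonds contributes -1. An agent can be scored by applying its policy to each configuration from `random_configs` and reporting `mean_energy` of the final states.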

Contact

Contact email: rlsolvercontest@outlook.com

Contestants can ask questions through the following channels:

Discord
QQ Group: 922523057
WeChat Group: