Overview


Introduction
RLSolver: GPU-based Massively Parallel Environments for Large-Scale Combinatorial Optimization (CO) Problems Using Reinforcement Learning
RLSolver is an open-source RL-based solver for combinatorial optimization (CO) problems. It uses reinforcement learning (RL) or machine learning (ML) to automate the search process for these problems. The policy network can be an auto-regressive neural network, an auto-regressive graph neural network (GNN), or a more powerful architecture such as a transformer. Running on GPUs with thousands of CUDA cores and tensor cores improves both sampling speed and solution quality.
Two major challenges motivate this design:
- Scalability: RL/ML algorithms need to scale to large problem instances.
- Sampling: GPU-based simulation is key to addressing the sampling bottleneck.
RLSolver consists of three key components:
- Environments: GPU-based massively parallel environments for CO problems.
- Agents: RL algorithms such as REINFORCE and DQN.
- Problems: Graph Maxcut, Ising model, and more.
We design two tasks to promote GPU-powered RL optimization:
- Graph Maxcut Using RL Agents
- Ising Model Ground-State Estimation via MCMC-based RL
We welcome researchers, students, and practitioners from optimization, operations research (OR), RL/ML, or GPU computing communities to participate!
Tasks
Each team can choose to participate in one or both tasks. Awards and recognitions will be given for each task.
Task I: Graph Maxcut
Develop reinforcement learning agents to solve Max-Cut problems on large graphs. Agents must be trained in a distribution-wise fashion across families of graphs, utilizing GPU-based environments for sampling.
Dataset
Synthetic graphs generated from the following distributions:
- BA (Barabási–Albert)
- ER (Erdős–Rényi)
- PL (Power-Law)
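As an illustration of one of the families listed above, the following is a minimal sketch of an Erdős–Rényi G(n, p) sampler. It is not the contest's generator: the function name `sample_er_graph`, the 0-based node labels, and the unit edge weights are all assumptions made for this example.

```python
import random


def sample_er_graph(n, p, seed=None):
    """Sample an Erdos-Renyi G(n, p) graph as a list of (u, v, w) edges.

    Assumptions for this sketch: nodes are labeled 0..n-1 and every
    edge gets weight 1. The official dataset may use other conventions.
    """
    rng = random.Random(seed)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Each unordered pair is included independently with probability p.
            if rng.random() < p:
                edges.append((u, v, 1))
    return edges


if __name__ == "__main__":
    edges = sample_er_graph(n=8, p=0.3, seed=0)
    print(len(edges), "edges sampled")
```

Training in a distribution-wise fashion then amounts to drawing many such graphs (with different seeds) from one family rather than fitting a single instance.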
Each graph file uses the following format:
n m # number of nodes and edges
u v w # edge from node u to v with weight w
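The format above can be read with a few lines of Python. The sketch below is illustrative only: `parse_graph` and `cut_value` are hypothetical helper names, and it assumes that any text after `#` is a comment and that the header is followed by exactly m edge lines. Node labels are kept as they appear in the file, so the code works whether nodes are numbered from 0 or from 1.

```python
def parse_graph(text):
    """Parse an 'n m' header followed by m lines of 'u v w'."""
    # Drop anything after '#' (assumed to be a comment) and skip blank lines.
    rows = [line.split("#")[0].split() for line in text.splitlines()]
    rows = [row for row in rows if row]
    n, m = int(rows[0][0]), int(rows[0][1])
    edges = [(int(u), int(v), float(w)) for u, v, w in rows[1:1 + m]]
    if len(edges) != m:
        raise ValueError(f"expected {m} edges, found {len(edges)}")
    return n, edges


def cut_value(edges, side_a):
    """Sum the weights of edges with exactly one endpoint in side_a."""
    return sum(w for u, v, w in edges if (u in side_a) != (v in side_a))


if __name__ == "__main__":
    example = """3 3 # number of nodes and edges
1 2 1
2 3 1
1 3 1
"""
    n, edges = parse_graph(example)
    # Putting node 1 alone on one side of a triangle cuts two unit edges.
    print(cut_value(edges, {1}))  # 2.0
```

`cut_value` is the objective that a Max-Cut agent tries to maximize: a solution is a split of the nodes into two sets, represented here by the set `side_a`.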
Goal
Maximize the cut value using RL agents with multiple training environments.
Task II: Ising Model
Train RL agents to find low-energy states of 2D Ising models using GPU-accelerated MCMC or spin-flip environments. The environment simulates spin lattices with interaction energy, and agents learn policies to flip spins efficiently.
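To make the interaction energy and the spin-flip action concrete, here is a minimal sketch in plain Python. It is not the provided PyTorch/CUDA simulator. It assumes a uniform nearest-neighbor coupling J (parameter `coupling`), no external field, and periodic boundaries, so that E = -J * sum of s_i * s_j over neighboring pairs; the contest environments may define the energy differently. The names `ising_energy` and `flip_delta` are hypothetical.

```python
def ising_energy(spins, coupling=1.0):
    """Energy of a 2D lattice of +1/-1 spins with periodic boundaries."""
    rows, cols = len(spins), len(spins[0])
    energy = 0.0
    for i in range(rows):
        for j in range(cols):
            # Count each bond once by looking only right and down.
            right = spins[i][(j + 1) % cols]
            down = spins[(i + 1) % rows][j]
            energy -= coupling * spins[i][j] * (right + down)
    return energy


def flip_delta(spins, i, j, coupling=1.0):
    """Energy change from flipping spin (i, j), computed from its 4 neighbors."""
    rows, cols = len(spins), len(spins[0])
    neighbors = (
        spins[(i - 1) % rows][j]
        + spins[(i + 1) % rows][j]
        + spins[i][(j - 1) % cols]
        + spins[i][(j + 1) % cols]
    )
    return 2.0 * coupling * spins[i][j] * neighbors


if __name__ == "__main__":
    grid = [[1] * 4 for _ in range(4)]  # all spins up: a ground state for J > 0
    print(ising_energy(grid))           # -32.0
    print(flip_delta(grid, 1, 2))       # 8.0
```

Because `flip_delta` touches only four neighbors, an environment can score a candidate flip in constant time instead of recomputing the full lattice energy, which is what makes large batches of parallel spin-flip steps cheap. The sketch assumes lattices of at least 3×3 so that the four neighbors are distinct sites.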
Dataset
Generated 2D Ising model grids of various sizes (e.g., 16×16, 32×32). Participants may use or extend the provided PyTorch/CUDA simulator for custom training.
Goal
Minimize the total system energy across random initial configurations.
Contact
Contact email: rlsolvercontest@outlook.com
Contestants can ask questions through any of the following channels:
- Discord.
- QQ Group: 922523057
- WeChat Group:
