Overview



Media Partners
Thanks to the AI4Finance Foundation open-source community for its support.
Please find the starter kit here!
Introduction
RLSolver: GPU-based Massively Parallel Environments for Large-Scale Combinatorial Optimization (CO) Problems Using Reinforcement Learning
RLSolver aims to showcase the effectiveness of GPU-based massively parallel environments for solving large-scale combinatorial optimization problems with reinforcement learning (RL). By running on the thousands of CUDA cores and tensor cores of a GPU, these environments sample 2–3 orders of magnitude faster than traditional CPU-based environments, which speeds up convergence and improves solution quality.
RLSolver consists of three key components:
- Environments: GPU-based massively parallel simulation for CO problems.
- Agents: RL solvers such as REINFORCE and DQN.
- Problems: Graph Max-Cut, Ising Model, and more.
We design two competition tasks to promote GPU-powered RL optimization:
- Graph Max-Cut with Parallel RL Agents
- Ising Model Ground-State Estimation via MCMC-based RL
We welcome researchers, students, and practitioners from optimization, RL, or GPU computing communities to participate!
Tasks
Each team can choose to participate in one or both tasks. Awards and recognitions will be given for each task.
Task I: Graph Max-Cut
Develop reinforcement learning agents to solve Max-Cut problems on large graphs. Agents must be trained in a distribution-wise fashion across families of graphs, utilizing GPU-based environments for sampling.
Dataset
Synthetic graphs generated from the following distributions:
- BA (Barabási–Albert)
- ER (Erdős–Rényi)
- PL (Power-Law)
Each graph file has the following format:
n m # number of nodes and edges
u v w # edge from node u to v with weight w
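A minimal sketch of a reader for this format, in plain Python. The function name `parse_graph` and the sample data are hypothetical, not part of the starter kit. The sketch assumes whitespace-separated fields and ignores anything after a `#`. It keeps node indices exactly as written, since the format above does not say whether they start at 0 or 1.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]

def parse_graph(text: str) -> Tuple[int, List[Edge]]:
    """Parse an 'n m' header followed by m 'u v w' edge lines."""
    rows = []
    for line in text.splitlines():
        # Drop trailing '#' comments and skip blank lines (an assumption).
        line = line.split("#", 1)[0].strip()
        if line:
            rows.append(line.split())
    n, m = int(rows[0][0]), int(rows[0][1])
    edges = [(int(u), int(v), float(w)) for u, v, w in rows[1:]]
    if len(edges) != m:
        raise ValueError(f"header declares {m} edges, found {len(edges)}")
    return n, edges

# Hypothetical three-node path graph with unit weights.
sample = """3 2
1 2 1
2 3 1
"""
n, edges = parse_graph(sample)
print(n, edges)
```

Checking the edge count against the header catches truncated files early, before they reach a training run.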
Goal
Maximize the cut value using RL agents with batched training environments.
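For reference, the cut value of a partition is the total weight of the edges whose endpoints fall on different sides. The sketch below scores a small batch of candidate partitions in plain Python. The names `cut_value` and `batch_cut_values` are hypothetical, and nodes are assumed to be indexed from 0. A GPU environment would typically do the same computation with tensor operations over the whole batch at once.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float]

def cut_value(edges: Sequence[Edge], side: Sequence[int]) -> float:
    """Total weight of edges whose endpoints lie on different sides."""
    return sum((w for u, v, w in edges if side[u] != side[v]), 0.0)

def batch_cut_values(edges: Sequence[Edge],
                     batch: Sequence[Sequence[int]]) -> List[float]:
    """Score every partition in the batch (one side label per node)."""
    return [cut_value(edges, side) for side in batch]

# Triangle with unit weights; any partition can cut at most two of its edges.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
batch = [[0, 0, 0], [0, 1, 0], [1, 1, 0]]
values = batch_cut_values(edges, batch)
print(values)  # [0.0, 2.0, 2.0]
```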
Task II: Ising Model
Train RL agents to find low-energy states of 2D Ising models using GPU-accelerated MCMC or spin-flip environments. The environment simulates spin lattices with interaction energy, and agents learn policies to flip spins efficiently.
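As a point of comparison for learned policies, the sketch below shows a classic single-spin-flip Metropolis sweep in plain Python. It is not the provided simulator. It assumes a uniform coupling `J`, periodic boundaries, no external field, and a fixed temperature `T`, and the function names are hypothetical. An RL agent would replace the random site selection and the acceptance rule with a learned policy.

```python
import math
import random

def energy(spins, J=1.0):
    """E = -J * sum of s_i * s_j over nearest-neighbor bonds (periodic)."""
    L = len(spins)
    total = 0.0
    for i in range(L):
        for j in range(L):
            # Count each bond once: right and down neighbors only (needs L >= 3).
            total += spins[i][j] * (spins[i][(j + 1) % L] + spins[(i + 1) % L][j])
    return -J * total

def metropolis_sweep(spins, T, rng, J=1.0):
    """Attempt L*L single-spin flips in place, using the Metropolis rule."""
    L = len(spins)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        nb = (spins[(i - 1) % L][j] + spins[(i + 1) % L][j]
              + spins[i][(j - 1) % L] + spins[i][(j + 1) % L])
        dE = 2.0 * J * spins[i][j] * nb  # energy change if this spin flips
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            spins[i][j] = -spins[i][j]

rng = random.Random(0)
L = 8
spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
e0 = energy(spins)
for _ in range(50):
    metropolis_sweep(spins, T=1.0, rng=rng)
e1 = energy(spins)
print(e0, e1)
```

Computing `dE` from the four neighbors avoids recomputing the full lattice energy on every proposal. That local update is also what makes the step cheap to run across many lattices in parallel.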
Dataset
Generated 2D Ising model grids of various sizes (e.g., 16×16, 32×32). Participants may use or extend the provided PyTorch/CUDA simulator for custom training.
Goal
Minimize the total system energy across random initial configurations.
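The task description does not spell out the energy function. Assuming the standard zero-field, nearest-neighbor form with couplings $J_{ij}$ (an assumption, not confirmed by the task statement), the quantity to minimize for a spin configuration $s$ would be:

```latex
H(s) = -\sum_{\langle i,j \rangle} J_{ij}\, s_i s_j,
\qquad s_i \in \{-1, +1\},
\qquad s^{*} = \arg\min_{s} H(s)
```

Here $\langle i,j \rangle$ ranges over pairs of neighboring lattice sites, each counted once.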
Contact
Contact email: rlsolvercontest@outlook.com
Contestants can ask questions on:
- Discord
- WeChat group