IEEE, Columbia, Idea

Media Partners

Wilmott, Paris Machine Learning

Thanks to the AI4Finance Foundation open-source community for its support.

Please find the starter kit here!

Introduction

RLSolver: GPU-based Massively Parallel Environments for Large-Scale Combinatorial Optimization (CO) Problems Using Reinforcement Learning

RLSolver aims to showcase the effectiveness of GPU-based massively parallel environments for solving large-scale combinatorial optimization problems with reinforcement learning (RL). By exploiting thousands of CUDA cores and tensor cores, these environments improve sampling speed by 2–3 orders of magnitude over traditional CPU-based environments, which in turn significantly improves convergence speed and solution quality.

RLSolver consists of three key components:

  • Environments: GPU-based massively parallel simulation for CO problems.
  • Agents: RL solvers such as REINFORCE and DQN.
  • Problems: Graph Max-Cut, Ising Model, and more.

We design two competition tasks to promote GPU-powered RL optimization:

  1. Graph Max-Cut with Parallel RL Agents
  2. Ising Model Ground-State Estimation via MCMC-based RL

We welcome researchers, students, and practitioners from the optimization, RL, and GPU computing communities to participate!

Tasks

Each team can choose to participate in one or both tasks. Awards and recognitions will be given for each task.

Task I: Graph Max-Cut

Develop reinforcement learning agents to solve Max-Cut problems on large graphs. Agents must be trained distribution-wise, that is, on families of graphs drawn from a common distribution rather than on a single instance, and must use GPU-based environments for sampling.

Dataset

Synthetic graphs generated from the following distributions:

  • BA (Barabási–Albert)
  • ER (Erdős–Rényi)
  • PL (Power-Law)

Each graph file has the following format, with one header line followed by one line per edge:

n m           # number of nodes and edges  
u v w         # edge from node u to v with weight w  
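
The format above can be read with a few lines of Python. The following is a minimal sketch, not the starter kit's loader: the function name `parse_graph` and the `one_indexed` flag are hypothetical, and the default of 1-based node ids is an assumption that should be checked against the actual data files.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight) with 0-based node ids


def parse_graph(text: str, one_indexed: bool = True) -> Tuple[int, List[Edge]]:
    """Parse 'n m' followed by m lines of 'u v w' into (n, edges)."""
    # Drop blank lines and anything after '#', so annotated files also parse.
    rows = [line.split("#", 1)[0].split() for line in text.splitlines()]
    rows = [row for row in rows if row]
    n, m = int(rows[0][0]), int(rows[0][1])
    offset = 1 if one_indexed else 0  # assumption: node ids start at 1
    edges: List[Edge] = []
    for u, v, w in rows[1:]:
        edges.append((int(u) - offset, int(v) - offset, float(w)))
    if len(edges) != m:
        raise ValueError(f"header declares {m} edges, found {len(edges)}")
    for u, v, _ in edges:
        if not (0 <= u < n and 0 <= v < n):
            raise ValueError(f"edge ({u}, {v}) is outside the node range 0..{n - 1}")
    return n, edges


sample = """3 3
1 2 1
2 3 1
1 3 2
"""
n, edges = parse_graph(sample)
print(n, edges)  # 3 [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
```

If the files turn out to use 0-based ids, passing `one_indexed=False` keeps the ids unchanged.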

Goal

Maximize the cut value using RL agents with batched training environments.
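
For reference, the cut value of a partition is the total weight of the edges whose endpoints fall on different sides. The sketch below evaluates it for a batch of candidate partitions and also computes the gain from moving one node, which could serve as a per-step reward. All function names are hypothetical, and the plain Python loop only stands in for what a GPU environment would compute with batched tensor operations.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight) with 0-based node ids


def cut_value(edges: Sequence[Edge], side: Sequence[int]) -> float:
    """Total weight of edges whose endpoints lie on different sides (0 or 1)."""
    return sum((w for u, v, w in edges if side[u] != side[v]), 0.0)


def flip_gain(edges: Sequence[Edge], side: Sequence[int], node: int) -> float:
    """Change in cut value if `node` switched sides."""
    gain = 0.0
    for u, v, w in edges:
        if node in (u, v) and u != v:
            # An uncut edge at `node` becomes cut (+w); a cut edge becomes uncut (-w).
            gain += w if side[u] == side[v] else -w
    return gain


def batch_cut_values(edges: Sequence[Edge], batch: Sequence[Sequence[int]]) -> List[float]:
    """Evaluate many candidate partitions of the same graph."""
    return [cut_value(edges, side) for side in batch]


triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
batch = [[0, 0, 0], [0, 1, 1], [0, 1, 0]]
values = batch_cut_values(triangle, batch)
print(values)  # [0.0, 3.0, 2.0]
```

Here moving node 2 in the partition `[0, 1, 0]` has a gain of 1.0, which matches the difference between the last two batch entries.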


Task II: Ising Model

Train RL agents to find low-energy states of 2D Ising models using GPU-accelerated MCMC or spin-flip environments. The environment simulates spin lattices with interaction energy, and agents learn policies to flip spins efficiently.
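
To make the spin-flip setting concrete, here is a minimal CPU sketch of a single-spin-flip Metropolis update. It is not the provided simulator. It assumes a square lattice with uniform nearest-neighbour coupling J = 1, periodic boundaries, and no external field, none of which is specified above, and all function names are hypothetical.

```python
import math
import random
from typing import List

Lattice = List[List[int]]  # square grid with entries +1 or -1


def energy(spins: Lattice, coupling: float = 1.0) -> float:
    """E = -J * sum of s_i * s_j over nearest-neighbour pairs (periodic, assumed)."""
    size = len(spins)
    total = 0.0
    for i in range(size):
        for j in range(size):
            # Count each bond once via the right and down neighbours.
            right = spins[i][(j + 1) % size]
            down = spins[(i + 1) % size][j]
            total -= coupling * spins[i][j] * (right + down)
    return total


def flip_delta(spins: Lattice, i: int, j: int, coupling: float = 1.0) -> float:
    """Energy change that flipping spin (i, j) would cause."""
    size = len(spins)
    neighbours = (spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
                  + spins[i][(j - 1) % size] + spins[i][(j + 1) % size])
    return 2.0 * coupling * spins[i][j] * neighbours


def metropolis_step(spins: Lattice, temperature: float, rng: random.Random,
                    coupling: float = 1.0) -> float:
    """Propose one random spin flip; return the energy change actually applied."""
    size = len(spins)
    i, j = rng.randrange(size), rng.randrange(size)
    delta = flip_delta(spins, i, j, coupling)
    # Always accept downhill moves; accept uphill moves with probability exp(-delta / T).
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        spins[i][j] = -spins[i][j]
        return delta
    return 0.0


rng = random.Random(0)
spins = [[rng.choice((-1, 1)) for _ in range(16)] for _ in range(16)]
start = energy(spins)
running = start
for _ in range(20000):
    running += metropolis_step(spins, temperature=1.0, rng=rng)
print(start, running)
```

Tracking the energy incrementally through `flip_delta` avoids recomputing the full lattice sum after every flip. A learned policy would replace the uniformly random choice of `(i, j)` with its own proposal.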

Dataset

Generated 2D Ising model grids of various sizes (e.g., 16×16, 32×32). Participants may use or extend the provided PyTorch/CUDA simulator for custom training.

Goal

Minimize the total system energy across random initial configurations.
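
One way to measure progress toward this goal locally is to average the final energy per spin over a fixed, seeded set of random initial configurations. The sketch below is a hypothetical harness, not the official scoring procedure. It assumes the same lattice conventions as a standard ferromagnetic model (J = 1, periodic boundaries, no field) and uses a greedy descent as a stand-in for a trained agent.

```python
import random
from typing import Callable, List

Lattice = List[List[int]]  # square grid with entries +1 or -1


def energy(spins: Lattice) -> float:
    """Nearest-neighbour energy with J = 1 and periodic boundaries (assumed)."""
    size = len(spins)
    return float(-sum(
        spins[i][j] * (spins[i][(j + 1) % size] + spins[(i + 1) % size][j])
        for i in range(size) for j in range(size)
    ))


def greedy_descent(spins: Lattice) -> Lattice:
    """Placeholder policy: flip any spin that lowers the energy until none does."""
    size = len(spins)
    improved = True
    while improved:
        improved = False
        for i in range(size):
            for j in range(size):
                neighbours = (spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
                              + spins[i][(j - 1) % size] + spins[i][(j + 1) % size])
                if spins[i][j] * neighbours < 0:  # flipping strictly lowers the energy
                    spins[i][j] = -spins[i][j]
                    improved = True
    return spins


def mean_energy_per_spin(policy: Callable[[Lattice], Lattice], size: int,
                         trials: int, seed: int = 0) -> float:
    """Average final energy per spin over `trials` seeded random initial lattices."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        spins = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
        total += energy(policy(spins))
    return total / (trials * size * size)


baseline = mean_energy_per_spin(lambda s: s, size=16, trials=8)
descended = mean_energy_per_spin(greedy_descent, size=16, trials=8)
print(baseline, descended)
```

Because both calls use the same seed, the two numbers are computed on identical initial configurations, so the gap reflects the policy alone. Under these assumptions the ground state has an energy of -2 per spin, which gives a lower bound for comparison.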


Contact

Contact email: rlsolvercontest@outlook.com

Contestants can communicate any questions on