RollArc: Accelerating Agentic RL Training with Disaggregated Infrastructure

Published: Dec 27, 2025 11:14
1 min read
arXiv

Analysis

This paper addresses the challenge of efficiently training agentic Reinforcement Learning (RL) models, whose workloads are computationally demanding and heterogeneous. It proposes RollArc, a distributed system designed to maximize throughput on disaggregated infrastructure. The core contribution lies in three design principles: hardware-affinity workload mapping, fine-grained asynchrony, and statefulness-aware computation. The paper's significance is in providing a practical recipe for scaling agentic RL training, which is crucial for enabling LLMs to perform autonomous decision-making. The results demonstrate substantial end-to-end training time reduction and strong scalability, validated by training a large Mixture-of-Experts (MoE) model on a large GPU cluster.
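The principles are easiest to see in the decoupling between rollout and training. Below is a minimal, hypothetical Python sketch (not RollArc's actual API; all names here are invented for illustration) of the fine-grained-asynchrony idea: rollout workers, which would map to inference-affine hardware, stream trajectories into a shared buffer, and the trainer consumes them as they arrive rather than waiting at a global synchronous barrier.

```python
import queue
import threading
import time

# Hypothetical illustration only: rollout workers produce trajectories
# asynchronously; the trainer drains the buffer in micro-batches.
trajectory_buffer: "queue.Queue[dict]" = queue.Queue(maxsize=64)

def rollout_worker(worker_id: int, num_episodes: int) -> None:
    """Simulates an agentic rollout loop on an inference-affine device."""
    for episode in range(num_episodes):
        time.sleep(0.01)  # stand-in for multi-turn environment interaction
        trajectory_buffer.put({
            "worker": worker_id,
            "episode": episode,
            "reward": 1.0,  # placeholder return
        })

def trainer(total_trajectories: int, micro_batch: int = 4) -> None:
    """Consumes trajectories as they arrive; no synchronous batch barrier."""
    seen, batch = 0, []
    while seen < total_trajectories:
        batch.append(trajectory_buffer.get())
        seen += 1
        if len(batch) == micro_batch:
            # stand-in for a policy-gradient update on training-affine GPUs
            print(f"update on {len(batch)} trajectories ({seen} total)")
            batch.clear()

if __name__ == "__main__":
    workers = [threading.Thread(target=rollout_worker, args=(i, 8)) for i in range(4)]
    train_thread = threading.Thread(target=trainer, args=(4 * 8,))
    for w in workers:
        w.start()
    train_thread.start()
    for w in workers:
        w.join()
    train_thread.join()
```

A queue-based buffer like this is what lets slow, multi-turn agentic rollouts overlap with gradient updates; a production system along RollArc's lines would add off-policy corrections and state-aware scheduling on top of this basic decoupling.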

Reference

RollArc effectively improves training throughput, achieving a 1.35-2.05x end-to-end training time reduction compared to monolithic and synchronous baselines.