Context Management for Long-Horizon SWE-Agents
Analysis
This paper addresses context management in long-horizon software engineering tasks performed by LLM-based agents. Its core contribution is CAT, a context management paradigm that proactively compresses historical trajectories into actionable summaries. This matters because it targets context explosion and semantic drift, two major bottlenecks for agent performance in complex, long-running interactions. The proposed CAT-GENERATOR framework and SWE-Compressor model provide a concrete implementation and report improved performance on the SWE-Bench-Verified benchmark.
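To make the idea of compressing history under a bounded budget concrete, here is a minimal Python sketch. It is not the paper's CAT or SWE-Compressor implementation: the `BoundedContext` class, the `truncating_summarizer` placeholder, and the word-count token estimate are all hypothetical names and simplifications chosen for illustration. A real agent would presumably have a model write the summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def truncating_summarizer(steps: List[str]) -> str:
    """Placeholder summarizer (hypothetical): keep the first 3 words of each item.
    A real system would generate a summary with a model instead."""
    return " | ".join(" ".join(s.split()[:3]) for s in steps)


@dataclass
class BoundedContext:
    budget: int                      # target cap on estimated tokens in context
    keep_recent: int = 2             # newest steps that stay verbatim
    summarize: Callable[[List[str]], str] = truncating_summarizer
    summary: str = ""                # compressed view of older steps
    steps: List[str] = field(default_factory=list)

    def size(self) -> int:
        """Estimated tokens held: summary plus verbatim steps."""
        return count_tokens(self.summary) + sum(count_tokens(s) for s in self.steps)

    def add(self, step: str) -> None:
        """Append a step; if over budget, fold older steps into the summary."""
        self.steps.append(step)
        if self.size() > self.budget and len(self.steps) > self.keep_recent:
            cut = len(self.steps) - self.keep_recent
            old, self.steps = self.steps[:cut], self.steps[cut:]
            prior = [self.summary] if self.summary else []
            # The previous summary is re-summarized together with the evicted steps.
            self.summary = self.summarize(prior + old)


if __name__ == "__main__":
    ctx = BoundedContext(budget=30)
    for i in range(6):
        ctx.add(" ".join([f"step{i}"] * 10))  # each step is 10 "tokens"
    print(ctx.summary)
    print(len(ctx.steps), ctx.size())
```

In this sketch, compression is triggered only when the estimate exceeds the budget, and the newest steps are always kept verbatim. It does not guarantee the budget holds if the recent steps alone exceed it; that policy, like everything else here, is an illustrative choice.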
Key Takeaways
- Proposes CAT, a new context management paradigm for long-horizon SWE-agents.
- Introduces CAT-GENERATOR, a trajectory-level supervision framework.
- Reports a 57.6% solved rate on SWE-Bench-Verified with SWE-Compressor, outperforming ReAct-based agents and static compression baselines.
- Addresses context explosion and semantic drift in long-running interactions.
“SWE-Compressor reaches a 57.6% solved rate and significantly outperforms ReAct-based agents and static compression baselines, while maintaining stable and scalable long-horizon reasoning under a bounded context budget.”