SenseNova-MARS: Agentic Reasoning with Tools via RL

Research Paper | Tags: Vision-Language Models, Agentic Reasoning, Reinforcement Learning | Analyzed: Jan 3, 2026 15:38
Published: Dec 30, 2025 16:31
ArXiv

Analysis

This paper introduces SenseNova-MARS, a framework that equips Vision-Language Models (VLMs) with agentic reasoning and tool use, specifically by integrating search and image manipulation tools. Its key contributions are the use of reinforcement learning (RL) for training and the introduction of the HR-MMSearch benchmark. The paper reports state-of-the-art performance on search and fine-grained image understanding benchmarks, with the 8B model surpassing proprietary models such as Gemini-3-Flash and GPT-5 on the search-oriented benchmarks. The release of code, models, and datasets supports reproducibility and further research in this area.
Reference / Citation
"SenseNova-MARS achieves state-of-the-art performance on open-source search and fine-grained image understanding benchmarks. Specifically, on search-oriented benchmarks, SenseNova-MARS-8B scores 67.84 on MMSearch and 41.64 on HR-MMSearch, surpassing proprietary models such as Gemini-3-Flash and GPT-5."
ArXiv, Dec 30, 2025 16:31
* Cited for critical analysis under Article 32.