Improving CDVQA with Decision-Ambiguity-guided Reinforcement Fine-Tuning

Analysis

This paper addresses decision ambiguity in Change Detection Visual Question Answering (CDVQA): cases where a model struggles to separate the correct answer from strong distractors. The authors propose DARFT, a reinforcement fine-tuning framework that targets this problem by focusing on Decision-Ambiguous Samples (DAS). The contribution is valuable because it targets a specific failure mode rather than only raising overall accuracy, which could lead to more robust and reliable CDVQA models, especially in few-shot settings.
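To make the idea of a decision-ambiguous sample concrete, here is a minimal sketch of one plausible way to flag such samples: treat a prediction as ambiguous when the probability gap between the top two candidate answers is small. This is an illustrative assumption, not the paper's definition of DAS. The function names, the margin criterion, and the 0.1 threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw answer scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top2_margin(logits):
    """Gap between the most likely answer and the strongest distractor."""
    if len(logits) < 2:
        raise ValueError("need at least two candidate answers")
    probs = sorted(softmax(logits), reverse=True)
    return probs[0] - probs[1]


def is_decision_ambiguous(logits, margin_threshold=0.1):
    """Hypothetical criterion: a small top-2 margin marks the sample as ambiguous.

    The threshold is an assumed value chosen for illustration only.
    """
    return top2_margin(logits) < margin_threshold
```

Under this assumed criterion, scores such as `[2.0, 1.9, -1.0]` would be flagged, because the top two answers are nearly tied, while `[5.0, 1.0, 0.0]` would not. A fine-tuning loop could then concentrate its updates on the flagged samples, which is the general spirit the summary attributes to DARFT.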
Reference / Citation
"DARFT suppresses strong distractors and sharpens decision boundaries without additional supervision."
ArXiv, Dec 31, 2025 03:28
* Cited for critical analysis under Article 32.