Hallucination-Resistant Decoding for LVLMs
Published: Dec 29, 2025 13:23 · 1 min read · ArXiv
Analysis
This paper addresses a critical problem in Large Vision-Language Models (LVLMs): hallucination, where the generated text describes objects or attributes that are not grounded in the input image. It proposes CoFi-Dec, a training-free decoding framework that combines generative self-feedback with coarse-to-fine visual conditioning to mitigate the issue. The approach is model-agnostic and reports significant gains on hallucination-focused benchmarks, making it a valuable contribution to the field. The use of a Wasserstein-based fusion mechanism for aligning predictions is particularly interesting.
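The post gives no implementation details, but one plausible reading of a "Wasserstein-based fusion" is a distance-weighted mixture of the next-token distributions produced under coarse and fine visual conditioning. The sketch below is purely illustrative: the function names (`wasserstein_1d`, `fuse_predictions`), the exponential weighting, and the treatment of the vocabulary as an ordered 1-D support are assumptions, not the authors' formulation.

```python
import numpy as np

def wasserstein_1d(p, q):
    """W1 distance between two categorical distributions on the same
    ordered support: the L1 difference of their CDFs."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())

def fuse_predictions(p_coarse, p_fine, temperature=1.0):
    """Hypothetical fusion rule: trust the fine-conditioned prediction more
    when it stays close (in Wasserstein distance) to the coarse-conditioned
    one, otherwise fall back toward the coarse prediction."""
    d = wasserstein_1d(p_coarse, p_fine)
    alpha = np.exp(-d / temperature)          # alpha -> 1 when the two agree
    fused = alpha * p_fine + (1.0 - alpha) * p_coarse
    return fused / fused.sum()

# Toy example over a 5-token vocabulary.
p_coarse = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
p_fine   = np.array([0.55, 0.20, 0.15, 0.05, 0.05])
print(fuse_predictions(p_coarse, p_fine))
```

Treating token indices as a 1-D metric space is a simplification made only for the example; token ids carry no distance semantics, and the paper presumably defines the transport cost over a more meaningful space.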
Key Takeaways
- Proposes CoFi-Dec, a training-free decoding framework that reduces hallucinations in LVLMs.
- Employs coarse-to-fine visual conditioning and generative self-feedback (sketched after this list).
- Uses a Wasserstein-based fusion mechanism to align predictions.
- Demonstrates improved performance on hallucination-focused benchmarks.
- Model-agnostic, so it can be applied to a wide range of LVLMs.
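Read together, the first two takeaways suggest a decoding loop in which each step draws predictions under progressively finer visual conditioning and lets the model's own feedback gate the result. The outline below is only a guess at how those pieces might compose; `model.next_token_dist`, `self_feedback`, and the accept/fallback rule are hypothetical names and choices, not taken from the paper.

```python
def cofi_dec_step(model, image_views, context, fuse, self_feedback):
    """One hypothetical decoding step: predict next-token distributions under
    coarse-to-fine image views, fuse them progressively, and let a generative
    self-feedback score decide whether to keep the fused prediction."""
    # image_views is assumed to be ordered from coarsest to finest.
    fused = model.next_token_dist(image_views[0], context)
    for view in image_views[1:]:
        p_fine = model.next_token_dist(view, context)
        fused = fuse(fused, p_fine)   # e.g. the Wasserstein-based fusion sketched above
    token = int(fused.argmax())
    # Self-feedback (assumed rule): re-score the candidate continuation with the
    # model itself and fall back to the coarse prediction if the score is low.
    if self_feedback(model, context + [token]) < 0.5:
        token = int(model.next_token_dist(image_views[0], context).argmax())
    return token
```

Because the framework is described as training-free and model-agnostic, a loop of this shape would wrap any LVLM's existing next-token interface without touching its weights.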
Reference
“CoFi-Dec substantially reduces both entity-level and semantic-level hallucinations, outperforming existing decoding strategies.”