PENDULUM: New Benchmark to Evaluate Flattery Bias in Multimodal LLMs

Ethics#LLM🔬 Research|Analyzed: Jan 10, 2026 08:38
Published: Dec 22, 2025 12:49
1 min read
ArXiv

Analysis

The PENDULUM benchmark represents an important step in assessing a critical ethical issue in multimodal LLMs. Specifically, it focuses on the tendency of LLMs to exhibit sycophancy, which can undermine the reliability of these models.
Reference / Citation
View Original
"PENDULUM is a benchmark for assessing sycophancy in Multimodal Large Language Models."
A
ArXivDec 22, 2025 12:49
* Cited for critical analysis under Article 32.