Reasoning-Answer Misalignment in Multilingual LLMs
Analysis
Key Takeaways
- LLMs can achieve high accuracy while exhibiting flawed reasoning.
- Reasoning-answer misalignment is more prevalent in non-Latin scripts.
- Evidential errors and illogical reasoning steps are primary causes of failure.
- Current multilingual evaluation practices are insufficient for assessing reasoning.
“Reasoning traces in non-Latin scripts show at least twice as much misalignment between their reasoning and conclusions than those in Latin scripts.”
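To make the distinction between accuracy and misalignment concrete, here is a minimal sketch of how the two rates could be tallied per script group. It assumes each reasoning trace has already been labeled (by a human or a judge model) for whether its final answer is correct and whether its reasoning actually supports its conclusion. The `Trace` class, its field names, the `rates_by_script` helper, and the toy records are all hypothetical illustrations, not the study's data or method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    # Hypothetical annotation schema, not taken from the source.
    script: str                      # e.g. "latin" or "non_latin"
    answer_correct: bool             # final answer matches the gold label
    reasoning_supports_answer: bool  # reasoning is consistent with the conclusion


def rates_by_script(traces):
    """Return accuracy and misalignment rate for each script group."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "misaligned": 0})
    for t in traces:
        c = counts[t.script]
        c["n"] += 1
        c["correct"] += int(t.answer_correct)
        c["misaligned"] += int(not t.reasoning_supports_answer)
    return {
        script: {
            "accuracy": c["correct"] / c["n"],
            "misalignment": c["misaligned"] / c["n"],
        }
        for script, c in counts.items()
    }


# Toy records invented purely for illustration.
toy_traces = [
    Trace("latin", True, True),
    Trace("latin", True, True),
    Trace("latin", True, True),
    Trace("latin", True, False),
    Trace("non_latin", True, True),
    Trace("non_latin", True, False),
    Trace("non_latin", True, True),
    Trace("non_latin", False, False),
]

rates = rates_by_script(toy_traces)
for script, r in rates.items():
    print(script, r)
```

Reporting the two rates side by side is the point of the sketch: a group can score well on accuracy while a sizable share of its traces reach the answer through reasoning that does not support it, which an accuracy-only evaluation would not reveal.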