Analysis

This paper critically examines the Chain-of-Continuous-Thought (COCONUT) method for large language models (LLMs) and argues that it relies on shortcuts and dataset artifacts rather than genuine reasoning. Using steering and shortcut experiments, the study exposes COCONUT's weaknesses and characterizes it as a mechanism that generates plausible-looking traces that mask its dependence on shortcuts. These findings challenge claims that COCONUT improves efficiency and stability over explicit Chain-of-Thought (CoT) while maintaining performance.
Reference

COCONUT consistently exploits dataset artifacts, inflating benchmark performance without true reasoning.