Unsupervised Discovery of Reasoning Behaviors in LLMs

Paper | #llm | 🔬 Research | Analyzed: Jan 3, 2026 18:22
Published: Dec 30, 2025 05:09
ArXiv

Analysis

This paper introduces RISE, an unsupervised method for analyzing and controlling reasoning behaviors in large language models (LLMs). Rather than relying on human-defined concepts, it uses sparse autoencoders (SAEs) to discover interpretable reasoning vectors in the model's activation space. Intervening on these vectors amplifies or suppresses specific reasoning behaviors, such as reflection and confidence, without retraining the model. The approach matters because it offers a way to understand and influence the internal reasoning processes of LLMs, which could lead to more controllable and reliable AI systems.
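To make the idea of intervening on an SAE-derived vector concrete, here is a minimal NumPy sketch of generic activation steering. It is not the paper's implementation. The weights are random rather than trained, and the names `sae_encode`, `steer`, `feature_id`, and `alpha`, along with the layer sizes, are hypothetical choices for illustration. It assumes that a feature's decoder row serves as its direction in activation space, and that adding a scaled copy of that direction to a hidden state amplifies (positive `alpha`) or suppresses (negative `alpha`) the associated behavior.

```python
import numpy as np


def sae_encode(h, W_enc, b_enc):
    """ReLU encoder: map a hidden state to non-negative feature activations."""
    return np.maximum(0.0, h @ W_enc + b_enc)


def steer(h, direction, alpha):
    """Shift hidden state h by alpha along the unit-normalized direction."""
    unit = direction / np.linalg.norm(direction)
    return h + alpha * unit


# Hypothetical sizes and random weights; a real SAE would be trained on
# activations collected from the language model.
rng = np.random.default_rng(0)
d_model, n_features = 16, 64
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)

h = rng.normal(size=d_model)             # stand-in for one token's hidden state
features = sae_encode(h, W_enc, b_enc)   # feature activations for that state

feature_id = 3                           # hypothetical index of a "reflection" feature
direction = W_dec[feature_id]            # assumed: decoder row = feature direction
unit = direction / np.linalg.norm(direction)

amplified = steer(h, direction, alpha=4.0)    # push toward the behavior
suppressed = steer(h, direction, alpha=-4.0)  # push away from it

# The projection onto the feature direction moves by exactly alpha.
shift_up = (amplified - h) @ unit
shift_down = (suppressed - h) @ unit
print(shift_up, shift_down)
```

In a real model the shift would be applied to a chosen layer's activations during the forward pass, so that subsequent tokens are generated from the steered state. No weights change, which is why no retraining is needed.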
Reference / Citation
"Targeted interventions on SAE-derived vectors can controllably amplify or suppress specific reasoning behaviors, altering inference trajectories without retraining."
ArXiv, Dec 30, 2025 05:09
* Cited for critical analysis under Article 32.