Width Pruning in Llama-3: Enhancing Instruction Following by Reducing Factual Knowledge
Published: Dec 27, 2025 18:09 • 1 min read • ArXiv
Analysis
This paper challenges the common assumption that pruning uniformly degrades model capability. It demonstrates that width pruning, guided by the Maximum Absolute Weight (MAW) criterion, selectively improves instruction-following while degrading performance on tasks that require factual knowledge. The implication is that pruning can be used deliberately to trade stored knowledge for improved alignment and truthfulness, offering a novel lever for model optimization and alignment.
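To make the mechanism concrete, below is a minimal sketch of MAW-guided width pruning applied to a Llama-style gated MLP. The projection layout (gate/up rows, down columns) follows the Hugging Face Llama implementation; the scoring rule here, taking the maximum absolute weight attached to each intermediate neuron, is an assumption for illustration, not a reproduction of the paper's exact procedure.

```python
# Sketch of MAW-guided width pruning for a Llama-style gated MLP.
# Assumes gate/up project hidden -> intermediate and down projects
# intermediate -> hidden, as in the Hugging Face Llama implementation.
import torch
import torch.nn as nn


def maw_scores(gate: nn.Linear, up: nn.Linear, down: nn.Linear) -> torch.Tensor:
    """Score each intermediate neuron by the maximum absolute weight
    among all weights connected to it (rows of gate/up, columns of down)."""
    s_gate = gate.weight.abs().max(dim=1).values   # [intermediate]
    s_up = up.weight.abs().max(dim=1).values       # [intermediate]
    s_down = down.weight.abs().max(dim=0).values   # [intermediate]
    return torch.stack([s_gate, s_up, s_down]).max(dim=0).values


def prune_mlp_width(gate, up, down, keep_ratio: float = 0.75):
    """Keep the top-`keep_ratio` fraction of intermediate neurons by MAW score."""
    scores = maw_scores(gate, up, down)
    k = int(keep_ratio * scores.numel())
    keep = torch.topk(scores, k).indices.sort().values
    hidden = gate.in_features
    new_gate = nn.Linear(hidden, k, bias=False)
    new_up = nn.Linear(hidden, k, bias=False)
    new_down = nn.Linear(k, hidden, bias=False)
    with torch.no_grad():
        new_gate.weight.copy_(gate.weight[keep])       # keep selected rows
        new_up.weight.copy_(up.weight[keep])           # keep selected rows
        new_down.weight.copy_(down.weight[:, keep])    # keep matching columns
    return new_gate, new_up, new_down


if __name__ == "__main__":
    hidden, inter = 2048, 8192  # Llama-3.2-1B-like sizes, for illustration
    g = nn.Linear(hidden, inter, bias=False)
    u = nn.Linear(hidden, inter, bias=False)
    d = nn.Linear(inter, hidden, bias=False)
    g2, u2, d2 = prune_mlp_width(g, u, d, keep_ratio=0.75)
    print(g2.weight.shape)  # torch.Size([6144, 2048])
```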
Key Takeaways
- Width pruning guided by MAW reveals a dichotomy: factual knowledge degrades while instruction-following improves.
- The MLP expansion ratio is a critical architectural parameter that modulates cognitive capabilities (see the sketch after this list).
- An inverse correlation between factual knowledge and truthfulness is observed.
- Pruned configurations offer energy-efficiency gains but may affect latency in single-request scenarios.
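As a quick illustration of the expansion-ratio point above: the ratio is simply the MLP's intermediate size divided by the model's hidden size. The Llama-3.2-1B sizes below match the public model config; the pruned figure is a hypothetical example assuming 25% of intermediate neurons are removed.

```python
# Expansion ratio of a transformer MLP: intermediate_size / hidden_size.
# Width pruning shrinks intermediate_size, and therefore the ratio.
def expansion_ratio(intermediate_size: int, hidden_size: int) -> float:
    return intermediate_size / hidden_size


print(expansion_ratio(8192, 2048))              # dense Llama-3.2-1B: 4.0
print(expansion_ratio(int(8192 * 0.75), 2048))  # hypothetical 25% width prune: 3.0
```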
Reference
“Instruction-following capabilities improve substantially (+46% to +75% in IFEval for Llama-3.2-1B and 3B models).”