Width Pruning in Llama-3: Enhancing Instruction Following by Reducing Factual Knowledge
Analysis
Key Takeaways
- Width pruning, guided by the Maximum Absolute Weight (MAW) criterion, reveals a dichotomy: factual knowledge degrades while instruction-following improves.
- The expansion ratio (MLP intermediate width relative to hidden width) is a critical architectural parameter that modulates which capabilities the model retains.
- An inverse correlation is observed between factual knowledge and truthfulness: as pruning removes factual knowledge, truthfulness scores rise.
- Pruned configurations offer energy efficiency gains but may increase latency in single-request scenarios.
“Instruction-following capabilities improve substantially (+46% to +75% in IFEval for Llama-3.2-1B and 3B models).”
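
To make the terms in the takeaways concrete, here is a minimal NumPy sketch of width pruning on a single gated MLP block. It is an illustration under stated assumptions, not the authors' implementation: the layer names (`gate_proj`, `up_proj`, `down_proj`), the toy dimensions, and in particular the scoring rule (largest absolute weight in a neuron's gate row plus largest absolute weight in its up row) are assumptions chosen for illustration. The exact MAW formula used in the paper may differ.

```python
import numpy as np

def expansion_ratio(d_ff: int, d_model: int) -> float:
    """Intermediate (MLP) width divided by hidden width."""
    return d_ff / d_model

def maw_scores(gate_proj: np.ndarray, up_proj: np.ndarray) -> np.ndarray:
    """Per-neuron importance score.

    ASSUMPTION: a neuron is scored by the largest absolute weight in its
    gate row plus the largest absolute weight in its up row. This is one
    plausible reading of a "maximum absolute weight" criterion.
    """
    return np.abs(gate_proj).max(axis=1) + np.abs(up_proj).max(axis=1)

def prune_glu_mlp(gate_proj, up_proj, down_proj, prune_fraction):
    """Drop the lowest-scoring intermediate neurons from a gated MLP.

    Shapes: gate_proj and up_proj are (d_ff, d_model);
    down_proj is (d_model, d_ff).
    """
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    d_ff = gate_proj.shape[0]
    n_keep = d_ff - int(round(d_ff * prune_fraction))
    if n_keep < 1:
        raise ValueError("prune_fraction leaves no neurons")
    scores = maw_scores(gate_proj, up_proj)
    # Keep the n_keep highest-scoring neurons, preserving original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # A neuron is a row of gate/up and the matching column of down.
    return gate_proj[keep], up_proj[keep], down_proj[:, keep]

# Toy dimensions (hypothetical, far smaller than any Llama model).
rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
gate = rng.normal(size=(d_ff, d_model))
up = rng.normal(size=(d_ff, d_model))
down = rng.normal(size=(d_model, d_ff))

gate_p, up_p, down_p = prune_glu_mlp(gate, up, down, prune_fraction=0.25)
print(expansion_ratio(d_ff, d_model))             # ratio before pruning
print(expansion_ratio(gate_p.shape[0], d_model))  # ratio after pruning
```

Because the hidden width is left untouched, removing intermediate neurons lowers the expansion ratio directly, which is why the takeaways treat the ratio as the knob being turned.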