OptRot: Data-Free Rotations Improve LLM Quantization
Analysis
Key Takeaways
- OptRot is a data-free method for mitigating weight outliers in LLMs.
- OptRot improves weight quantization performance, outperforming existing methods.
- OptRot+ incorporates activation covariance for further performance gains.
- The paper highlights trade-offs between weight and activation quantization in different settings (W4A4 vs. W4A8).
“OptRot outperforms both Hadamard rotations and more expensive, data-dependent methods like SpinQuant and OSTQuant for weight quantization.”
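
To make the role of rotations concrete, the following is a minimal sketch of the Hadamard-rotation baseline named in the quote, not of OptRot itself (the summary does not describe its objective or optimization). It assumes a toy weight matrix with one artificially inflated column, a simple per-row round-to-nearest 4-bit quantizer, and hypothetical helper names (`hadamard`, `quantize_rtn`). It illustrates two generic facts: an orthogonal rotation leaves the layer output unchanged when the input is rotated to match, and spreading an outlier across many coordinates can shrink the quantization error.

```python
import numpy as np


def hadamard(n: int) -> np.ndarray:
    """Orthonormal n x n Hadamard matrix via the Sylvester construction (n must be a power of 2)."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of 2"
    H = np.ones((1, 1))
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)


def quantize_rtn(W: np.ndarray, bits: int = 4) -> np.ndarray:
    """Symmetric per-row round-to-nearest quantization, returned in dequantized form."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max(axis=1, keepdims=True) / qmax
    return np.clip(np.round(W / scale), -qmax - 1, qmax) * scale


# Toy data (an assumption for illustration, not from the paper).
rng = np.random.default_rng(0)
d_out, d_in = 64, 256
W = rng.normal(size=(d_out, d_in))
W[:, 3] *= 50.0  # inject an outlier column
x = rng.normal(size=d_in)

# Rotate the weights, and counter-rotate the input so the product is unchanged.
R = hadamard(d_in)
W_rot = W @ R
x_rot = R.T @ x
outputs_match = np.allclose(W @ x, W_rot @ x_rot)

# Frobenius-norm quantization error with and without the rotation.
err_plain = np.linalg.norm(W - quantize_rtn(W))
err_rot = np.linalg.norm(W_rot - quantize_rtn(W_rot))

print("outputs match:", outputs_match)
print("largest |weight| before / after rotation:", np.abs(W).max(), np.abs(W_rot).max())
print("quantization error before / after rotation:", err_plain, err_rot)
```

Because `R` is orthogonal, the error measured on `W_rot` equals the error on the original weights after rotating back, so the two numbers are directly comparable. A learned rotation such as the one the paper proposes would replace the fixed `R` here.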