Multi-Envelope DBF for LLM Quantization
Analysis
This paper addresses the limitations of Double Binary Factorization (DBF) for extreme low-bit quantization of Large Language Models (LLMs). While efficient, DBF saturates in accuracy because its scaling parameters are too restrictive to capture the spread of weight magnitudes. The proposed Multi-envelope DBF (MDBF) relaxes this constraint by generalizing the scaling term to a rank-$l$ envelope, increasing magnitude expressiveness while retaining a binary carrier and the same deployment-friendly inference primitive. The paper reports improved perplexity and zero-shot accuracy on LLaMA and Qwen models at matched bits per weight.
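To make the envelope idea concrete, here is a minimal sketch of the kind of factorization the summary describes: the weight matrix is approximated as a rank-$l$ nonnegative envelope modulating an elementwise sign carrier. This is an illustrative assumption, not the authors' algorithm; the function name `mdbf_approx`, the `rank_l` parameter, and the use of standard multiplicative NMF updates to fit the envelope to the weight magnitudes are all hypothetical choices for exposition.

```python
import numpy as np

def mdbf_approx(W, rank_l=4, iters=50):
    """Sketch (assumed form): W ~= E * S, where S is a {-1, +1} binary
    carrier and E = A @ B is a rank-l nonnegative magnitude envelope."""
    S = np.where(W >= 0, 1.0, -1.0)        # binary carrier: elementwise sign
    M = np.abs(W)                          # magnitudes the envelope must fit
    m, n = M.shape
    rng = np.random.default_rng(0)
    A = rng.random((m, rank_l)) + 1e-3     # nonnegative row factors
    B = rng.random((rank_l, n)) + 1e-3     # nonnegative column factors
    for _ in range(iters):                 # Lee-Seung multiplicative NMF updates
        B *= (A.T @ M) / np.maximum(A.T @ A @ B, 1e-12)
        A *= (M @ B.T) / np.maximum(A @ B @ B.T, 1e-12)
    E = A @ B                              # rank-l envelope (rank 1 ~ DBF-like)
    return E * S                           # dequantized approximation of W
```

Under this toy model, increasing the envelope rank monotonically improves the fit, which mirrors the paper's claim that a rank-$l$ envelope adds magnitude expressiveness over a more restrictive scaling:

```python
W = np.random.default_rng(1).standard_normal((128, 256))
for l in (1, 4, 8):
    err = np.linalg.norm(W - mdbf_approx(W, rank_l=l)) / np.linalg.norm(W)
    print(f"rank-{l} envelope: relative error {err:.4f}")
```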
Key Takeaways
“MDBF enhances perplexity and zero-shot accuracy over previous binary formats at matched bits per weight while preserving the same deployment-friendly inference primitive.”