Semantic Codebooks Improve Neural Speech Compression
Research Paper | Tags: Speech Compression, Neural Codecs, Semantic Understanding
Analyzed: Jan 4, 2026 00:20
Published: Dec 25, 2025 12:49
Source: ArXiv
Analysis
This paper introduces SemDAC, a neural audio codec that uses semantic codebooks derived from HuBERT features to improve both compression efficiency and the recognition accuracy of the reconstructed speech. The core idea is to prioritize semantic information (phonetic content) in the first quantization stage, which allows the acoustic codebooks to be used more efficiently and yields better performance at lower bitrates than existing codecs such as DAC. The paper's significance lies in showing that semantic priors can improve speech compression, which could benefit applications such as speech recognition and low-bandwidth communication.
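The idea of a semantic first stage followed by acoustic stages can be illustrated with generic residual vector quantization. The sketch below is not SemDAC's implementation: the codebooks are random stand-ins (a real semantic codebook would be derived from HuBERT features), and every name, dimension, and scale is a hypothetical choice made for illustration. It only shows how each later stage encodes what the earlier stages left unexplained.

```python
import random

def nearest_index(vec, codebook):
    """Index of the codebook entry with the smallest squared L2 distance to vec."""
    def dist2(code):
        return sum((v - c) ** 2 for v, c in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))

def residual_quantize(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        i = nearest_index(residual, codebook)
        indices.append(i)
        # Subtract the chosen code so the next stage sees only what is left.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, residual

def decode(indices, codebooks):
    """Sum the selected code vectors to rebuild the approximation."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

rng = random.Random(0)
DIM, CODES = 4, 8  # hypothetical latent size and codebook size

def random_codebook(scale):
    return [[rng.gauss(0.0, scale) for _ in range(DIM)] for _ in range(CODES)]

# Assumption: random stand-ins. Stage 1 plays the role of the semantic
# codebook; stages 2 and 3 play the role of acoustic codebooks.
semantic_codebook = random_codebook(1.0)
acoustic_codebooks = [random_codebook(0.5), random_codebook(0.25)]
codebooks = [semantic_codebook] + acoustic_codebooks

frame = [rng.gauss(0.0, 1.0) for _ in range(DIM)]  # one hypothetical latent frame
indices, residual = residual_quantize(frame, codebooks)
reconstruction = decode(indices, codebooks)
print("indices per stage:", indices)
```

The transmitted data is just `indices`, one integer per stage per frame. If the first index already captures the phonetic content, fewer acoustic stages are needed for a given quality, which is the efficiency argument the analysis makes.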
Key Takeaways
- SemDAC is a semantic-aware neural audio codec.
- It uses semantic codebooks derived from HuBERT features.
- It outperforms DAC in perceptual metrics and WER at lower bitrates.
- The approach demonstrates the effectiveness of semantic priors in speech compression.
Reference / Citation
"SemDAC outperforms DAC across perceptual metrics and achieves lower WER when running Whisper on reconstructed speech, all while operating at substantially lower bitrates (e.g., 0.95 kbps vs. 2.5 kbps for DAC)."
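To put the quoted bitrates in perspective, the short calculation below works out the relative saving from the two figures in the quote. It also shows the usual bitrate formula for a fixed-rate codebook-based codec (frame rate times codebooks per frame times bits per index). The frame rate, codebook count, and codebook size in the last call are hypothetical values chosen for illustration, not SemDAC's or DAC's actual configuration.

```python
import math

def rvq_bitrate_kbps(frame_rate_hz, num_codebooks, codebook_size):
    """Bitrate of a codec that emits num_codebooks indices per frame."""
    bits_per_frame = num_codebooks * math.log2(codebook_size)
    return frame_rate_hz * bits_per_frame / 1000.0

# Figures taken from the quote above.
semdac_kbps = 0.95
dac_kbps = 2.5

ratio = semdac_kbps / dac_kbps          # fraction of DAC's bitrate
saving_percent = (1.0 - ratio) * 100.0  # relative reduction

print(f"SemDAC uses {ratio:.0%} of DAC's bitrate ({saving_percent:.0f}% lower)")

# Assumption: illustrative configuration only, not from the paper.
example_kbps = rvq_bitrate_kbps(frame_rate_hz=50, num_codebooks=2, codebook_size=1024)
print(f"hypothetical config: {example_kbps} kbps")
```

The formula makes the trade-off explicit: at a fixed frame rate and codebook size, bitrate scales linearly with the number of codebooks, so a codec that reaches the same quality with fewer codebook stages runs at a proportionally lower bitrate.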