Search:
Match:
1 results

Mobile-Efficient Speech Emotion Recognition with Distilled HuBERT

Published:Dec 29, 2025 12:53
1 min read
ArXiv

Analysis

This paper addresses the challenge of deploying Speech Emotion Recognition (SER) on mobile devices by proposing a mobile-efficient system based on DistilHuBERT. The authors demonstrate a significant reduction in model size while maintaining competitive accuracy, making it suitable for resource-constrained environments. The cross-corpus validation and analysis of performance on different datasets (IEMOCAP, CREMA-D, RAVDESS) provide valuable insights into the model's generalization capabilities and limitations, particularly regarding the impact of acted emotions.
Reference

The model achieves an Unweighted Accuracy of 61.4% with a quantized model footprint of only 23 MB, representing approximately 91% of the Unweighted Accuracy of a full-scale baseline.