Text-Routed MoE Model for Multi-Modal Sentiment Analysis
Research Paper | Tags: Multi-modal Sentiment Analysis, Mixture-of-Experts, Temporal Alignment, MLLM | Analyzed: Jan 3, 2026 19:39
Published: Dec 28, 2025 01:58
Source: ArXiv
Analysis
This paper introduces TEXT, a model for Multi-modal Sentiment Analysis (MSA) that uses explanations generated by Multi-modal Large Language Models (MLLMs) and adds explicit temporal alignment. It makes three contributions: MLLM explanations as an additional input, a temporal alignment block that combines Mamba with temporal cross-attention, and a text-routed sparse mixture-of-experts with gate fusion. The paper reports that TEXT outperforms every other model tested on four datasets, including three recently proposed approaches and three MLLMs.
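To make the idea of text-routed sparse mixture-of-experts more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the class name TextRoutedMoE, the linear router, the linear experts, the dimensions, and the top-k value are all illustrative assumptions, and the Mamba block, temporal cross-attention, and gate fusion are left out. The only point shown is that expert selection depends on the text features alone, while the selected experts transform a separate fused feature vector.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random weights; stands in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TextRoutedMoE:
    """Hypothetical sketch: the router sees only text features."""

    def __init__(self, text_dim, feat_dim, num_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Assumption: a single linear layer maps text features to expert logits.
        self.router = rand_matrix(num_experts, text_dim, rng)
        # Assumption: each expert is a single linear map over the fused features.
        self.experts = [rand_matrix(feat_dim, feat_dim, rng)
                        for _ in range(num_experts)]

    def route(self, text_feat):
        """Return (expert_index, weight) pairs for the top-k experts."""
        logits = matvec(self.router, text_feat)
        top = sorted(range(len(logits)), key=lambda i: logits[i],
                     reverse=True)[:self.top_k]
        # Renormalise over the selected experts only (sparse gating).
        weights = softmax([logits[i] for i in top])
        return list(zip(top, weights))

    def forward(self, text_feat, fused_feat):
        """Weighted sum of the selected experts applied to fused_feat."""
        out = [0.0] * len(fused_feat)
        for idx, w in self.route(text_feat):
            y = matvec(self.experts[idx], fused_feat)
            out = [o + w * v for o, v in zip(out, y)]
        return out


moe = TextRoutedMoE(text_dim=3, feat_dim=4)
text = [0.1, -0.2, 0.3]          # stand-in for a text embedding
fused = [1.0, 0.0, -1.0, 0.5]    # stand-in for fused audio/visual features
routes = moe.route(text)
output = moe.forward(text, fused)
```

In this sketch, only top_k of the num_experts experts run for a given input, which is what makes the mixture sparse. Changing the fused features changes the output but never the routing decision, since the router reads the text features only.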
Reference / Citation
"TEXT achieves the best performance cross four datasets among all tested models, including three recently proposed approaches and three MLLMs."