Search:
Match:
1 results

Analysis

This paper introduces TEXT, a novel model for Multi-modal Sentiment Analysis (MSA) that leverages explanations from Multi-modal Large Language Models (MLLMs) and incorporates temporal alignment. The key contributions are the use of explanations, a temporal alignment block (combining Mamba and temporal cross-attention), and a text-routed sparse mixture-of-experts with gate fusion. The paper claims state-of-the-art performance across multiple datasets, demonstrating the effectiveness of the proposed approach.
Reference

TEXT achieves the best performance cross four datasets among all tested models, including three recently proposed approaches and three MLLMs.