GateFusion: Hierarchical Gated Cross-Modal Fusion for Active Speaker Detection

Research #Multimodal 🔬 Research | Analyzed: January 10, 2026 10:18

Published: December 17, 2025 18:56


Analysis

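This research explores active speaker detection using a new fusion technique, which may improve the accuracy of audio-visual analysis. The hierarchical gated cross-modal fusion approach is an interesting advance in handling multimodal data for this specific task.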

Quote / Source

"The paper introduces GateFusion, a hierarchical gated cross-modal fusion approach for active speaker detection."

ArXiv, December 17, 2025 18:56

* Quoted lawfully under Article 32 of the Copyright Act.
