Google's Gemini Embedding 2: A Leap Forward in Multimodal AI
Published: Mar 12, 2026 09:49
InfoQ China Analysis
Google has launched Gemini Embedding 2, a multimodal embedding model built on the Gemini architecture. The model natively supports interleaved input, allowing it to capture relationships between different media types within a single request. These capabilities are aimed at tasks such as retrieval-augmented generation (RAG) and semantic search.
Key Takeaways
- Gemini Embedding 2 natively supports multiple data types, including text, images, video, audio, and documents.
- It supports interleaved input, enabling the model to process mixed media such as images paired with text descriptions.
- The model uses Matryoshka Representation Learning (MRL) to allow dynamic adjustment of the embedding vector's dimensionality.
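The MRL point deserves a concrete illustration. With Matryoshka-trained embeddings, the leading dimensions carry the most information, so a client can truncate a full vector to a shorter prefix and renormalize it, trading accuracy for storage and speed. The sketch below shows this mechanic with a toy vector; the values and the helper names are illustrative, not output from the actual model or its API.

```python
import math

def truncate_embedding(vec, k):
    """Matryoshka-style truncation: keep the first k dimensions
    and L2-renormalize so cosine similarity remains meaningful."""
    head = vec[:k]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]

def cosine(a, b):
    # Assumes both vectors are already unit-norm.
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dimensional embedding (hypothetical values, not real model output).
full = [0.40, 0.30, 0.20, 0.10, 0.05, 0.04, 0.02, 0.01]
short = truncate_embedding(full, 4)
print(len(short))  # 4: the truncated vector keeps only the leading dims
```

Because the truncated prefix is renormalized to unit length, downstream cosine-similarity search works unchanged on the shorter vectors.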
Reference / Citation
"Gemini Embedding 2 can map text, images, videos, audio, and documents to the same unified embedding space, thereby supporting cross-media semantic understanding and retrieval."
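A unified embedding space means a query embedded from one modality can be compared directly against candidates from any other. The sketch below shows that retrieval pattern with hand-made three-dimensional vectors standing in for model output; the filenames and vector values are purely illustrative assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors in the shared space."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

# Hypothetical embeddings already projected into one shared space;
# in practice these would come from the embedding model.
query_text = [0.9, 0.1, 0.0]          # text query: "a photo of a cat"
candidates = {
    "cat_photo.jpg": [0.8, 0.2, 0.1],  # image
    "dog_bark.wav":  [0.1, 0.9, 0.2],  # audio
    "report.pdf":    [0.0, 0.2, 0.9],  # document
}

# Cross-media retrieval: rank every candidate, regardless of its modality,
# by similarity to the text query.
best = max(candidates, key=lambda name: cosine(query_text, candidates[name]))
print(best)  # cat_photo.jpg
```

The key point is that no per-modality index is needed: one similarity function over one vector space serves text, image, audio, and document candidates alike.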