Taking aim at Amazon and OpenAI, Google (GOOGL.US) rolls out a series of AI tools, including the multimodal model Gemini Embedding 2

Date: 09:27 11/03/2026
GMT Eight
Google has launched Gemini Embedding 2, its first multimodal embedding model and the tech giant's latest release, which maps text, images, video, audio, and documents into a unified embedding space.
Google (GOOGL.US) released Gemini Embedding 2, its first multimodal embedding model, on Tuesday. The company said in a blog post: "Gemini Embedding 2 maps text, images, videos, audio, and documents into a unified embedding space and can capture semantic intent in over 100 languages. This simplifies complex processing workflows and enhances a variety of multimodal downstream tasks, from retrieval and semantic search to sentiment analysis and data clustering."

As the newest member of the Gemini family of AI models, Gemini Embedding 2 accepts up to 8,192 text input tokens. It can process up to 6 images per request in PNG and JPEG formats, handle videos up to 120 seconds long in MP4 and MOV formats, ingest and embed audio directly without transcription, and directly embed PDF documents up to 6 pages long.

Google added: "Gemini Embedding 2 is not just an improvement on traditional models." Comparing it with models from Amazon (AMZN.US), the Voyage model, and other Google models, the company said: "It sets a new performance standard for multimodal deep learning, introducing powerful speech capabilities and surpassing leading models in text, image, and video tasks. This measurable performance improvement and unique multimodal coverage capability give developers all the tools they need to meet their diverse embedding needs."
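For developers, the input limits reported above can be checked before a request is sent. The following is a minimal Python sketch under stated assumptions: the `EmbedRequest` class, the `check_limits` function, and the mapping of file extensions to formats are all hypothetical names chosen for illustration and are not part of any Google SDK. Only the numeric limits and format names come from the article.

```python
from dataclasses import dataclass, field
from pathlib import PurePath
from typing import List, Optional

# Limits as reported in the article. Constant names are illustrative only.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES_PER_REQUEST = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6

# Assumption: formats are identified by file extension.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}


@dataclass
class EmbedRequest:
    """Hypothetical description of one embedding request (not a real API type)."""
    text_tokens: int = 0
    image_files: List[str] = field(default_factory=list)
    video_file: Optional[str] = None
    video_seconds: float = 0.0
    pdf_pages: int = 0


def check_limits(req: EmbedRequest) -> List[str]:
    """Return human-readable limit violations; an empty list means none found."""
    problems = []

    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(f"text has {req.text_tokens} tokens (max {MAX_TEXT_TOKENS})")

    if len(req.image_files) > MAX_IMAGES_PER_REQUEST:
        problems.append(
            f"{len(req.image_files)} images (max {MAX_IMAGES_PER_REQUEST} per request)"
        )

    # Collect unsupported image extensions into a single message.
    bad_images = sorted(
        {PurePath(name).suffix.lower() for name in req.image_files} - IMAGE_EXTENSIONS
    )
    if bad_images:
        problems.append(f"unsupported image format(s): {', '.join(bad_images)}")

    if req.video_file is not None:
        if PurePath(req.video_file).suffix.lower() not in VIDEO_EXTENSIONS:
            problems.append(f"unsupported video format: {req.video_file}")
        if req.video_seconds > MAX_VIDEO_SECONDS:
            problems.append(
                f"video is {req.video_seconds} s long (max {MAX_VIDEO_SECONDS} s)"
            )

    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(f"PDF has {req.pdf_pages} pages (max {MAX_PDF_PAGES})")

    return problems


if __name__ == "__main__":
    request = EmbedRequest(text_tokens=9000, image_files=["chart.png", "photo.gif"])
    for problem in check_limits(request):
        print(problem)
```

A check like this only mirrors the figures quoted in the announcement; the service itself remains the authority on what it accepts, and actual limits may differ or change.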