Taking aim at Amazon and OpenAI, Google (GOOGL.US) rolls out a series of AI tools: the multimodal model Gemini Embedding 2 is now live

Google (GOOGL.US) announced on Tuesday the release of Gemini Embedding 2, its first natively multimodal embedding model. The tech giant's latest model maps text, images, video, audio, and documents into a single, unified embedding space.

In a blog post, Google stated: “Gemini Embedding 2 maps text, images, videos, audio, and documents into a single embedding space and can capture semantic intent in over 100 languages. This simplifies complex processing workflows and enhances various multimodal downstream tasks, from retrieval-augmented generation (RAG) and semantic search to sentiment analysis and data clustering.”
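
To illustrate what a single embedding space makes possible, the following is a minimal sketch of semantic search by cosine similarity. The vectors are made-up toy values standing in for the embeddings of a text query and of items in other modalities; they are not produced by Gemini Embedding 2, and the function and file names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Return item names sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Toy 3-dimensional vectors (illustrative only; real embeddings are far larger).
ITEMS = {
    "photo_of_dog.jpg": [0.9, 0.1, 0.0],
    "jazz_clip.mp3": [0.0, 0.2, 0.9],
    "cat_video.mp4": [0.7, 0.6, 0.1],
}
QUERY = [1.0, 0.2, 0.0]  # stands in for the embedding of a text query about dogs

ranking = rank_by_similarity(QUERY, ITEMS)
print(ranking)
```

Because every item lives in the same vector space, one similarity function can compare a text query against an image, a video, or an audio clip without a separate pipeline per modality.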

As the newest member of the Gemini model family, the model accepts up to 8,192 text input tokens; processes up to six images per request in PNG or JPEG format; handles videos up to 120 seconds long in MP4 or MOV format; ingests and embeds audio directly, without an intermediate transcription step; and embeds PDF documents of up to six pages.
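
The reported limits can be expressed as a simple pre-flight check. The sketch below is an assumption-laden illustration: the `EmbedRequest` type and `check_limits` function are hypothetical and do not correspond to any Google API; only the numeric limits and file formats are taken from the article.

```python
from dataclasses import dataclass, field

# Limits as reported in the article.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6
IMAGE_FORMATS = {"png", "jpeg"}
VIDEO_FORMATS = {"mp4", "mov"}


@dataclass
class EmbedRequest:
    """Hypothetical description of one request; not an actual API type."""
    text_tokens: int = 0
    image_formats: list = field(default_factory=list)  # one entry per image
    video_seconds: float = 0.0
    video_format: str = ""
    pdf_pages: int = 0


def check_limits(req):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(f"text exceeds {MAX_TEXT_TOKENS} tokens")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append(f"more than {MAX_IMAGES} images")
    for fmt in req.image_formats:
        if fmt.lower() not in IMAGE_FORMATS:
            problems.append(f"unsupported image format: {fmt}")
    if req.video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video longer than {MAX_VIDEO_SECONDS} seconds")
    if req.video_format and req.video_format.lower() not in VIDEO_FORMATS:
        problems.append(f"unsupported video format: {req.video_format}")
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(f"PDF longer than {MAX_PDF_PAGES} pages")
    return problems


at_limit = EmbedRequest(text_tokens=8192, image_formats=["png"] * 6,
                        video_seconds=120, video_format="mp4", pdf_pages=6)
over_limit = EmbedRequest(text_tokens=9000, image_formats=["gif"], pdf_pages=7)
print(check_limits(at_limit))
print(check_limits(over_limit))
```

A request sitting exactly at each limit passes, while the second request is flagged for its token count, image format, and page count.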

Google added: “Gemini Embedding 2 is more than just an improvement over traditional models.” Comparing it with models from Amazon (AMZN.US) and Voyage, as well as its own earlier models, Google said: “It sets a new performance standard for multimodal deep learning, introduces powerful speech capabilities, and surpasses leading models in text, image, and video tasks. This measurable performance boost and unique multimodal coverage enable developers to access all the tools needed to meet their diverse embedding requirements.”
