voyage-multilingual-2: Multilingual Embedding Model

TL;DR – We are excited to officially release voyage-multilingual-2, optimized for multilingual retrieval and retrieval-augmented generation (RAG). It outperforms alternatives, such as OpenAI v3 large and Cohere multilingual v3, on most languages, including major languages like French, German, Japanese, Spanish, and Korean. On average, voyage-multilingual-2 outperforms the second best performing model by 5.6%. Notably, voyage-multilingual-2 continues to perform well on English. Moreover, voyage-multilingual-2 supports a large 32K context length.

Beyond retrieval accuracy in expertise-intensive domains such as code, law, and finance, embedding models also need strong multilingual support in a globally connected world. Today, we are excited to officially release voyage-multilingual-2, optimized for multilingual retrieval and retrieval-augmented generation (RAG).

Quantitative Evaluation

Datasets. We evaluate voyage-multilingual-2 on over 85 datasets that we collected from various sources, covering 27 languages, including English, French, German, Japanese, Spanish, Korean, Bengali, Portuguese, and Russian. Each of the first 6 languages has multiple datasets. The other languages have one dataset each and are grouped into an OTHER category. All the evaluation datasets are listed in this detailed spreadsheet.

Models and Metrics. We evaluate voyage-multilingual-2 against three baselines: OpenAI v3 large (text-embedding-3-large), Multilingual E5 (intfloat/multilingual-e5-large), and Cohere multilingual v3 (embed-multilingual-v3.0). Given a query, we retrieve the top-10 documents based on cosine similarities and report the normalized discounted cumulative gain (NDCG@10), a standard metric for retrieval quality that, unlike plain recall, also rewards ranking relevant documents higher.
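To make the metric concrete, here is a minimal sketch of cosine-similarity retrieval followed by NDCG@10. It is an illustration, not our evaluation harness: the function names, the toy vectors, and the linear-gain form of DCG are assumptions chosen for brevity (some implementations use an exponential gain instead).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=10):
    """Return the indices of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks starting at 1."""
    return sum(g / math.log2(position + 2) for position, g in enumerate(gains))


def ndcg_at_k(ranked_doc_ids, relevance, k=10):
    """NDCG@k with linear gain (an assumption of this sketch).

    `relevance` maps document id -> graded relevance; missing ids count as 0.
    """
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_doc_ids[:k]]
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


# Toy example with hypothetical 2-d "embeddings".
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [1.0, 0.1]
ranking = top_k(query_vec, doc_vecs, k=10)
score = ndcg_at_k(ranking, {0: 1.0}, k=10)
```

In this toy case the only relevant document is ranked first, so the score is at its maximum; a relevant document further down the list would be discounted by its rank.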

Results. As shown in the radar chart below or in this detailed spreadsheet, voyage-multilingual-2 outperforms its alternatives on all language categories evaluated and, on average, is 5.6% better than the second-best-performing model. Notably, voyage-multilingual-2 continues to perform well on English; in contrast, Cohere’s multilingual model and Multilingual E5 both show substantially lower retrieval quality on English than on the other evaluated languages.

Try voyage-multilingual-2

Now, global citizens and multilingual builders can bring superior retrieval accuracy to their Gen AI applications with voyage-multilingual-2. If you have previously used other Voyage embeddings, you just need to specify voyage-multilingual-2 as the model parameter (for both the corpus and queries). Head over to our docs to learn more.
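As a rough sketch of what that switch can look like, the snippet below embeds a corpus and a set of queries with the same model name. It assumes the voyageai Python package, whose client exposes an embed(texts, model=..., input_type=...) call returning an object with an embeddings attribute; the helper names embed_corpus_and_queries and make_client are hypothetical, so please check the docs for the current interface.

```python
def embed_corpus_and_queries(client, documents, queries, model="voyage-multilingual-2"):
    """Embed documents and queries with the same model (hypothetical helper).

    `client` is assumed to behave like voyageai.Client: its embed() method
    returns an object whose .embeddings is a list of vectors.
    """
    doc_vecs = client.embed(documents, model=model, input_type="document").embeddings
    query_vecs = client.embed(queries, model=model, input_type="query").embeddings
    return doc_vecs, query_vecs


def make_client():
    """Create a Voyage client (assumes `pip install voyageai` and VOYAGE_API_KEY set)."""
    import voyageai  # imported lazily so the helper above has no hard dependency

    return voyageai.Client()


# Example usage (requires network access and an API key):
#   client = make_client()
#   doc_vecs, query_vecs = embed_corpus_and_queries(
#       client,
#       ["Der Eiffelturm steht in Paris.", "東京は日本の首都です。"],
#       ["Where is the Eiffel Tower?"],
#   )
```

Passing the client in as an argument keeps the model name in one place and makes the helper easy to exercise with a stub client in tests.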

If you’re interested in early access to upcoming domain-specific or fine-tuned embeddings, we’d love to hear from you: please email [email protected]. Follow us on X (Twitter) and LinkedIn for more updates!