Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context, Best Sub-100M Retrieval Quality

By AI Maestro · May 14, 2026 · 1 min read

In this post, we introduce two new multilingual embedding models released by IBM:

  • granite-embedding-97m-multilingual-r2: A compact 97 million parameter model with 384-dimensional embeddings. It scores 60.3 on Multilingual MTEB Retrieval, the highest for any open multilingual embedding model under 100M parameters.
  • granite-embedding-311m-multilingual-r2: A 311 million parameter full-size model with 768-dimensional embeddings, Matryoshka dimension support, and top-tier multilingual retrieval quality.
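
Matryoshka-style embeddings are generally trained so that a leading prefix of the vector remains usable as a smaller embedding. A minimal sketch of that truncate-and-renormalize step follows. The target size of 256 and the stand-in vector are assumptions for illustration only; the post does not list which reduced dimensions the 311M model supports.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]

# Stand-in for a 768-dimensional embedding (NOT real model output).
full = [math.sin(i + 1) for i in range(768)]

# Hypothetical reduced size; check the model card for supported dimensions.
small = truncate_embedding(full, 256)
print(len(small))
```

Smaller vectors cut index size and similarity-search cost, usually at some loss in retrieval quality, so the right dimension is best chosen by measuring on your own data.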

Both models are released under the Apache 2.0 license and can be loaded with the sentence-transformers and transformers libraries. They accept inputs of up to 32,768 tokens (a 64x increase over their predecessor) and support retrieval across 52 languages as well as programming code. The models were trained on a mix of IBM-curated datasets, publicly available data, and internal synthetic data, a mix chosen with responsible use and enterprise deployment in mind.
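
As a rough sketch of how retrieval with sentence-transformers might look, the helper below ranks documents against a query by cosine similarity. The model ID `ibm-granite/granite-embedding-97m-multilingual-r2` is an assumption (the post gives only the short model name), and `load_encoder` and `rank` are hypothetical helper names, not part of any library.

```python
import math

# Assumed Hugging Face model ID; the post gives only the short model name.
MODEL_ID = "ibm-granite/granite-embedding-97m-multilingual-r2"

def load_encoder(model_id=MODEL_ID):
    """Return a function that maps a list of texts to a list of vectors."""
    # Imported lazily so `rank` can be used without the library installed.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer(model_id)
    return lambda texts: model.encode(texts, normalize_embeddings=True).tolist()

def rank(query, docs, encode):
    """Return (score, doc) pairs sorted by descending cosine similarity."""
    vectors = encode([query] + docs)
    q, doc_vecs = vectors[0], vectors[1:]

    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    scored = [(cos(q, v), d) for v, d in zip(doc_vecs, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

With the library installed and the model available, a call such as `rank("¿Cómo reinicio mi contraseña?", docs, load_encoder())` would score each entry in `docs` against the query. Because `rank` takes the encoder as a parameter, it can also be tested with any stand-in function that returns one vector per text.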

Key Takeaways

  • New Models: Two new multilingual embedding models released by IBM, one compact (97M parameters) and one full-size (311M parameters).
  • High Retrieval Quality: The 97M model scores 60.3 on Multilingual MTEB Retrieval, the highest for any open multilingual embedding model under 100M parameters.
  • Enterprise-Ready: Both models are designed with enterprise use cases in mind, providing support for multiple languages and code retrieval out of the box.