Google has brought its diffusion research back as a new variant of the Gemma model family, released with open weights under the Apache 2.0 license. The new release, named DiffusionGemma, builds on experimental work from last May, when a similar Gemini Diffusion model reached a generation speed of 857 tokens per second. That initial preview was discontinued without further announcement, but the research has now returned as the google/diffusiongemma-26B-A4B-it model. NVIDIA currently hosts the model for free on its NIM cloud API, so developers can try it without paying for infrastructure up front. In testing through this API, the model generated at least 500 tokens per second; one run returned 2,409 tokens in 4.4 seconds, or roughly 547 tokens per second. That is well above the rates typical of standard autoregressive models, which predict tokens one at a time.
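The throughput figure follows directly from the reported token count and elapsed time. As a minimal sketch of that arithmetic (the helper name `tokens_per_second` is hypothetical, and the calculation assumes the 4.4 seconds covers the whole response):

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return average generation throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# Figures reported in the article: 2,409 tokens returned in 4.4 seconds.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/s")  # prints "547.5 tokens/s"
```

If the timing was measured end to end on the client, it would also include network latency, so this number is best read as an average observed rate, not a measurement of the model in isolation.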
The significance of this release lies in the shift from proprietary experiment to openly usable model under the Apache 2.0 license. By making the weights publicly available, Google enables independent verification and deployment without the restrictions often placed on commercial APIs. Hosting on NVIDIA’s infrastructure further lowers the barrier to entry for researchers and developers who want high-speed generation. While the model name suggests a diffusion-based architecture, the practical outcome is a text generation tool that competes on raw generation speed. The release challenges the prevailing assumption that high-speed text generation requires exclusive access to Google’s proprietary Gemini services.
* The google/diffusiongemma-26B-A4B-it model is now available with open weights under the Apache 2.0 license, removing previous barriers to access.
* Testing through NVIDIA’s NIM cloud API showed generation speeds of at least 500 tokens per second.
* The release turns experimental diffusion research into a practical, deployable tool for developers and researchers.



