Lawrence Jengar
Apr 02, 2026 16:59
NVIDIA announces full support for Google's Gemma 4 multimodal AI models across Blackwell, Jetson, and RTX platforms, enabling enterprise-grade local deployment.

NVIDIA has rolled out comprehensive support for Google's newly launched Gemma 4 model family, enabling deployment across its entire hardware ecosystem, from data-center Blackwell GPUs down to Jetson edge devices. The collaboration, announced April 2, 2026, positions both companies to capture growing enterprise demand for secure, on-premises AI inference.
The Gemma 4 lineup includes four models: a 31B dense transformer, a 26B mixture-of-experts variant with 128 experts, and two smaller E4B and E2B models designed specifically for mobile and edge deployment. All models support context windows of up to 256K tokens and handle multimodal inputs including text, audio, vision, and video.
Hardware Flexibility Drives Enterprise Appeal
What makes this launch notable for enterprise buyers: every Gemma 4 model fits on a single H100 GPU. The flagship 31B model runs on DGX Spark's 128GB of unified memory, while the smaller E2B variant (2.3B effective parameters) targets the Jetson Orin Nano for robotics and industrial automation.
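The single-GPU claim checks out with back-of-envelope arithmetic on weight storage alone; the sketch below counts only the weights and ignores KV cache, activations, and runtime overhead, which add further memory pressure:

```python
def weight_footprint_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory needed for model weights, in GB (2**30 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30

# Gemma 4-31B weights at BF16 (16 bits/param) vs. a 4-bit NVFP4-style quantization.
bf16_gb = weight_footprint_gb(31, 16)  # roughly 58 GB, under an H100's 80 GB
fp4_gb = weight_footprint_gb(31, 4)    # roughly 14 GB after 4-bit quantization
print(f"BF16: {bf16_gb:.1f} GB, NVFP4: {fp4_gb:.1f} GB")
```

The same arithmetic is why a 4-bit NVFP4 checkpoint matters: it brings the 31B model within reach of much smaller memory budgets.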
NVIDIA partnered with vLLM, Ollama, and llama.cpp to optimize local deployment. Unsloth provides day-one quantized model support through Unsloth Studio, and an NVFP4-quantized checkpoint of Gemma 4-31B will follow shortly for Blackwell developers.
The On-Prem Security Play
The timing is no accident. Healthcare and financial-services firms increasingly demand AI capabilities without sending sensitive data to cloud providers. Gemma 4's Apache 2.0 license, fully open source with commercial use permitted, removes the licensing friction that plagues proprietary alternatives.
Enterprise developers can access the Gemma 4 31B model through NVIDIA's hosted NIM API for prototyping, then deploy self-hosted NIM microservices for production workloads under an NVIDIA Enterprise License.
Fine-Tuning Without Conversion Headaches
NVIDIA’s NeMo Automodel library supports day-zero fine-tuning directly from Hugging Face checkpoints. Developers can apply supervised fine-tuning and LoRA techniques without model conversion, a workflow improvement that shortens deployment timelines for custom applications.
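The appeal of LoRA in this workflow is that fine-tuning never touches the frozen base weights: training learns a low-rank correction that is added on top. A dependency-free numeric sketch of that idea (illustrative only, not the NeMo Automodel API):

```python
def matmul(A, B):
    """Multiply an (m x k) matrix by a (k x n) matrix, as nested lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply_lora(W, B, A, alpha=1.0):
    """Effective weight W' = W + (alpha / r) * B @ A.

    W stays frozen; only the low-rank factors B (d x r) and A (r x k) are
    trained, so a rank-r adapter stores r*(d+k) values instead of d*k.
    """
    r = len(A)  # rank of the adapter
    delta = matmul(B, A)
    return [[W[i][j] + (alpha / r) * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

# A 2x2 base weight with a rank-1 update that only nudges the top-right entry.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # d x r = 2 x 1
A = [[0.0, 0.5]]     # r x k = 1 x 2
print(apply_lora(W, B, A))  # [[1.0, 0.5], [0.0, 1.0]]
```

Because the base checkpoint is untouched, the adapter can be trained and shipped separately, which is what makes conversion-free fine-tuning from Hugging Face checkpoints practical.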
The models are live now on Hugging Face with BF16 checkpoints, and developers can test Gemma 4 31B for free through NVIDIA’s API catalog at build.nvidia.com before committing hardware resources.
Image source: Shutterstock
