Nvidia & Google Cloud: Using Gemini AI for Regulated Sectors

By Kitty Wheeler

May 29, 2025

undefined mins

Share this article

Prioritise Us on Google

Share this article

Prioritise Us on Google

Nvidia and Google gain momentum with Blackwell and Gemini announcement

Nvidia and Google Cloud are expanding their partnership to deliver Google’s Gemini AI models on Blackwell GPUs across cloud and on-premises environments

As AI capabilities keep on accelerating, so do the expectations and demands that come with it.

Now, the pressure is increasing to serve enterprises with stringent data governance requirements, particularly as Gen AI adoption surges across regulated sectors.

Traditional cloud-based AI services often can no longer accommodate organisations such as healthcare, financial services and government that need to maintain data sovereignty or operate within air-gapped environments due to compliance mandates.

In response, Nvidia and Google Cloud have expanded their collaboration to deliver Google’s Gemini AI models on Nvidia’s Blackwell GPU architecture across cloud and on-premises environments.

The capabilities of Nvidia and Google Cloud combined

The partnership addresses deployment requirements for regulated industries that require data to remain within their own infrastructure.

Google Cloud became the first cloud service provider to offer Nvidia’s HGX B200 and GB200 NVL72 processors through its A4 and A4X virtual machines – and since Nvidia designs specialised chips for AI workloads that can process multiple calculations simultaneously – the collaboration now extends beyond infrastructure provision to include engineering optimisation of the computing stack that supports AI applications.

Nvidia CEO, Jensen Huang

Both companies have already contributed to open-source software projects including:

JAX, a machine learning (ML) framework
OpenXLA, a compiler for linear algebra operations
MaxText, a text processing library
llm-d, a language model deployment tool

Furthermore, Nvidia’s AI software suite, including NeMo for model development, TensorRT-LLM for inference optimisation, Dynamo for compilation and NIM microservices for deployment, integrates with Google Cloud services. These include Vertex AI, Google’s ML platform, Google Kubernetes Engine for container orchestration and Cloud Run for serverless computing.

Google Cloud launching Blackwell-powered virtual machines

Google Cloud’s A4 virtual machines, accelerated by Nvidia HGX B200 processors, are now available for commercial use.

These systems operate within Google’s AI Hypercomputer architecture, which combines processors, networking and storage optimised for machine learning (ML) workloads.

Google Cloud CEO, Sundar Pichai

The A4X virtual machines deliver over one exaflop of computational capacity per rack, equivalent to one quintillion floating-point operations per second. These systems support scaling to tens of thousands of graphics processing units (GPUs) through Google’s Jupiter network fabric and Nvidia ConnectX-7 network interface cards.

Additionally, Google’s third-generation liquid cooling infrastructure maintains consistent performance for large-scale AI workloads by managing heat generation from high-performance processors. This cooling system enables sustained operation of dense computing configurations required for training and running large language models (LLMs).

Organisations can access these virtual machines through managed services including Vertex AI and Google Kubernetes Engine, enabling development and deployment of agentic AI applications.

Google Distributed Cloud enabling on-premises Gemini deployment

Google Distributed Cloud, the company’s fully managed solution for on-premises and air-gapped environments – environments operating without external network connections to maintain security isolation – will also support Nvidia Blackwell platforms to enable secure deployment of Gemini models within customer data centres.

This capability addresses requirements from public sector, healthcare and financial services organisations that must comply with data residency regulations or maintain strict security controls – since these sectors often cannot use cloud-based AI services due to regulatory constraints on data location and access.

Meanwhile, Nvidia Blackwell’s confidential computing capabilities protect user prompts and model fine-tuning data during processing – as confidential computing uses hardware-based security features to encrypt data while it remains in use – preventing unauthorised access even by system administrators or cloud providers.

Regarding the on-premises deployment option, it expands access to Google’s Gemini models for organisations that previously could not use cloud-based AI services due to compliance requirements. This enables these customers to implement agentic AI applications whilst maintaining control over their data and meeting privacy standards.

Performance optimisation targeting Gemini and Gemma models

The Gemini family of models is Google’s approach to multimodal AI systems that can process text, images and other data types within a single model architecture. These models demonstrate capabilities in complex reasoning, code generation and understanding relationships between different types of information.

Google Gemini AI is a multimodal, multilingual LLM

As a result, Nvidia and Google have implemented performance optimisations to ensure Gemini inference workloads operate efficiently on Nvidia GPUs, particularly within Google Cloud’s Vertex AI platform. These optimisations enable Google to serve significant volumes of user queries for Gemini models on Nvidia-accelerated infrastructure across Vertex AI and Google Distributed Cloud.

The Gemma family of lightweight, open models has now been optimised for inference using Nvidia’s TensorRT-LLM library. TensorRT-LLM accelerates large language model inference by optimising neural network operations for Nvidia hardware.

These models are expected to become available as Nvidia NIM microservices, which package AI models as containerised applications for simplified deployment.

Therefore, these optimisations enable deployment across various architectures, from data centres to local Nvidia RTX-powered personal computers and workstations – allowing developers to run AI workloads on infrastructure that matches their performance requirements and deployment constraints.

Explore the latest edition of AI Magazine and be part of the conversation at our global conference series, Tech & AI LIVE.

Discover all our upcoming events and secure your tickets today.

Also sign up to our free weekly newsletter for the latest insights and stories straight into your inbox.

AI Magazine is a BizClik brand

Nvidia & Google Cloud: Using Gemini AI for Regulated Sectors

The capabilities of Nvidia and Google Cloud combined

Google Cloud launching Blackwell-powered virtual machines

Google Distributed Cloud enabling on-premises Gemini deployment

Performance optimisation targeting Gemini and Gemma models

Company portals

Google Cloud

NVIDIA

Tags