Google Lowers Barrier for AI Developers to Use Powerful GPUs

With NVIDIA L4 GPUs, developers can build and deploy AI models that handle complex tasks swiftly
Google Cloud Run now supports the NVIDIA L4 GPU, enhancing users' AI inference capabilities

Google has announced a significant update to its Cloud Run service, introducing support for NVIDIA L4 GPUs, an addition that could dramatically enhance users' AI offerings.

This enhancement is designed to make advanced AI capabilities more accessible for a variety of applications, potentially transforming the functionality of everyday software.

“With the addition of NVIDIA L4 Tensor GPU and NVIDIA NIM support, Cloud Run provides users a real-time, fast-scaling AI inference platform to help customers accelerate their AI projects and get their solutions to market faster — with minimal infrastructure management overhead,” says Anne Hecht, Senior Director of Product Marketing at NVIDIA.

The new feature enables developers to attach one NVIDIA L4 GPU, equipped with 24GB of VRAM, to their Cloud Run instances on an as-needed basis.
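In practice, the GPU is requested at deploy time. A minimal sketch of such a deployment follows; the service name, project, and image path are illustrative, and the flags reflect the launch-time beta surface, so the current gcloud documentation should be checked before use:

```shell
# Deploy a container to Cloud Run with one NVIDIA L4 GPU attached.
# All names below are placeholders; flags may change as the feature
# moves out of beta.
gcloud beta run deploy my-inference-service \
  --image=us-docker.pkg.dev/my-project/my-repo/inference:latest \
  --region=us-central1 \
  --gpu=1 \
  --gpu-type=nvidia-l4 \
  --cpu=4 \
  --memory=16Gi \
  --no-cpu-throttling \
  --min-instances=0
```

Setting `--min-instances=0` lets the service scale to zero when idle, which is where the cost savings come from.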

What’s on offer?

This gives developers access to substantial computational power without the burden of maintaining a constant, costly GPU infrastructure, which is crucial for enabling wider adoption of stronger AI systems.

Furthermore, the service's ability to scale down to zero during periods of inactivity ensures that users are not charged when the service is not in use, providing notable cost savings.

A key focus of this update is speed, enabling real-time inference applications: AI systems capable of processing and responding to input data with minimal delay, often within milliseconds or seconds.


Real-time inference is crucial for applications that require immediate responses, such as custom chatbots, on-the-fly document summarisation, instant image recognition, and real-time video processing.

The L4 GPU makes this possible, supporting compute-intensive tasks including on-demand image recognition, video transcoding, and 3D rendering.

Building stronger AI services

By leveraging the power of NVIDIA L4 GPUs, developers can now build and deploy AI models that can handle complex tasks swiftly, enhancing user experience and enabling new types of interactive AI-powered services that were previously impractical due to performance limitations.

Developers can use the LLMs of their choice, such as open models with up to 9 billion parameters like Google's Gemma (2B/7B) or Meta's Llama 3 (8B), with fast token rates.
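Once such a model is deployed behind an OpenAI-compatible endpoint (the style of API that common open-model servers expose), calling it from application code is a plain HTTPS request. A minimal Python sketch, in which the service URL and model name are hypothetical placeholders rather than real deployment details:

```python
import json
from urllib import request

# Hypothetical Cloud Run service URL; a real deployment prints its own.
SERVICE_URL = "https://gemma-inference-xyz-uc.a.run.app"

def build_chat_payload(prompt: str, model: str = "gemma:7b") -> dict:
    """Assemble an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """POST the prompt to the service and return the model's reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    req = request.Request(
        f"{SERVICE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Because the service scales from zero, the first such request after a quiet period pays the cold-start cost; subsequent requests hit a warm instance.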

Additionally, businesses can serve custom fine-tuned Gen AI models, such as tailored image generation, while optimising costs by scaling down when demand decreases. 

NVIDIA L4 GPU

Instances with an attached L4 GPU can start in approximately five seconds, allowing processes within the container to begin utilising the GPU almost immediately. 

Cold-start times for various models, such as Gemma 2B and Llama 3.1, range from 11 to 35 seconds, depending on the specific model and its size.

Impact on user experience 

Already, early adopters of this technology have expressed enthusiasm regarding its impact on their AI operations. 

“Cloud Run's GPU support has been a game-changer for our real-time inference applications,” says Thomas Menard, Head of AI, Global Beauty Tech at L’Oréal. “Overall, Cloud Run GPUs have significantly enhanced our ability to provide fast, accurate, and efficient results to our end users.”

Currently, Cloud Run GPUs are available in the us-central1 region, with plans for availability in Europe and Asia by the end of the year. 

Google's Cloud Run update significantly lowers the barrier for developers to access advanced AI capabilities. As developers can now scale resources on demand and pay only for usage, enterprises are able to offer their customers better AI-enabled services across various sectors, potentially streamlining their internal operations and enhancing user experiences.

******

Make sure you check out the latest edition of AI Magazine and also sign up to our global conference series - Tech & AI LIVE 2024

******

AI Magazine is a BizClik brand
