Scaling the Challenges of Gen AI in the Cloud

By Kristian McCann

September 09, 2024


Gen AI in the cloud has ushered in new applications for business, but as deployments grow, cost and scalability are hampering broader implementation

Have you ever been at home and a question came into your head that you had to have answered, and so you turned to ChatGPT to explain it to you? Or perhaps you used DALL-E to reimagine what your cat, sitting on your sofa, would look like if dressed in a wizard costume? Then, unless you happen to inhabit a data centre, you have the cloud to thank for that.

Yes, the generative AI (Gen AI)-obsessed world we currently occupy is all thanks to the cloud. Consider large language models (LLMs), the powerhouse behind many Gen AI applications. These models require vast amounts of computational power and data storage.

The cloud lowered the barriers to adoption, melting away upfront infrastructure costs, which meant startups like OpenAI could go on to usher in trailblazing systems like ChatGPT.

It also allows businesses to draw on the continuous innovation of cloud service providers (CSPs) and the open-source community, while scalable infrastructure and on-demand resources serve as a launchpad for AI-driven transformation across industries.

Cloud platforms have, therefore, fundamentally altered the landscape of AI adoption, democratising access for developers and consumers alike.

Yet, as with any technological advancement, the marriage of cloud and AI brings its own set of challenges. “The challenge for many companies is how to scale up,” says Paul Cardno, Global Digital Automation & Innovation Senior Manager at 3M.

Balancing Gen AI’s ambitions with its limitations

At the heart of the scaling challenge lies the sheer computational power required to run Gen AI models, particularly LLMs. As deployments scale, so do the associated costs.

“Complex deployment and ongoing consumption mean that these models are expensive to develop,” Dr Chris Hillman, Data Science Senior Director International at Teradata, notes.

Take, for instance, a Gen AI-powered customer service chatbot. Initially deployed for a single product line, it could gradually be extended across a company's entire catalogue, which demands a much larger knowledge base and significantly more computational resources. As the system handles a growing volume of concurrent interactions, the computational power it needs rises and, with it, the cost.
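To make the chatbot example concrete, here is a rough sketch of how pay-as-you-go inference costs might grow as a deployment widens. Every figure in it (traffic, tokens per interaction, the per-token price) is a hypothetical assumption chosen for illustration, not a quote from any provider, and the `ChatbotWorkload` and `monthly_inference_cost` names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatbotWorkload:
    """Hypothetical workload description; every figure is illustrative."""
    interactions_per_month: int
    tokens_per_interaction: int   # prompt and response tokens combined
    price_per_1k_tokens: float    # assumed pay-as-you-go rate, in USD


def monthly_inference_cost(w: ChatbotWorkload) -> float:
    """Cost grows linearly with both traffic and tokens per interaction."""
    total_tokens = w.interactions_per_month * w.tokens_per_interaction
    return total_tokens / 1000 * w.price_per_1k_tokens


# Assumed scenario: one product line versus the whole catalogue.
# The wider deployment sees more traffic and, because it draws on a
# larger knowledge base, longer prompts per interaction.
single_line = ChatbotWorkload(100_000, 800, 0.002)
full_catalogue = ChatbotWorkload(1_000_000, 1_200, 0.002)

for name, workload in [("single product line", single_line),
                       ("full catalogue", full_catalogue)]:
    print(f"{name}: ${monthly_inference_cost(workload):,.2f} per month")
```

Under these assumptions, ten times the traffic combined with prompts that are 50% longer produces fifteen times the bill, which is why costs can outpace what usage figures alone would suggest.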

Gen AI’s most common uses

This cost factor is pushing businesses to carefully evaluate their Gen AI strategies, balancing the potential benefits against the substantial resource requirements.

“It is important to understand the full implications across all dimensions; from publicity to infrastructure to return on investment (ROI) to what is done and needs to be improved for use cases like this to be a reality,” explains Dr Chris.

The cloud's pay-as-you-go model, while offering flexibility, can also lead to unexpected costs if AI workloads are not carefully managed.

As cloud costs grow with scale, organisations rushing to implement Gen AI across their offerings should examine what value each specific implementation is actually bringing.

“Cost will continue to be an issue and the ability to accurately assess the ROI will be vital if Gen AI projects are to be deployed in production,” Dr Chris warns.

As models become more complex and data-hungry, organisations will need to strike a balance between leveraging the cloud's capabilities and optimising resource utilisation.

Scaling, the Gen AI problem

Despite these hurdles, the momentum behind cloud-powered AI shows no signs of slowing. Innovations in cloud infrastructure are keeping pace with the evolving needs of AI systems.

From specialised hardware accelerators to advanced data management tools, cloud providers are continuously expanding their offerings to support the next generation of AI applications.

Solutions to the problem of scaling computational power could therefore arrive in the near future.

Yet Paul warns of pitfalls in the interim: “There is a fast-growing economy of companies that will manage the complexity of building your models for you, but with those, there is a real risk of vendor lock-in by buying into their proprietary ecosystems.”

However, exploring partnerships with cloud providers could allow you to keep control of your model whilst also keeping a cap on costs.

To do this, Dr Chris explains: “It will be important to understand how much of the resource is needed on-demand and how much will be required to be permanently available.”

Understanding the distinction between on-demand and permanent resource needs is crucial for organisations deploying Gen AI in the cloud, as it directly impacts both scaling capabilities and cost management.

By clearly defining these requirements, customers and cloud providers can forge more effective agreements that optimise resource allocation and pricing structures. For instance, a company might negotiate a contract that includes a base level of permanent resources at a fixed cost, coupled with flexible pricing for on-demand resources that can be rapidly scaled up or down.

This approach allows for cost-effective handling of fluctuating workloads, such as sudden spikes in AI model usage during marketing campaigns or seasonal peaks.
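The trade-off between permanent and on-demand capacity can be sketched as a small calculation. The example below is a minimal illustration under stated assumptions: the hourly rates, the demand profile and the function names `monthly_cost` and `best_reserved_level` are all hypothetical, and real contracts will have more terms than a flat reserved rate and a flat on-demand rate.

```python
def monthly_cost(hourly_demand, reserved_units, reserved_rate, on_demand_rate):
    """Total cost of a reserved base plus on-demand overflow.

    hourly_demand:  compute units (e.g. GPUs) needed in each hour
    reserved_units: capacity paid for every hour, whether used or not
    """
    hours = len(hourly_demand)
    reserved_cost = reserved_units * reserved_rate * hours
    # Only demand above the reserved base is billed at the on-demand rate.
    overflow_unit_hours = sum(max(0, d - reserved_units) for d in hourly_demand)
    return reserved_cost + overflow_unit_hours * on_demand_rate


def best_reserved_level(hourly_demand, reserved_rate, on_demand_rate):
    """Brute-force search for the cheapest reserved base."""
    candidates = range(0, max(hourly_demand) + 1)
    return min(
        candidates,
        key=lambda r: monthly_cost(hourly_demand, r, reserved_rate, on_demand_rate),
    )


# Assumed 30-day month: a steady baseline of 4 units, plus a 48-hour
# marketing-campaign spike that needs 20 units.
demand = [4] * 672 + [20] * 48
RESERVED_RATE = 2.0    # hypothetical USD per unit-hour, committed
ON_DEMAND_RATE = 3.0   # hypothetical USD per unit-hour, flexible

base = best_reserved_level(demand, RESERVED_RATE, ON_DEMAND_RATE)
print("cheapest reserved base:", base)
print("hybrid cost:", monthly_cost(demand, base, RESERVED_RATE, ON_DEMAND_RATE))
print("all on-demand:", monthly_cost(demand, 0, RESERVED_RATE, ON_DEMAND_RATE))
print("all reserved:", monthly_cost(demand, 20, RESERVED_RATE, ON_DEMAND_RATE))
```

With these made-up numbers, reserving only the steady baseline and renting the spike on demand is cheaper than either extreme, because reserving for the peak means paying for idle capacity most of the month.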

Cloud’s capabilities going forward

Looking ahead, the symbiosis between cloud computing and Gen AI is likely to deepen further, as organisations across sectors increasingly look to integrate it into their operations.

This will likely increase demand and put more pressure on cloud providers and, combined with supply shortages on the chip side, may make the process more expensive.

Yet, the journey of cloud-powered AI in the enterprise is far from over. Major cloud providers are racing to expand their infrastructure, with cloud infrastructure companies like CoreWeave investing £1bn (US$1.25bn) in data centre expansions, and new chip makers like Graphcore getting investment from investment titan SoftBank to make Intelligence Processing Unit (IPU) hardware that better handles AI workloads.

This investment will not only help with costs and scalability, but the resulting capabilities could also make new AI models available for use in internal company processes.

“The ‘Agent’ model looks like it will be an extraordinarily clever way of building AI-enabled, robust business processes,” Paul elaborates. “By creating a set of pluggable ‘Agents’ that focus on executing specific tasks very well, businesses will be able to compose new processes that deliver faster, better results than we do today.”

As organisations continue to explore the possibilities, success will hinge on addressing current challenges head-on and embracing emerging solutions. The cloud has set the stage for an AI revolution in business. How enterprises navigate this new terrain will define the next era of digital transformation.
