Why Enterprises are Moving Critical AI Workloads On-Premises

Enterprises are increasingly moving AI workloads on-premises to address rising cloud costs, latency, and data sovereignty regulations. Credit: Getty Images
From financial services to industrial manufacturing, soaring cloud costs and strict data sovereignty laws are strengthening the case for private AI infrastructure

For the better part of a decade, the dominant narrative in enterprise computing was straightforward: move everything to the cloud. 

Public cloud providers offered scale, flexibility and an escape from costly hardware refresh cycles. Organisations across every sector followed the path with enthusiasm. 

Now, quietly but with gathering momentum, many of those same organisations are beginning to rethink the map.

The driver of this shift is AI. As AI has moved from experimental pilots to mission-critical infrastructure, it has exposed the limitations of a cloud-only strategy. Latency, data sovereignty, regulatory compliance and, increasingly, costs are pushing enterprises to bring AI workloads back behind their own walls. 

On-premises AI, purpose-built private infrastructure designed to run large-scale model training and inference within an organisation's own data centre, is no longer a niche concern. It is fast becoming a central pillar of enterprise technology strategy.

The numbers are striking. IDC reports that enterprise spending on compute and storage hardware for AI grew by 166% year-on-year in the second quarter of 2025. Gartner estimates that worldwide AI spending reached US$1.5tn in 2025, with data centre systems spend rising nearly 47% in that year alone. 

The global GPU server market, valued at US$171bn in 2025, is forecast to reach US$730bn by 2030. These are not marginal figures. They represent a fundamental re-engineering of how digital infrastructure is conceived, procured and operated.
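
As a rough illustration of what that forecast implies, the two endpoint figures can be turned into a compound annual growth rate. The sketch below assumes smooth compounding over the five years between 2025 and 2030, which the forecast itself does not state; the function name is illustrative only.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article: US$171bn in 2025, US$730bn forecast for 2030.
# Assumption: growth compounds evenly across the five intervening years.
gpu_server_cagr = implied_cagr(171, 730, 2030 - 2025)
print(f"Implied CAGR: {gpu_server_cagr:.1%}")
```

On those assumptions the market would need to grow by roughly a third every year to hit the 2030 figure.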

The economics are straightforward once you understand the workload profile of modern AI. Training a large language model or running continuous inference at scale in a public cloud quickly becomes prohibitively expensive.

For organisations processing sensitive customer data, operating in regulated industries, or working under regulations such as GDPR or the EU AI Act, the public cloud often cannot be the answer.

Legal firms, financial institutions, healthcare providers and defence-adjacent organisations have specific confidentiality obligations that, in many jurisdictions, legally require on-premises deployment. The cloud is not always an option; sometimes it is a risk.

The technology has caught up with the ambition. A new generation of on-premises AI infrastructure (liquid-cooled GPU servers, high-bandwidth storage systems, intelligent power management and purpose-built networking) has made it possible to deploy hyperscale-equivalent compute within a corporate data centre.

NVIDIA's Blackwell architecture, available through OEM partners such as Dell, HPE and Lenovo, delivers petaflop-scale inference performance in rack-mounted systems that organisations can own, operate and secure independently. The hardware that once required the resource base of a hyperscaler is now available to any well-capitalised enterprise.

Hybrid infrastructure is becoming the de facto enterprise model. Rather than a binary choice between cloud and on-premises, most organisations are converging on a layered approach: public cloud for elastic, non-sensitive workloads; colocation or private data centres for AI inference and model fine-tuning; and edge deployments for latency-critical or operationally isolated environments. 

Surveys of AI-adopting organisations consistently show a clear movement from cloud-only towards this more complex, distributed architecture.
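
The layered model can be pictured as a simple placement rule. The sketch below is a deliberate simplification and not any vendor's or enterprise's actual policy: the `Workload` fields, the `Tier` names and the order in which the checks run are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PUBLIC_CLOUD = "public cloud"
    PRIVATE_DC = "colocation or private data centre"
    EDGE = "edge"


@dataclass(frozen=True)
class Workload:
    name: str
    # All flags below are hypothetical labels, not terms from the article.
    sensitive_data: bool = False     # regulated or proprietary data involved
    ai_model_work: bool = False      # AI inference or model fine-tuning
    latency_critical: bool = False   # must respond close to where data is produced
    isolated_site: bool = False      # runs in an operationally isolated environment


def place(workload: Workload) -> Tier:
    """Choose a tier following the three-layer split described in the text."""
    # Edge first: latency-critical or isolated environments cannot wait on a remote site.
    if workload.latency_critical or workload.isolated_site:
        return Tier.EDGE
    # Private infrastructure for AI inference, fine-tuning and anything sensitive.
    if workload.ai_model_work or workload.sensitive_data:
        return Tier.PRIVATE_DC
    # Everything else is elastic and non-sensitive, so it stays in public cloud.
    return Tier.PUBLIC_CLOUD


for w in (
    Workload("seasonal web traffic"),
    Workload("fine-tuning on client records", sensitive_data=True, ai_model_work=True),
    Workload("factory-floor visual inspection", ai_model_work=True, latency_critical=True),
):
    print(f"{w.name}: {place(w).value}")
```

A real placement policy would weigh cost, capacity and regulation together rather than apply three fixed checks, but the ordering shows why the choice is no longer binary.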

Modern on-premises AI infrastructure requires advanced liquid cooling to manage the intense power density of GPU racks. Credit: Getty Images

The implications for data centre design are profound. On-premises AI is not simply a matter of installing more servers. It demands a rethink of power density, cooling architecture, physical security and network topology. 

AI rack densities can reach 100 kilowatts per rack and above, a figure that renders traditional air cooling inadequate and requires purpose-built liquid cooling infrastructure. Power resilience, grid connectivity and thermal management are now as strategically important as software licensing. The data centre is becoming an AI factory, and engineering it correctly is a competitive differentiator.
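
A back-of-the-envelope heat balance shows why air struggles at that density. The sketch below applies the standard relation Q = m_dot x c_p x delta_T to a 100 kW rack. The fluid properties are textbook approximations, and the temperature rises (15 K for air, 10 K for water) are assumptions chosen for illustration rather than figures from any particular facility.

```python
def coolant_mass_flow_kg_s(heat_kw: float, cp_kj_per_kg_k: float, delta_t_k: float) -> float:
    """Mass flow needed to carry away heat_kw, from Q = m_dot * c_p * delta_T."""
    return heat_kw / (cp_kj_per_kg_k * delta_t_k)


RACK_HEAT_KW = 100.0  # rack density quoted in the article

# Assumed textbook properties near room temperature (not from the article).
AIR_CP, AIR_DENSITY = 1.005, 1.2          # kJ/(kg*K), kg/m^3
WATER_CP, WATER_DENSITY = 4.186, 997.0    # kJ/(kg*K), kg/m^3
AIR_DELTA_T, WATER_DELTA_T = 15.0, 10.0   # assumed coolant temperature rise, K

# Convert mass flow to volume flow: m^3/s for air, litres per minute for water.
air_m3_s = coolant_mass_flow_kg_s(RACK_HEAT_KW, AIR_CP, AIR_DELTA_T) / AIR_DENSITY
water_l_min = (
    coolant_mass_flow_kg_s(RACK_HEAT_KW, WATER_CP, WATER_DELTA_T) / WATER_DENSITY * 1000 * 60
)

print(f"Air:   {air_m3_s:.1f} m^3/s per rack")
print(f"Water: {water_l_min:.0f} L/min per rack")
```

Under these assumptions a single rack needs several cubic metres of air every second, against a couple of litres of water per second through a pipe, which is the practical case for liquid cooling.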

Security, too, is being reimagined. As AI workloads handle increasingly sensitive data and as AI-driven adversaries raise the threat level across the board, organisations running on-premises AI must embed cyber resilience into the physical and logical design of their infrastructure from the outset. The separation of AI training data, model weights and inference outputs, and the protection of each, requires new security architectures that span both hardware and software.

Goldman Sachs deploys private AI infrastructure to securely power autonomous engineering agents and boost developer productivity

Goldman Sachs: the agentic bank

Goldman Sachs has emerged as one of the most advanced examples of enterprise-scale, on-premises AI deployment in the financial sector. The firm's approach is neither tentative nor experimental. 

Having stepped back from its consumer banking ambitions through the Marcus brand, Goldman has redirected that capital and strategic focus into building what its CIO Marco Argenti describes as a 'hybrid workforce', an operating model in which AI agents work alongside human employees as genuine contributors to productivity.

At the technical centre of this transformation sits a private AI infrastructure stack. Goldman's data centre estate now runs a suite of agentic AI tools, including the GS AI Assistant and the Louisa internal networking platform. 

The firm became the first major financial institution to deploy Devin, an autonomous software engineering agent from Cognition, across its 12,000-strong developer workforce. 

The results are significant: while earlier code-assistance tools delivered approximately 20% efficiency gains, Goldman reports that its agentic AI deployment has driven productivity improvements of three to four times in software lifecycle management.

The firm's rationale for on-premises deployment is grounded in regulatory necessity and competitive sensitivity. Goldman processes vast quantities of proprietary trading data, client information and market intelligence that cannot, by legal and competitive logic, reside in a third-party cloud environment.

The firm has been explicit that its AI infrastructure investments represent a reallocation of capital previously committed to consumer portfolios, effectively turning its retreat from retail banking into fuel for infrastructure modernisation.

Siemens deploys secure, on-premises AI and modular data centres to optimise industrial manufacturing without risking intellectual property

Siemens: industrial AI at the edge and on-premises

Siemens represents perhaps the most strategically sophisticated example of on-premises AI deployment in the industrial sector. 

With revenues of €75.9bn in fiscal 2024 and operations spanning discrete manufacturing, process industries, infrastructure and mobility, the company has both the scale and the imperative to treat on-premises AI as a core engineering challenge rather than a technology experiment.

The company's Industrial Copilot ecosystem, showcased at CES 2025 and the Siemens AI with Purpose Summit, deploys AI directly onto the factory floor through its Industrial Edge platform, bringing large language model capabilities into production environments where latency, security and operational continuity make cloud dependency unacceptable.

Siemens has also built on-premises AI into its EDA software portfolio for semiconductor and PCB design, offering customers the ability to deploy its AI system entirely within their own secure data centres, with enterprise-grade access controls and a multimodal data lake that improves model performance over time without exposing intellectual property externally.

The company partners with NVIDIA, deploying NIM microservices and Llama Nemotron reasoning models, while maintaining the flexibility for customers to run the full stack on private infrastructure. 

Siemens is also developing its own modular data centre product line alongside Cadolto DataCenter GmbH and Legrand Data Center Solutions, which debuted at Data Center World Frankfurt in June 2025. The product line is designed to deliver AI-ready compute capacity in factory-built, rapidly deployable units with approximately 30% lower CO2 emissions than traditional facilities.

NTT DATA utilises agentic AI on private infrastructure to automate threat hunting and protect complex enterprise environments

NTT DATA: engineering cyber resilience into on-premises AI infrastructure

NTT DATA's global network of AI-powered Cyber Defense Centers represents a distinctive model for on-premises AI adoption: one in which the AI is not just deployed within private infrastructure, but is itself the mechanism for protecting that infrastructure. 

With facilities across India, the UK and the US, NTT DATA has purpose-built these centres to defend the cloud and AI-heavy environments that its clients operate on-premises and at the edge.

The architecture is built around agentic AI for security operations. Software agents autonomously triage, analyse and hunt for threats across customer environments, automating the high-volume, repetitive workflows that have historically consumed SOC analyst capacity. 

The system is designed to reduce alert volumes by up to 90% and cut investigation times by up to 60%, allowing human specialists to focus on complex forensic analysis, containment and post-incident response. This is AI running on private infrastructure, for the purpose of protecting private infrastructure.

The centres operate in close collaboration with regional Computer Emergency Response Teams, National Cyber Security Centres and government agencies, blending global threat intelligence with local regulatory context. 

For more than 1,200 clients served across NTT DATA's unified SecOps platform, the model provides 24x7 detection, response and incident management through a single dashboard, backed by a global network of more than 40 delivery centres and SOCs across over 50 countries.

NTT DATA's approach illustrates a broader truth about on-premises AI in the data centre: the technology is not simply a workload to be run, but a layer of active, intelligent protection that must be engineered into the infrastructure itself.