How SambaNova and Intel are Scaling Inference for Agentic AI

Rodrigo Liang, Co-Founder & CEO at SambaNova Systems
SambaNova and Intel have come up with a blueprint to deliver premium inference designed for the agentic age, powered by Intel Xeon 6 CPUs & SambaNova RDUs

As adoption of AI agents balloons and their workloads steadily drain data centre capacity, a new infrastructure playbook is beginning to take shape.

At the epicentre of this shift are SambaNova and Intel, which have unveiled a heterogeneous inference blueprint designed to meet the growing energy demands enterprises face as they power their agents.

Aiming for a balanced and efficient system built for scale, this architecture assigns specific roles to different compute types. 

In this system of divided labour, GPUs handle the prefill phase, SambaNova RDUs take on high-speed decoding and Intel Xeon 6 CPUs orchestrate tasks while executing agent-driven workloads.

Intel and SambaNova announced a new blueprint designed for agentic AI workloads | Credit: Intel

“Agentic AI is moving into production – and the winning pattern we’re seeing is GPUs to start the job, Intel Xeon 6 to run it and SambaNova RDUs to finish it fast,” says Rodrigo Liang, CEO and Co-Founder of SambaNova Systems.

“Together with Intel, we’re giving customers a blueprint they can deploy in existing air-cooled data centres, with broad x86 coverage for the coding agents and tools they already use today.”

Enterprise availability is expected in the second half of 2026.

Why GPU-only stacks are no longer enough

For years, GPUs have dominated AI infrastructure. However, the rise of agentic AI is exposing their limits. 

Today’s coding agents do far more than generate text. They write and compile code, interact with APIs, query databases and manage workflows in real time.

This growing complexity means every stage of the pipeline matters. Prefill may still belong to GPUs, but decoding speed and task execution are increasingly defining overall system performance.

“The data center software ecosystem is built on x86 and it runs on Xeon – providing a mature, proven foundation that developers, enterprises and cloud providers rely on at scale,” says Kevork Kechichian, Executive Vice President and General Manager of the Data Center Group at Intel Corporation. 

Kevork Kechichian, Executive Vice President and General Manager of the Data Center Group at Intel Corporation

“Workloads of the future will require a heterogeneous mix of computing and this collaboration with SambaNova delivers a cost-efficient, high-performance inference architecture designed to meet customer needs at scale – powered by Xeon 6.”

The message is clear. No single chip can handle every stage of an agentic workflow efficiently. Instead, the future lies in combining strengths across different architectures.

Inside the Xeon 6 and RDU advantage

The real innovation in this blueprint sits in how the components work together. 

Intel Xeon 6 processors take on a dual role. They act as the system’s control plane while also serving as the execution layer for agent tasks, from compiling code to validating outputs.

Performance of the Intel-SambaNova design | Credit: SambaNova

Performance gains back up this new design. 

SambaNova’s measurements show that Xeon 6 delivers more than 50% faster LLVM compilation times than Arm‑based server CPUs and a whopping 70% faster vector database performance than is available on other x86‑based systems.

For developers, this means faster iteration and smoother paths from idea to deployment.

SambaNova’s RDUs, including the SN50, are designed to deliver high-throughput, low-latency decoding for large language models. This ensures rapid token generation once the system is in motion.

Harry Ault, CRO at SambaNova Systems

“When thousands of simultaneous coding agents are generating tool calls, retrieval requests, code builds and encrypted inter-agent messages, the CPU is not a background component – it is the system’s executive and action layer,” says Harry Ault, CRO of SambaNova.

“Pairing Xeon 6 with SambaNova SN40 and SN50 RDUs gives enterprises and sovereign AI operators deployments that are faster, more cost-efficient and purpose-built for the agentic workloads that are running in production today.”

As agentic AI continues to gain traction across industries, this kind of balanced architecture is likely to become the norm rather than the exception. 

With its planned rollout in H2 2026, the SambaNova and Intel collaboration could well define how the next generation of AI systems is built and scaled.
