AWS Launches Tools to Accelerate Agent Deployment

AWS has launched several capabilities designed to simplify the process of training models for specific agent tasks, addressing infrastructure and expertise barriers that keep AI projects in prototype stages.
The tools, announced at the company’s re:Invent 2025 conference, automate training pipelines that currently require machine learning specialists and dedicated infrastructure teams. Reinforcement Fine Tuning delivers a 66% accuracy improvement over base models, according to AWS benchmarks, while Collinear AI reports cutting experimentation cycles from weeks to days using the new SageMaker AI capabilities.
Dr Swami Sivasubramanian, Vice President of Agentic AI at AWS, says: “Most companies opt for the largest, most capable models to power their agents, but a significant amount of an agent’s time is spent doing routine tasks, like checking calendars and searching documents, that don’t require advanced intelligence. The result? Unnecessary costs, slower responses, and wasted resources.”
Amazon Bedrock Reinforcement Fine Tuning delivers 66% accuracy gains
Reinforcement Fine Tuning in Amazon Bedrock handles model training without exposing the underlying infrastructure. Developers select a base model, provide invocation logs or a dataset and choose a reward function. The service runs the training pipeline and returns a customised model. At launch, RFT supports Amazon Nova 2 Lite, with additional models planned.
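To make the reward-function step concrete, the following is a minimal Python sketch of a rule-based reward for a hypothetical tool-calling task. The function name, the JSON response format and the scoring weights are all assumptions made for illustration; this is not the Amazon Bedrock interface, which the announcement does not detail.

```python
import json


def tool_call_reward(model_output: str, expected_tool: str, expected_args: dict) -> float:
    """Score a model response between 0.0 and 1.0 (hypothetical scoring scheme)."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # unparseable output earns no reward
    if not isinstance(call, dict) or call.get("tool") != expected_tool:
        return 0.0  # wrong tool, or no tool chosen at all
    if not expected_args:
        return 1.0  # right tool and nothing else to check
    args = call.get("args", {})
    if not isinstance(args, dict):
        args = {}
    # Half the reward for picking the right tool, half for matching arguments.
    matched = sum(1 for key, value in expected_args.items() if args.get(key) == value)
    return 0.5 + 0.5 * matched / len(expected_args)
```

A reward of this kind gives partial credit for choosing the right tool with the wrong arguments, which is one way a training signal can separate near misses from outright failures.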
Phil Mui, SVP of Software Engineering, Agentforce at Salesforce, says AWS benchmarking shows a 66% improvement over base models in accuracy. “We anticipate leveraging RFT to enhance and extend what we already achieve with supervised fine-tuning, enabling us to deliver even more precise and customised AI solutions for our customers,” he says.
The process focuses on quality over volume. “Fine-tuning your AI model is like turning a general assistant into a specialist, like turning a family doctor into a cardiologist who is laser-focused on exactly what you need them to know,” Swami says. “A dataset with 10,000 carefully curated agent interactions can outperform millions of generic examples.”
Amazon Bedrock AgentCore introduces Policy and Evaluations
Policy in Amazon Bedrock AgentCore allows teams to set boundaries on agent behaviour using natural language instructions. The system specifies which APIs, Lambda functions or third-party services like Salesforce and Slack an agent can access and under what conditions.
The feature addresses cases where agents might attempt unauthorised actions. A policy statement such as “Block all refunds to customers when the reimbursement amount is greater than $1,000”, for example, prevents agents from processing high-value refunds without human approval.
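The effect of such a policy can be pictured as a check that runs before an agent's tool call is executed. The Python below is only an illustration: the `ToolCall` type, the `process_refund` tool name and the `is_allowed` function are hypothetical, and AgentCore's actual policy format and enforcement mechanism are not described here.

```python
from dataclasses import dataclass

REFUND_LIMIT_USD = 1000  # threshold taken from the example policy statement


@dataclass
class ToolCall:
    tool: str                # hypothetical tool name requested by the agent
    amount_usd: float = 0.0  # refund amount, if the call is a refund


def is_allowed(call: ToolCall) -> bool:
    """Return False for refunds strictly greater than the limit."""
    if call.tool == "process_refund" and call.amount_usd > REFUND_LIMIT_USD:
        return False  # blocked: would need human approval
    return True
```

Under these assumptions a refund of exactly $1,000 passes, because the example statement blocks only amounts greater than the limit.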
AgentCore Evaluations, meanwhile, provides 13 pre-built evaluators covering correctness, helpfulness, tool selection accuracy, safety, goal success rate and context relevance. The service samples live agent interactions and can trigger alerts based on performance thresholds, such as sending an alert automatically if a customer service agent’s satisfaction scores drop 10% over eight hours.
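A threshold rule like the one in that example could be sketched as follows. This is an assumption-laden illustration, not the AgentCore Evaluations API: the `should_alert` function is hypothetical, and it interprets "drop 10%" as a relative fall from the earliest to the latest sampled score inside the window.

```python
from datetime import datetime, timedelta


def should_alert(samples, now, window=timedelta(hours=8), max_drop=0.10):
    """samples: list of (timestamp, score) pairs.

    Returns True when the relative drop from the earliest to the latest
    score inside the window reaches max_drop.
    """
    recent = sorted((t, s) for t, s in samples if now - window <= t <= now)
    if len(recent) < 2:
        return False  # not enough data to measure a change
    baseline, latest = recent[0][1], recent[-1][1]
    if baseline <= 0:
        return False  # avoid dividing by zero or a negative baseline
    return (baseline - latest) / baseline >= max_drop
```

Comparing only the first and last samples keeps the sketch short; a production rule would more plausibly compare averages over sub-windows to smooth out noise.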
Amazon Bedrock AgentCore Memory enables episodic learning
Episodic functionality in AgentCore Memory stores interaction experiences as discrete episodes containing context, reasoning, actions and outcomes. Another agent analyses these episodes to identify patterns that inform future decisions. When an agent encounters similar tasks, it retrieves relevant historical data rather than processing from scratch.
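The store-and-retrieve pattern can be illustrated with a small Python sketch. The `Episode` fields mirror the four elements named above, but the `EpisodeStore` class and its word-overlap similarity are hypothetical stand-ins; how AgentCore Memory actually indexes and matches episodes is not described in the announcement.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    context: str
    reasoning: str
    actions: list
    outcome: str


class EpisodeStore:
    """Toy in-memory store; similarity is naive word overlap (Jaccard)."""

    def __init__(self):
        self._episodes = []

    def add(self, episode: Episode) -> None:
        self._episodes.append(episode)

    def retrieve(self, task: str, top_k: int = 1) -> list:
        task_words = set(task.lower().split())

        def overlap(episode: Episode) -> float:
            words = set(episode.context.lower().split())
            union = task_words | words
            return len(task_words & words) / len(union) if union else 0.0

        ranked = sorted(self._episodes, key=overlap, reverse=True)
        # Drop episodes that share no words with the task.
        return [ep for ep in ranked[:top_k] if overlap(ep) > 0]
```

For example, after storing an episode whose context is "reschedule customer meeting", a call to `retrieve("reschedule a meeting for customer")` returns that episode, while an unrelated task returns an empty list.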
“Think about the experience at your favourite local restaurant. What makes it exceptional? It’s because the staff know your name. They know your seating preferences. They know your favourite dish,” Swami says. “In the agentic world, they have short-term memory for handling the immediate conversation and long-term memory that persists across sessions.”
S&P Global Market Intelligence deployed the capability across a distributed agent platform. Helene Astier, head of Technology, MI Enterprise Technology and Sustainability at S&P Global Market Intelligence, says the company built Astra, an internal agentic workflow platform, but faced orchestration challenges. “As hundreds of specialised agents emerged, managing state and maintaining consistent context became increasingly difficult, highlighting the need for a unified memory layer,” she says.



