The Results of Anthropic’s Claude AI Chrome Extension Pilot

By Kitty Wheeler

August 27, 2025

undefined mins

Share this article

Prioritise Us on Google

Share this article

Prioritise Us on Google

Anthropic launches a Chrome browser extension pilot to test Claude’s browser agent

Anthropic launches a pilot of a Chrome browser extension with Claude that allows users to automate tasks, but consequently faces cyber challenges

Anthropic, the AI company behind the Claude chatbot, is beginning to test a Chrome browser extension that allows its AI assistant to take actions directly within web browsers.

This is a unique development that could change how people interact with AI tools.

So far, the company is running a controlled pilot with 1,000 users on its Max subscription tier. The extension lets people ask Claude to perform tasks like clicking buttons, filling out forms and managing calendar appointments without switching between applications.

It’s a logical next step for Anthropic, which has spent recent months connecting Claude to calendars, documents and other software – but browser-based AI brings fresh challenges that the company is still working to solve.

The security risks emerging in early testing

The pilot is revealing some troubling vulnerabilities already.

Browser-using AI systems are susceptible to what researchers call prompt injection attacks – essentially digital tricks where bad actors hide malicious instructions in websites or emails to manipulate AI behaviour without users realising it.

For Anthropic, the company ran 123 different attack scenarios and found that without proper safeguards, malicious actors could successfully manipulate Claude 23.6% of the time.

One particularly concerning example involved a fake email that appeared to come from an employer, instructing that emails needed to be deleted for “security reasons.”

Claude dutifully followed these hidden instructions and began deleting the user’s messages without asking for confirmation.

Dario Amodei, Anthropic’s CEO

The implications then extended beyond deleted emails: “Prompt injection attacks can cause AIs to delete files, steal data or make financial transactions,” Anthropic warns in its technical documentation.

Yet these aren’t theoretical concerns. Anthropic’s researchers deliberately tested these attack vectors and found them surprisingly effective against an unprotected system.

Defensive measures showing promise but gaps remain

In response, Anthropic has developed several layers of protection.

One layer that users can maintain control through site-level permissions, allowing them to grant or revoke Claude’s access to specific websites.

The system also asks for confirmation before taking risky actions like making purchases or sharing personal information.

The company has blocked Claude from accessing certain high-risk categories entirely – financial services, adult content and pirated material are all off-limits.

Behind the scenes, Anthropic is building classifiers designed to spot suspicious patterns in instructions, even when they appear within seemingly legitimate contexts.

These measures have helped. The attack success rate dropped from 23.6% to 11.2% when Anthropic deployed its full suite of protections.

That’s better than the company’s existing “Computer Use” feature, which can see users’ screens but lacks the browser-specific safeguards being introduced here.

Prompt injection attack success rates across three scenarios: Anthropic's older computer use capability, new browser use product with only previous safety mitigations and new browser use product with new mitigations (lower scores are better) / Credit: Anthropic

For attacks specifically designed to exploit browser vulnerabilities – like hidden form fields that humans can’t see or malicious instructions embedded in URLs – Anthropic’s new defences proved more effective, reducing success rates from 35.7% to zero across four different attack types.

Real-world testing revealing new challenges

Despite the company’s advancements, Anthropic acknowledges that controlled laboratory testing only goes so far.

Real users browse differently than researchers do, visiting different sites and making different requests. Meanwhile, malicious actors continue developing new attack methods.

“Internal testing can’t replicate the full complexity of how people browse in the real world: the specific requests they make, the websites they visit and how malicious content appears in practice,” the company says.

That’s where the pilot comes in. Anthropic is looking for what it calls “trusted testers” – people comfortable with Claude taking actions on their behalf, but who don’t have setups that handle sensitive or safety-critical work.

The company is now being cautious about who gets access and how they use it.

Users can join a waitlist at claude.ai/chrome and Anthropic recommends starting with familiar websites while avoiding anything involving financial, legal or medical information.

The feedback from this pilot will help Anthropic refine both its AI models and its safety systems.

By seeing how prompt injection attacks actually play out in practice – rather than in controlled tests – the company hopes to teach Claude to recognise and resist these manipulations more effectively.

“We hope that you’ll share your feedback to help us continue to improve both the capabilities and safeguards for Claude for Chrome – and help us take an important step towards a fundamentally new way to integrate AI into our lives,” Anthropic says.

The Results of Anthropic’s Claude AI Chrome Extension Pilot

The security risks emerging in early testing

Defensive measures showing promise but gaps remain

Real-world testing revealing new challenges

Tags