Why Reddit Sues Anthropic: The Dangers of AI Data & Privacy

Reddit files a lawsuit against Anthropic, creator of Claude AI, alleging it made more than 100,000 unauthorised requests to scrape user posts and comments to train its LLMs

Reddit has filed a lawsuit against Anthropic, the AI company behind the Claude chatbot, alleging unauthorised scraping of user content for AI model training.

The case, filed in California state court, centres on claims that Anthropic made more than 100,000 unauthorised requests to Reddit’s servers to collect user posts, comments and other content without permission.

The social media company alleges this occurred despite Anthropic’s public statements that it had ceased such practices.

The impact of Anthropic bypassing Reddit’s technical protections

Reddit has established licensing agreements with major technology firms, including OpenAI, the creator of ChatGPT, and Google’s parent Alphabet. These structured deals include provisions for content usage, privacy safeguards and data deletion procedures.

However, according to Reuters, the lawsuit alleges that Anthropic ignored the directives in Reddit’s robots.txt file, a standard web convention that tells automated systems which parts of a website they should not access.


This technical measure serves as a digital “no trespassing” sign for web crawlers and scraping bots.
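To illustrate how a well-behaved crawler might honour such a sign, here is a minimal sketch using Python’s standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` and `OtherBot` names and the example.com URLs are all hypothetical; they are not Reddit’s actual rules or any real crawler’s identity. Note that robots.txt is advisory: nothing in the protocol itself stops a bot that chooses not to check it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# this is NOT Reddit's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# ExampleBot is disallowed everywhere by its own rule group.
print(is_allowed(parser, "ExampleBot", "https://example.com/r/python/"))   # False
# Other bots fall back to the "*" group, which only blocks /private/.
print(is_allowed(parser, "OtherBot", "https://example.com/r/python/"))     # True
print(is_allowed(parser, "OtherBot", "https://example.com/private/data"))  # False
```

A compliant crawler would run a check like `is_allowed` before every request and skip any URL for which it returns `False`.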

Reddit claims Anthropic ignored these restrictions and violated the platform’s terms of service by collecting user content without authorisation. The complaint includes allegations that Anthropic gathered deleted posts, raising concerns about user privacy and data retention practices.

Reddit further alleges that Anthropic declined to pursue a formal licensing agreement, instead opting to scrape content directly from the platform. This approach, Reddit claims, allowed Anthropic to avoid licensing fees whilst bypassing user protection measures.


The lawsuit references a 2021 research paper co-authored by Anthropic CEO, Dario Amodei, which identified Reddit as a valuable source of training data for language models. 

Amodei co-founded Anthropic after serving as vice president of research at OpenAI.

The consequences of inadequate AI safeguarding 

Reddit presents evidence that Claude reproduced Reddit posts with near-perfect accuracy, including content that users had subsequently deleted from the platform.

The social media company argues this demonstrates Anthropic’s failure to implement adequate safeguards for user privacy and content removal requests.

Reddit also contends that Anthropic’s actions violated principles of fair competition, because it accessed Reddit’s data without compensation while competitors paid licensing fees for similar access.

“For its part, despite what its marketing material says, Anthropic does not care about Reddit’s rules or users,” the lawsuit states, according to AI News.

“It believes it is entitled to take whatever content it wants and use that content however it desires, with impunity.”

As a result, Reddit seeks financial damages and a court injunction preventing Anthropic from using Reddit content in future model training or development.

Key facts:
  • Reddit is suing Anthropic for allegedly making more than 100,000 unauthorised requests to scrape user posts and comments to train its AI models
  • Reddit claims Anthropic bypassed technical protections, violated terms of service, and refused to enter a licensing agreement
  • The lawsuit highlights broader industry tensions over data rights, user privacy and ethical AI development

Reddit argues that unauthorised scraping undermines its business model and violates user trust.

Anthropic’s history of copyright challenges

This legal action is the latest challenge to Anthropic’s data collection practices. In August 2024, a group of authors filed a class-action lawsuit claiming Anthropic used copyrighted books without permission to train its AI models.

The authors sought compensation for the unauthorised use of their written works.

Additionally, Universal Music Group and other music publishers filed a separate lawsuit in October 2023, alleging that Claude reproduced copyrighted song lyrics without authorisation.

The music companies claimed this violated their intellectual property rights and requested court orders blocking further use of their content.

Yet Reddit’s case differs from previous copyright-focused lawsuits by emphasising breach of contract and unfair competition rather than intellectual property violations.

The platform argues that user-generated content on its site remains subject to terms of service that Anthropic knowingly violated.

An Anthropic spokesperson says the company disagrees with Reddit’s claims and intends to defend against the lawsuit.

The AI company has not provided detailed responses to the specific allegations.

Broader implications for the AI sector

The legal dispute reflects broader tensions within the AI industry over training data acquisition.


As AI companies require increasingly large datasets to develop competitive models, conflicts over data rights and usage permissions have become more frequent.

Web scraping, the automated extraction of data from websites, operates in a legal grey area.

While publicly available information can often be accessed, terms of service agreements and technical restrictions can establish legal boundaries for data collection.

The lawsuit highlights contradictions between Anthropic’s public commitments to ethical AI development and its alleged data collection practices.

Reddit claims these inconsistencies mislead users and competitors about the company’s actual approach to data acquisition.




AI Magazine is a BizClik brand