Social media platform Reddit has launched legal proceedings against artificial intelligence company Anthropic, accusing the firm of unlawfully scraping user-generated content to train its AI assistant, Claude.
Filed on Wednesday in the California Superior Court in San Francisco, the lawsuit alleges that Anthropic used automated tools to extract Reddit users’ comments without permission, despite being explicitly told not to do so. According to Reddit, this content was then used to train the Claude chatbot without user consent or proper licensing.
“AI companies should not be allowed to scrape information and content from people without clear limitations on how they can use that data,” said Reddit’s Chief Legal Officer, Ben Lee. He emphasised the platform’s commitment to safeguarding its community, which generates vast volumes of public discourse each day.
Reddit, which went public last year, has previously struck licensing deals with AI developers such as OpenAI and Google. These arrangements, the company says, allow for transparent and lawful use of data while enabling user protections like content deletion and spam prevention.
“These partnerships allow us to enforce meaningful safeguards for our users,” Lee added, underlining the contrast between authorised data access and Anthropic’s alleged misconduct.
Anthropic, founded by ex-OpenAI executives in 2021 and now backed heavily by Amazon, denies the allegations. “We disagree with Reddit’s claims and will defend ourselves vigorously,” the company said in a brief statement.
The legal complaint takes a different route from other recent cases involving AI firms. Rather than alleging copyright infringement, as seen in lawsuits from music publishers targeting Claude for replicating song lyrics, Reddit’s filing focuses on breach of contract and unfair competition. Specifically, it accuses Anthropic of violating Reddit’s terms of service by scraping data without a proper licence.
The lawsuit cites a 2021 research paper co-authored by Anthropic CEO Dario Amodei, in which Reddit’s subforums were explicitly mentioned as high-quality sources for training AI systems. Forums covering topics like gardening, history, and personal advice were listed as especially valuable for modelling human language patterns.
Anthropic has previously claimed, including in a 2023 letter to the US Copyright Office, that its methods of training Claude constitute lawful use, describing them as a form of statistical analysis rather than reproduction of content.