Reddit Sues Perplexity for ‘Industrial-Scale’ Data Scraping
TL;DR: Reddit is suing Perplexity and three data scraping service providers—SerpApi, Oxylabs, and AWMProxy—alleging they unlawfully circumvent data protections to obtain Reddit content. The lawsuit claims Perplexity buys scraped data rather than entering a licensing agreement like competitors OpenAI and Google.
Reddit has filed a lawsuit to stop what it calls “industrial-scale, unlawful circumvention of data protections” by Perplexity and three data scraping companies. The complaint alleges that whilst some AI companies have entered licensing agreements with Reddit, Perplexity instead purchases scraped data from third-party providers who bypass Reddit’s protections.
The Data Laundering Economy
Ben Lee, Reddit’s chief legal officer, describes the situation as an “industrial-scale ‘data laundering’ economy” driven by AI companies’ competition for quality human content. “Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material,” Lee states. “Reddit is a prime target because it’s one of the largest and most dynamic collections of human conversation ever created.”
The lawsuit compares the data scraping companies to “would-be bank robbers” who, “knowing they cannot get into the bank vault, break into the armoured truck carrying the cash instead.” Reddit alleges that SerpApi, Oxylabs, and AWMProxy mask their identities and disguise their web scrapers to steal Reddit content from Google Search results.
Evidence of Continued Scraping
Reddit sent a cease-and-desist letter to Perplexity in May 2024 demanding it stop scraping Reddit data. Perplexity responded that it didn’t use Reddit content to train AI models and would respect Reddit’s robots.txt file. According to Reddit, however, the volume of Reddit citations on Perplexity increased after that letter.
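For readers unfamiliar with robots.txt, the sketch below illustrates what "respecting" it means for a well-behaved crawler: checking the site's published rules before fetching a URL. The rules, the `ExampleBot` user agent, the example.com URL, and the `may_fetch` helper are all hypothetical and chosen for illustration; they are not Reddit's actual robots.txt or any party's crawler code. The check uses Python's standard `urllib.robotparser` module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (NOT Reddit's real robots.txt):
# one named crawler is allowed everywhere, every other crawler is disallowed.
EXAMPLE_RULES = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
""".splitlines()


def may_fetch(user_agent: str, url: str, rules=EXAMPLE_RULES) -> bool:
    """Return True if the given rules permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules)  # parse in-memory lines; no network request is made
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/test/some-post"  # placeholder URL
    for agent in ("Googlebot", "ExampleBot"):
        print(agent, "allowed:", may_fetch(agent, url))
```

robots.txt is a voluntary convention rather than an access control: it only constrains crawlers that choose to run a check like this one. It also says nothing about copies of a page that appear elsewhere, such as in search engine results, which is the route Reddit's complaint alleges was used.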
To test Perplexity’s claims, Reddit created a post that could only be crawled by Google. “Within hours,” Reddit alleges, Perplexity “produced the contents” of that post. “The only way that Perplexity could have obtained that Reddit content is if it and/or its co-defendants scraped Google search results for that Reddit content,” the complaint states.
Market Context
Reddit’s data, which consists of posts on diverse topics written and ranked by humans, is highly valuable for training AI models. The company’s controversial 2023 API changes, which sparked widespread protests, were positioned as a way for Reddit to be compensated for that data. Reddit has struck licensing deals with OpenAI and Google, and reportedly seeks improved terms with existing partners. The company previously took legal action against Anthropic over similar scraping allegations.
Looking Forward
Jesse Dwyer, Perplexity’s head of communication, responded: “Perplexity has not yet received the lawsuit, but we will always fight vigorously for users’ rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI.”
The case highlights the tension between AI companies seeking training data and platforms attempting to monetise their user-generated content through licensing agreements rather than allowing unrestricted scraping.
Source Attribution:
- Source: The Verge
- Original: https://www.theverge.com/news/804660/reddit-suing-perplexity-data-scrapers-ai-lawsuit
- Published: 22 October 2025
- Author: Jay Peters