Cloudflare Accuses Perplexity of "Stealth Crawling" Blocked Sites

The fight over how AI companies access online content just escalated.

Cloudflare says AI search startup Perplexity has been disguising its web crawlers to sidestep restrictions, a practice known as “stealth crawling.” In a detailed report, the internet infrastructure giant claims Perplexity’s bots change their identity and rotate IP addresses to get around blocks.

According to Cloudflare, this behavior isn’t rare. The company says it’s been observed across tens of thousands of domains, amounting to millions of requests per day.
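To see why a disguised identity matters, it helps to look at how a site's robots.txt rules are matched. The sketch below is only an illustration, not Cloudflare's detection method or Perplexity's code: the bot name `ExampleAIBot`, the browser string, and the URL are all hypothetical. It uses Python's standard `urllib.robotparser` to show that a rule written against a declared crawler name stops applying once the same request arrives under a generic browser user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one declared AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical user-agent strings, for illustration only.
DECLARED_UA = "ExampleAIBot"
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/124.0 Safari/537.36"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
print("declared crawler allowed:", is_allowed(DECLARED_UA, url))
print("browser-like agent allowed:", is_allowed(BROWSER_UA, url))
```

Keep in mind that robots.txt is a voluntary convention: it only works if a crawler identifies itself honestly and chooses to comply. That is why name-based rules alone can't stop a bot that changes its identity.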

Perplexity isn’t taking the accusation quietly. In a rebuttal, the startup denied intentional wrongdoing, called the report a “publicity stunt,” and insisted Cloudflare conflated legitimate user-driven requests with automated bot activity.

So what’s really going on, and why does it matter?

I got the scoop from Marketing AI Institute founder and CEO Paul Roetzer on Episode 161 of The Artificial Intelligence Show.

This Is About the Rules of the Web

At first glance, this might sound like a niche technical dispute. But it's really about whether AI companies respect the boundaries set by publishers and site owners.

The fear that they won’t isn’t hypothetical. The New York Times’ lawsuit against OpenAI and Microsoft hinges on similar allegations: that companies bypassed protections to gather data. And in Perplexity’s case, there’s history. CEO Aravind Srinivas has previously spoken openly about accessing platforms against their terms of service.

"When you're on record saying you constantly do these kinds of things, it's really hard to have credibility when you come out saying, 'No, we're not doing anything wrong,'" says Roetzer.

Right now, however, Perplexity is arguing that its AI assistants aren’t “traditional” crawlers. Instead of systematically scraping and storing the web, they fetch specific pages in real time when a user asks a question, then discard them.

So, the company essentially says it's using AI agents to help users, not to scrape content. You can expect more messiness around this topic, says Roetzer.

"We're going to have this very prolonged transitional phase where we start running into these kinds of issues," he says.

As AI agents make up more and more of website traffic, the lines between helping users and harming website owners may start blurring pretty fast.

The stakes for publishers are high. Block these AI-driven agents, and your content may vanish from the emerging chatbot and assistant economy. Allow them, and you risk losing control over how (and by whom) your work is consumed.

The Tip of the Spear

For Roetzer, the Perplexity–Cloudflare spat is just the beginning.

"It's just the tip of the spear," he says. "There's a lot more coming."

And as AI systems become more embedded in daily online interactions, businesses will need people whose job is to navigate exactly these kinds of challenges.

If the past few years were about AI learning from the web, the next few will be about the web deciding how and whether it wants to cooperate. And in that battle, stealth crawlers may just be the first shots fired.
