Cloudflare on Monday announced plans to launch a marketplace sometime next year where website owners can charge AI model providers for access to scrape their site's content. The marketplace is the latest step in Cloudflare CEO Matthew Prince's larger plan to give publishers more control over when and how AI bots scrape their websites.
“If you don't compensate creators in some way, they're going to stop creating, and that's what needs to be solved,” Prince said in an interview with TechCrunch.
As a means to get there, Cloudflare on Monday launched a free observability tool for customers called AI Audit. Website owners get a dashboard to view analytics about why, when and how often AI models crawl their site for information. Cloudflare also lets customers block AI bots from their site with the click of a button. Website owners can use AI Audit to block all web scrapers, or they can let specific ones through if they have a deal or determine scraping is profitable.
A demo of AI Audit shared with TechCrunch showed how website owners can use the tool to see how AI models are scraping their site. Cloudflare's tool shows where each scraper visiting your site is coming from, and lets you select a window to see how many times scrapers from OpenAI, Meta, Amazon, and other AI model providers have visited your site.
AI Audit Demo. (Cloudflare)
Cloudflare is trying to address a looming problem in the AI industry: how can small publishers survive in the AI era if people go to ChatGPT instead of their own websites? Currently, AI model providers are collecting information from thousands of small websites to feed their large language models (LLMs). While some large publishers have made deals with OpenAI to license their content, most websites get nothing, yet their content is still being fed into popular AI models every day. This could disrupt the business models of many websites and reduce the traffic they desperately need.
This summer, AI-powered search startup Perplexity was accused of scraping websites that had used the Robots Exclusion Protocol to deliberately indicate they didn't want to be crawled. Shortly after, Cloudflare released a button that lets customers block all AI bots with one click.
“It was born out of a lot of frustration where people felt their content was being stolen,” Prince said.
Some website owners told Business Insider that AI bots are scraping their websites so often that it feels like a DDoS attack is taking down their servers. Beyond the frustration of having your site scraped, heavy crawling can raise your cloud bills and degrade your service.
But what if you want to block Perplexity bots but not OpenAI bots? Prince told TechCrunch that Cloudflare customers want tools that let them choose which AI models can access their sites. Cloudflare's new tool, released today, lets customers block some AI crawlers while letting others through.
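For a sense of what per-bot rules look like at the simplest level, here is a minimal sketch using the Robots Exclusion Protocol mentioned above and Python's standard `urllib.robotparser`. It does not show how Cloudflare's tool works internally, which the company has not detailed here. The `robots.txt` contents, the `allowed` helper, and the example URL are illustrative assumptions, and robots.txt is only advisory: a crawler that ignores it is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn away Perplexity's crawler, let OpenAI's through.
# The user-agent tokens are the ones these crawlers are commonly reported to use.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: GPTBot
Allow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network request
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("PerplexityBot", "GPTBot"):
        print(bot, allowed(bot, "https://example.com/article"))
```

Because compliance is voluntary, a rule like this only works for crawlers that choose to honor it, which is why enforcement at the network level is what publishers are asking for.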
Prince said that even major publishers that have licensing agreements with OpenAI, such as TIME, Condé Nast, and The Atlantic, have little idea of the extent to which ChatGPT is scraping their websites. Many publishers have to take OpenAI at its word, yet the answer determines whether they struck a good licensing deal.
But Cloudflare's Marketplace, which will launch sometime next year, aims to enable smaller publishers to strike deals with AI model providers as well.
“Let's give you the ability to do what only Reddit and Quora and the big publishers of the world have done up until now,” Prince said. “What if we could essentially set a price for people to access these systems and ingest their content?”
Though it's a bold idea, Cloudflare hasn't released a fully fleshed-out picture of what the marketplace would look like. Prince said websites could charge AI model providers a fee based on the rate at which they scrape individual websites, but it's unclear how much they would actually pay. He added that websites could also charge a monetary fee for scraping or ask to be given credits by the AI lab. Details are scarce.
While AI companies may be initially reluctant to pay for content that's currently available for free, Cloudflare's CEO said he thinks this is ultimately a good thing for the AI ecosystem. Prince said the status quo of some AI companies not paying for content at all is not sustainable.