What do MrBeast, John Oliver, and The Wall Street Journal have in common? Transcripts of their YouTube videos have been collected and used to train AI models by companies like Anthropic, Nvidia, Apple, and Salesforce.
An investigation by Wired and Proof News found that a dataset called “YouTube Subtitles” contains transcripts of more than 173,000 YouTube videos from more than 48,000 channels.
AI scraping has become an issue across the tech industry, and some are pushing back. Jingna Zhang, artist and founder of the app Cara, has tried to protect artists by building social platforms that don't sell them out. Researchers at the University of Chicago are working on Nightshade, a tool that can limit what AI can learn from images by “poisoning” them.
But is there really any way for creators to protect themselves from being scraped next? Read the TechCrunch Minute to find out more.