YouTubers file class action lawsuit over OpenAI scraping creator transcripts

By TechBrunch | August 5, 2024 | 4 Mins Read


A YouTube creator is seeking to bring a class action lawsuit against OpenAI, alleging that the company trained its generative AI models on millions of transcripts of YouTube videos without notifying or compensating the videos' owners.

In a complaint filed last Friday in the U.S. District Court for the Northern District of California, lawyers for Massachusetts-based YouTube user David Millette allege that OpenAI secretly transcribed videos by Millette and other creators to train the models that power its chatbot platform ChatGPT and its other generative AI tools and products. By collecting this data, OpenAI “significantly profited” from the creators' work while violating copyright law and YouTube's terms of service, which prohibit the use of the videos in apps unrelated to YouTube's service, according to the complaint.

“As [OpenAI's] AI products become more sophisticated through the use of training datasets, they become more valuable to potential and current users who purchase subscriptions to access [OpenAI's] AI products,” the complaint reads, “but much of the material contained in OpenAI's training datasets comes from works that OpenAI copied without consent, credit, or compensation.”

Millette, who is represented by the law firm Bursor & Fisher, is seeking a jury trial and more than $5 million in damages on behalf of all YouTube users whose data may have been collected for OpenAI's training.

Generative AI models like OpenAI's have no actual intelligence. Fed a huge number of examples (movies, audio recordings, essays, etc.), a model “learns” how likely data is to occur based on patterns, including the context of any surrounding data.
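
As a rough illustration of what “learning the likelihood of data from patterns” means, here is a minimal sketch of a bigram model that counts which word tends to follow which. It is a toy under stated assumptions, not how OpenAI's models are built: the corpus and the function names `train_bigram` and `next_word_probs` are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    following = counts[word]
    total = sum(following.values())
    if total == 0:
        return {}  # the word never appeared with a successor
    return {w: c / total for w, c in following.items()}


# Hypothetical three-sentence training set.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
print(max(probs, key=probs.get))  # most likely word after "the"
```

In this toy corpus, “cat” follows “the” in two of six cases, so the model rates it the most likely continuation. Production models use far richer context than a single preceding word, but the principle of estimating likelihood from observed examples is the same.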

Most models are trained with data taken from public websites or web datasets. Companies argue that fair use protects their efforts to indiscriminately collect data and use it to train commercial models. But many copyright holders disagree and have filed lawsuits to block the practice.

In a sense, video transcriptions have become an important component of training data as other data sources become scarce.

According to data from Originality.AI, more than 35% of the world's top 1,000 websites currently block OpenAI's web crawlers. And a study by MIT's Data Provenance Initiative found that roughly 25% of “high-quality” sources of data are restricted from major datasets used to train AI models. If current trends in access blocking continue, research group Epoch AI predicts that developers will run out of data to train generative AI models between 2026 and 2032.
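
One common way a site blocks a crawler is a rule in its robots.txt file. The sketch below assumes that mechanism and uses Python's standard `urllib.robotparser` to test whether a given user agent may fetch a page. `GPTBot` is the user agent OpenAI documents for its crawler; the sample robots.txt, the `is_blocked` helper, and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out GPTBot but allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt, user_agent, url="https://example.com/"):
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(SAMPLE_ROBOTS_TXT, "GPTBot"))        # True
print(is_blocked(SAMPLE_ROBOTS_TXT, "SomeOtherBot"))  # False
```

Note that robots.txt is a voluntary convention: it records a site's wishes, and compliance depends on the crawler honoring it.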

The New York Times reported in April that OpenAI created its first speech recognition model, Whisper, to transcribe audio from videos and gather additional training data. According to the Times, an OpenAI team that included OpenAI president Greg Brockman used Whisper to transcribe more than 1 million hours of YouTube videos and then used the transcripts to train OpenAI's text generation and analysis model, GPT-4.

According to the Times, some OpenAI staffers discussed how such a move could violate YouTube's rules.

In July, Proof News reported that companies including Anthropic, Apple, Salesforce, and Nvidia had trained generative AI models using a dataset called The Pile, which contains subtitles from hundreds of thousands of YouTube videos. Many YouTube creators whose subtitles were included in The Pile were unaware of this and had not consented to it. Apple later issued a statement saying it had no intention of using these models for AI features in its products.

YouTube's parent company, Google, is also looking to use transcripts to train models.

Last year, Google expanded its Terms of Service (ToS) to allow it to use more user data to train its generative AI models. The old ToS left it unclear whether Google could use YouTube data to build products outside of its video platform. The new ToS is considerably more permissive and explicit.

We have reached out to OpenAI and Google for comment on the class action lawsuit and will update this article if we hear back.

It's been a rough start to the month for OpenAI.

Elon Musk, CEO of Tesla and owner of X, filed a new lawsuit against OpenAI and CEO Sam Altman on Monday, accusing the company of abandoning its non-profit mission by reserving some of its most sophisticated technology for commercial customers. Musk made the same claims in a lawsuit he filed against OpenAI in February, but the new suit also alleges that OpenAI engaged in fraudulent practices.


