Hey everyone, welcome to TechCrunch's regular AI newsletter.
This week in AI, music labels are suing two AI-powered music generator startups, Udio and Suno, for copyright infringement.
The RIAA, the trade group representing the US music recording industry, announced the lawsuits on Monday, filed by Sony Music Entertainment, Universal Music Group, Warner Records and others. The suits claim that Udio and Suno trained the generative AI models underlying their platforms on the labels' music without compensating those labels, and they seek up to $150,000 in damages for each work allegedly infringed.
“Synthetic musical output could saturate the market with machine-generated content and directly compete with, devalue and ultimately drown out the authentic voice recordings on which the Service is based,” the record companies said in their complaint.
These lawsuits join a growing number against generative AI vendors, including big names like OpenAI, that make broadly similar claims: that companies that train on copyrighted works must pay or at least credit rightsholders, and that rightsholders must be able to opt out of training if they wish. Vendors have long asserted fair use protections, arguing that the copyrighted data they use for training is publicly available and that their models create transformative works, not plagiarism.
So, what will the court rule? Well, dear reader, this is a billion-dollar question and one that will take a long time to resolve.
Given the growing evidence that generative AI models can nearly (emphasis on nearly) reproduce the copyrighted artworks, books, songs and so on that they were trained on, one might think copyright holders have an easy win. But some generative AI vendors may well emerge unscathed, thanks to a precedent set by Google.
More than a decade ago, Google began scanning millions of books to build an archive for Google Books, a sort of search engine for literary content. Authors and publishers sued Google over this practice, arguing that reproducing their intellectual property online amounted to infringement. But they lost the case. On appeal, the court ruled that Google Books' copying had a “very compelling transformative purpose.”
If plaintiffs can't prove that vendors' models are in fact plagiarists on a large scale, courts may find that generative AI also has a “compelling transformative purpose.” Or, as The Atlantic's Alex Reisner suggests, there may not be a single ruling on whether generative AI technology as a whole infringes. Judges could take into account each generated output and decide the winner on a model-by-model, case-by-case basis.
As my colleague Devin Coldewey succinctly put it in an article this week, "Not all AI companies leave their traces at the scene of the crime so liberally." And as the litigation progresses, AI vendors whose business models depend on the outcome will no doubt be taking detailed notes.
News
Advanced Voice Mode Postponed: OpenAI has postponed the release of Advanced Voice Mode, an eerily realistic, near-real-time conversational experience for its AI-powered chatbot platform ChatGPT. But OpenAI hasn't been idle: this week the company acquired remote collaboration startup Multi and released a macOS client for all ChatGPT users.
Stability gets a lifeline: Financially struggling Stability AI, developer of the open generative image model Stable Diffusion, has been rescued by a group of investors including Napster founder Sean Parker and former Google CEO Eric Schmidt. With its debt forgiven, the company has appointed former Weta Digital head Prem Akkaraju as its new CEO as part of a broader effort to regain its footing in the highly competitive AI industry.
Gemini comes to Gmail: Google is introducing a new Gemini-powered AI side panel to Gmail that helps you compose emails and summarize threads. The same side panel will also be coming to the rest of the search giant's suite of productivity apps, namely Docs, Sheets, Slides, and Drive.
Smashing's Talented Curators: Goodreads co-founder Otis Chandler launched Smashing, an AI and community-powered content recommendation app that aims to surface hidden gems on the internet and connect users with their interests. Smashing provides news summaries, key excerpts, and interesting quotes, automatically identifying topics and threads that interest individual users, and encouraging them to like, save, and comment on articles.
Apple Says No to Meta's AI: Days after The Wall Street Journal reported that Apple and Meta were in talks to integrate the latter's AI models, Bloomberg's Mark Gurman said the iPhone maker wasn't planning such a move. Apple has shelved the idea of putting Meta's AI on iPhones over privacy concerns and the optics of partnering with a social network whose privacy practices are often criticized, Bloomberg said.
Research Paper of the Week
Beware of Russian-influenced chatbots: they could be right in your neighborhood.
Earlier this month, Axios covered an investigation by the anti-misinformation group NewsGuard that found leading AI chatbots were regurgitating snippets of a Russian propaganda campaign.
NewsGuard fed 10 leading chatbots, including OpenAI's ChatGPT, Anthropic's Claude and Google's Gemini, dozens of prompts asking about Russian propaganda narratives, specifically stories allegedly created by the American fugitive John Mark Dougan. The company found that the chatbots responded with disinformation 32% of the time, presenting false Russian-authored reports as fact.
The study highlights the increased scrutiny AI vendors are facing as the U.S. election season approaches. Microsoft, OpenAI, Google and many other major AI companies agreed at the Munich Security Conference in February to take steps to curb the spread of deepfakes and election-related misinformation. But abuse of their platforms remains rampant.
“This report concretely illustrates why the industry must pay special attention to news and information,” NewsGuard co-CEO Steven Brill told Axios. “For now, you shouldn't trust the answers most of these chatbots provide when it comes to news-related issues, especially controversial ones.”
Model of the Week
Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) claim to have developed a model, DenseAV, that can learn language by predicting what it sees from what it hears, and vice versa.
The researchers, led by Mark Hamilton, a doctoral student in electrical engineering and computer science at MIT, developed DenseAV inspired by the non-verbal ways animals communicate. “We thought maybe we needed to use audio and video to learn language,” he told MIT CSAIL's press office. “How could we get an algorithm to watch TV all day and then understand what we're saying from that?”
DenseAV works with only two types of data, audio and visual, and processes them separately, "learning" by comparing pairs of audio and visual signals to determine which match and which don't. Trained on a dataset of 2 million YouTube videos, DenseAV can identify objects from their names and the sounds they make by finding and aggregating all possible matches between an audio clip and an image's pixels.
For example, if DenseAV hears a dog barking, one part of the model will focus on language, such as the word “dog,” while another part will focus on the dog's bark. The researchers say this shows that DenseAV can not only learn the meaning of words and the location of sounds, but also learn to distinguish between these “cross-modal” connections.
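To make that matching idea concrete, here is a minimal PyTorch sketch of a contrastive audio-visual objective in the spirit of what's described above. It's an illustration, not DenseAV's actual implementation: the audio and image encoders are assumed to exist elsewhere, and the tensor shapes, pooling choices and temperature below are placeholder values.

```python
import torch
import torch.nn.functional as F

def pairwise_scores(audio, image):
    """Score every (audio clip, image) pair in a batch via dense matching.

    audio: (B, T, D) per-timestep embeddings from an audio encoder.
    image: (B, P, D) per-patch embeddings from an image encoder.
    """
    audio = F.normalize(audio, dim=-1)
    image = F.normalize(image, dim=-1)
    # Dense similarity volume: sim[a, i, t, p] compares clip a's
    # timestep t with image i's patch p.
    sim = torch.einsum("atd,ipd->aitp", audio, image)
    # Aggregate the local matches into one score per pair:
    # best-matching patch per timestep, averaged over time.
    return sim.amax(dim=-1).mean(dim=-1)

def contrastive_loss(audio, image, temperature=0.07):
    """InfoNCE-style objective: true (clip, image) pairs sit on the
    diagonal and should outscore every mismatched pairing."""
    logits = pairwise_scores(audio, image) / temperature
    targets = torch.arange(logits.shape[0], device=logits.device)
    # Symmetric: audio-to-image and image-to-audio retrieval.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Random features standing in for real encoder outputs:
audio = torch.randn(4, 50, 256)   # 4 clips, 50 timesteps, 256-dim
image = torch.randn(4, 196, 256)  # 4 images, 14x14 patches, 256-dim
print(contrastive_loss(audio, image))  # scalar loss; lower is better
```

The pooling step here (max over patches, mean over time) is just one simple way to "aggregate all possible matches"; the paper's actual aggregation, and its split into language-focused and sound-focused heads, are more involved.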
Going forward, the team aims to create a system that can learn from vast amounts of video- or audio-only data, scale up their work with larger models, and integrate it with knowledge from language understanding models to improve performance.
Grab Bag
No one could accuse OpenAI CTO Mira Murati of not being consistently forthright.
Speaking at a conference at Dartmouth College's School of Engineering, Murati acknowledged that generative AI will indeed eliminate some creative jobs, but suggested that those jobs “maybe shouldn't exist in the first place.”
“I certainly expect that we will see a lot of jobs change, some jobs will be lost, and some new jobs will be created,” she continued. “The truth is, we still don't know very much about how AI will affect jobs.”
Creators weren't pleased with Murati's comments, and rightly so: callous rhetoric aside, OpenAI, like Udio and Suno, has faced lawsuits, criticism and regulatory scrutiny for profiting from artists' work without compensating them.
OpenAI has recently promised to release tools that give creators more control over how their work is used in its products, and it continues to strike licensing deals with copyright holders and publishers. But the company isn't pushing for universal basic income or spearheading any meaningful effort to reskill or upskill the workers its technology affects.
A recent Wall Street Journal article said contract work requiring basic writing, coding and translation is disappearing, and a study published last November showed that freelancers saw fewer jobs and a significant drop in income after the launch of OpenAI's ChatGPT.
OpenAI's stated mission, at least until it becomes a for-profit company, is to "ensure that artificial general intelligence benefits all of humanity," with AGI defined as AI systems that are generally smarter than humans. OpenAI hasn't achieved AGI yet, but it would be commendable if the company stayed true to the "benefit all of humanity" part and used even a small portion of its revenue (reportedly more than $3.4 billion) to compensate creators so they aren't swept away in the generative AI flood.
One can dream, right?