OpenAI announced in May that it was developing a tool that would allow creators to specify how their work is included or excluded from AI training data. But after seven months, this feature still hasn't seen the light of day.
The tool, called Media Manager, “identifies copyrighted text, images, audio, and video” and reflects creators' preferences “across multiple sources,” OpenAI said at the time. This was intended to fend off some of the company's fiercest critics and potentially shield OpenAI from intellectual property-related legal challenges.
But sources told TechCrunch that the tool was hardly considered a major announcement within the company. “I don’t think that was a priority,” said one former OpenAI employee. “To be honest, I don't remember anyone working on it.”
A person who coordinates work with OpenAI but is not employed by the company told TechCrunch in December that they had discussed the tool with OpenAI in the past, but that there had been no recent updates. (This person declined to be identified publicly because they were discussing confidential business matters.)
Additionally, Fred von Lohmann, a member of OpenAI's legal team who worked on Media Manager, transitioned to a part-time consultant role in October. An OpenAI spokesperson confirmed von Lohmann's move to TechCrunch via email.
OpenAI has yet to provide updates on Media Manager's progress, and the company has missed its self-imposed deadline to deploy the tool by 2025.
Intellectual property issues
AI models like OpenAI's learn patterns in sets of data in order to make predictions: for instance, that a person biting into a hamburger will leave a bite mark. This lets a model learn, to some extent, how the world works by observing it. ChatGPT can write persuasive emails and essays, and OpenAI's video generator Sora can produce relatively realistic footage.
The ability to draw on examples of writing, film, and more to create new works is what makes AI so powerful. But it also leads to regurgitation. When prompted in certain ways, models, most of which are trained on countless web pages, videos, and images, produce near-copies of that data, data that was never intended to be used this way.
For example, Sora can generate clips featuring the TikTok logo or characters from popular video games, and The New York Times got ChatGPT to quote its articles verbatim (OpenAI blamed the behavior on “hacking”).
This is understandably upsetting to creators whose work has been swept into AI training without their permission. Many have filed lawsuits.
OpenAI is fighting class-action lawsuits brought by artists, writers, YouTubers, computer scientists, and news organizations, all of whom claim the startup trained on their work illegally. Plaintiffs include writers Sarah Silverman and Ta-Nehisi Coates, visual artists, and media organizations such as The New York Times and Radio-Canada, to name a few.
OpenAI has been pursuing licensing deals with select partners, but not all creators find the terms attractive.
OpenAI offers creators several ad hoc ways to “opt out” of AI training. Last September, the company launched a submission form that lets artists flag their work for removal from future training sets. OpenAI has also long allowed webmasters to block its web-crawling bots from scraping data across their domains.
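For site owners, that domain-level block works through the standard robots.txt mechanism: OpenAI documents the user-agent name of its training crawler, GPTBot, and a disallow rule keeps it from scraping a site's pages. A minimal sketch, assuming a site wants to exclude everything (the directory in the second rule is a hypothetical example, and OpenAI's crawler documentation remains the authoritative source for current user-agent names):

    # Block OpenAI's GPTBot training crawler from the entire site
    User-agent: GPTBot
    Disallow: /

    # Alternatively, exclude only a specific directory (hypothetical path)
    # User-agent: GPTBot
    # Disallow: /archive/

A rule like this only governs future crawling, which is part of critics' complaint: it does nothing about material already collected, or about copies of a work hosted on other domains.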
However, creators have criticized these methods as haphazard and insufficient. There are no dedicated opt-out mechanisms for written works, videos, or audio recordings, and the image opt-out form requires submitting a copy of each image to be removed along with a description, a tedious process.
Media Manager was pitched as a complete revamp, and expansion, of OpenAI's opt-out solutions.
OpenAI said in its May announcement that Media Manager would leverage “cutting-edge machine learning research” to enable creators and content owners to tell “[OpenAI] what they own.” The company, which said it was working with regulators as it developed the tool, claimed it expected Media Manager to “set standards across the AI industry.”
Since then, OpenAI has not publicly mentioned Media Manager.
A spokesperson told TechCrunch that the tool was “still in development” as of August, but did not respond to a follow-up request for comment in mid-December.
OpenAI has given no indication of when Media Manager might launch, or even which features it might launch with.
Fair use
Assuming Media Manager ever arrives, experts aren't convinced it will allay creators' concerns or significantly contribute to resolving legal issues surrounding the use of AI and IP.
Adrian Cyhan, an IP attorney at Stubbs Alderton & Markiles, says Media Manager as described is an ambitious undertaking. Even large platforms like YouTube and TikTok struggle with content identification at scale. Could OpenAI really do better?
“Ensuring compliance with legally required creator protections and potential compensation requirements under consideration presents challenges,” Cyhan told TechCrunch, “especially given the rapidly evolving legal landscape and the potential for differing requirements across national and local jurisdictions.”
Ed Newton-Rex, founder of Fairly Trained, a nonprofit that certifies AI companies that respect creators' rights, believes Media Manager would unfairly shift the burden of controlling AI training onto creators; declining to use it could be read as tacit permission for their work to be used. “Most creators have never even heard of it, let alone used it,” he told TechCrunch. “But it will still be used to defend the mass exploitation of creative works against the wishes of their creators.”
Mike Borella, co-chair of MBHB's AI practice group, pointed out that opt-out systems do not necessarily account for transformations that may be applied to a work, such as a downsampled image. And opt-outs may not address the common scenario in which a third-party platform hosts a copy of a creator's content, added Joshua Weigensberg, an IP and media attorney at Pryor Cashman.
“Creators and copyright holders do not control, and often do not even know, where their work appears on the internet,” Weigensberg said. “Even if creators tell all AI platforms they want to opt out of training, those companies may well continue to train on copies of their work available on third-party websites and services.”
Media Manager may not be particularly advantageous for OpenAI, either, at least from a legal perspective. Evan Everist, a partner at Dorsey & Whitney who specializes in copyright law, said that while OpenAI could use the tool to show a judge it is mitigating its training on IP-protected content, Media Manager likely would not shield the company from damages if it were found to have infringed.
“Copyright owners have no obligation to proactively tell others not to infringe on their works before infringement occurs,” Everist said. “The basics of copyright law still apply: don’t take or copy someone else’s stuff without permission. This feature may be more about PR and positioning OpenAI as an ethical user of content.”
A reckoning
In the absence of Media Manager, OpenAI has implemented filters, albeit imperfect ones, to prevent its models from regurgitating training examples. And in the lawsuits it faces, the company continues to claim fair use protections, arguing that its models create transformative works rather than plagiarizing.
OpenAI has a good chance of winning a copyright dispute.
A court could decide that the company's AI has a “transformative purpose,” following the precedent set roughly a decade ago by the publishing industry's lawsuit against Google. In that case, the court held that it was permissible for Google to copy millions of books for Google Books, a kind of digital archive.
OpenAI has publicly stated that it would be “impossible” to train competitive AI models without using copyrighted material, licensed or not. “Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but it would not provide AI systems that meet the needs of today's population,” the company said in a January submission to the UK's House of Lords.
Should the courts ultimately decide in OpenAI's favor, Media Manager would serve little legal purpose. OpenAI appears willing to make that bet, or to reconsider its opt-out strategy.