In the never-ending battle between Google and France's competition authority over copyright protection for news snippets, the Autorité de la Concurrence announced Wednesday that it is fining the tech giant 250 million euros (about $270 million at current exchange rates).
The competition watchdog says Google has disregarded some of its previous commitments to news publishers. But what makes this decision particularly noteworthy is that it also takes up something new: Google's use of news publishers' content to train its generative AI model, Bard/Gemini.
The competition authority faulted Google for failing to notify news publishers about this GenAI use of their copyrighted content. The finding rests on earlier commitments Google made that are meant to ensure it negotiates fair payments with publishers for reuse of their content.
Copyright and competition abuse
In 2019, the European Union passed a pan-EU digital copyright reform that extended copyright protection to news headlines and excerpts. News aggregators such as Google News, Discover, and the “Top Stories” feature box on search results pages had traditionally scraped these news articles and displayed them in their products without paying financial compensation.
Google initially tried to circumvent the law by switching off Google News in France. However, the competition authority quickly intervened, determining that the unilateral action was an abuse of a dominant market position that risked harming publishers. The intervention effectively forced Google to strike agreements with local publishers over content reuse. But in 2021, Google was fined $592 million after the authority found serious violations in its negotiations with local publishers and agencies.
The tech giant called the sanction “disproportionate” and said it would appeal. But it later sought to resolve the dispute by offering a series of commitments and withdrawing its appeal. The commitments, which the French authority accepted, include handing over key information to publishers and negotiating fairly.
Google has copyright agreements with hundreds of publishers in France, and these fall under its agreement with the Autorité. As such, its business in this area is highly regulated.
Not contesting the findings
Google agreed not to contest the Autorité's latest findings in exchange for an expedited procedure and payment of the fine.
But Sulina Connal, Google's managing director of news and publishing partnerships, expressed displeasure in a lengthy blog post: “The fines are not proportionate to the issues raised by the authorities.”
The blog post suggests that Google really wants to draw a line under this story this time around, with Connal also writing that the goal is a sustainable approach that connects people with quality content and works constructively with French publishers.
With generative AI in the mix and increased competition to launch tools, Google's calculus on how to approach the content reuse problem appears to be changing.
GenAI training in the frame
Today's enforcement action by the French competition authority shows that it was focused on Google's use of content from news publishers and agencies to train its AI models and the related AI chatbot service Bard (now called Gemini).
According to its press release, the authority found that Google used content from publishers and news agencies “without notifying copyright holders or authorities” to train Bard, the generative AI tool it launched in July 2023.
Google's defense on this point is twofold. In its blog post, the company writes that the competition authority “does not object to the way web content is used to improve new products, such as generative AI. This is already addressed in Article 4 of the EUCD [EU Copyright Directive].”
Article 4 of the Copyright Directive provides for an “exception or limitation for text and data mining”, in particular for “reproductions and extractions of lawfully accessible works and other subject matter for the purposes of text and data mining”.
However, the Autorité notes in its press release that it has not yet been determined whether the exemption applies here. (It is worth noting that the relevant clause refers to “lawfully accessible works.” Google's duty to notify copyright holders about uses of their protected works stems from its commitments to the competition authority, and in this case it evidently failed to meet it.)
“When it comes to stating whether the use of news content to train artificial intelligence services falls under the protection of neighboring rights, this question remains unanswered,” the competition authority wrote. “However, the Autorité believes that Google violated its commitment No. 1 by failing to notify publishers that their content was used to train Bard.”
Google's blog post also mentions the EU AI Act, suggesting it is relevant. However, the law has not yet come into force as it awaits final adoption by the European Council.
The incoming AI law also states that developers must follow the bloc's copyright rules. With that goal in mind, it introduces transparency requirements: developers must put in place a policy to respect EU copyright law and publish a “sufficiently detailed overview” of the content used to train general purpose AI models (e.g. Gemini/Bard).
This new requirement for model makers to publish summaries of their training data may make it easier in the future for news publishers whose protected content is ingested for GenAI training to obtain fair remuneration under EU copyright law.
No technical opt-out
The Autorité also points out that, until at least September 28, 2023, Google failed to provide a technical solution that would allow publishers and news agencies to opt out of having their content used to train Bard without affecting the display of their content on other Google services.
“Until that date, publishers and news organizations that wanted to opt out of this use case had to insert instructions to block indexing of all content from Google, including Search, Discover, and Google News services,” the authority wrote, adding: “In the future, the Autorité will carefully consider the effectiveness of Google's opt-out process.”
In more technical terms, from July to September 2023, news publishers could insert a “noindex” tag in their robots.txt files to ensure that their content was not used to train Google's AI models. The robots.txt file sits in the root folder of a web server and contains various instructions for search engines. Google's web crawlers check the instructions in these files when indexing websites.
However, the “noindex” tag meant a website would disappear from Google completely. In September 2023, Google became more granular and created the “Google-Extended” rule, which is different from the “noindex” rule. By opting out of Google-Extended, web publishers indicate that they do not want to help improve Gemini's current and future models.
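As a rough illustration of that granularity, the sketch below uses Python's standard urllib.robotparser module to evaluate a hypothetical robots.txt that blocks only the Google-Extended token while leaving ordinary crawling open. The file contents and the news.example.com URL are invented for illustration and are not taken from any publisher, and the standard-library parser only approximates how Google's own crawlers read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site (an assumption, not a real file):
# opt out of AI training via Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical article URL used only for the check below.
ARTICLE_URL = "https://news.example.com/politics/story.html"


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given crawler token may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The AI-training token is blocked, the search crawler is not.
    print("Google-Extended:", can_fetch("Google-Extended", ARTICLE_URL))
    print("Googlebot:", can_fetch("Googlebot", ARTICLE_URL))
```

Under these assumptions the search crawler stays allowed while the training token is refused, which is the separation publishers lacked before September 28, 2023.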
Other failings
The Autorité is also sanctioning Google over a number of other issues related to how it negotiated with French news publishers, finding that it failed to provide them with all the information needed to ensure fair negotiations over compensation for their content.
In its press release, the authority said Google's information to publishers about how it calculates the amount owed to them is “particularly opaque.”
It also found that Google did not meet non-discrimination standards aimed at ensuring equal treatment for publishers. And the Autorité criticized Google's decision to impose a “minimum threshold” on compensation, below which it would not pay publishers at all, saying it introduced discrimination among publishers “in very principle.” The press release states that below a certain threshold, all publishers are “arbitrarily assigned zero remuneration, regardless of their individual circumstances.”
Additionally, the Autorité found that Google's calculations regarding so-called “indirect revenue” were flawed, saying the proposed “package” did not comply with previous rulings or with an appeals court decision from October 2020.
It also said Google failed to honor its pledge to update compensation agreements in line with its commitments.