Google's main privacy regulator in the European Union has opened an investigation into whether the company complies with EU data protection law regarding its use of personal information in training its generative AI.
Specifically, it is investigating whether the tech giant should have carried out a Data Protection Impact Assessment (DPIA) to proactively consider the risks its AI technology may pose to the rights and freedoms of individuals whose information was used to train its models.
Generative AI tools are notorious for producing plausible-sounding falsehoods. That tendency, combined with their ability to serve up personal information on demand, creates significant legal risk for their makers. The Irish Data Protection Commission (DPC), which oversees Google's compliance with the EU's General Data Protection Regulation (GDPR), has the power to impose fines of up to 4% of the global annual turnover of Google's parent company, Alphabet, if a violation is found.
Google has developed several generative AI tools, including a whole family of general-purpose large language models (LLMs) branded Gemini (formerly Bard). The company uses the technology to power AI chatbots and to enhance web search. Underpinning these consumer AI tools is a Google LLM called PaLM 2, which the company announced at its I/O developer conference last year.
The Irish DPC says it is investigating how Google developed the underlying AI model under Section 110 of the Irish Data Protection Act 2018, which transposed the GDPR into domestic law.
Training GenAI models typically requires vast amounts of data, and the types of information LLM makers acquire, as well as how and where they acquire it, are coming under increasing scrutiny in relation to a range of legal concerns, including copyright and privacy.
In the latter case, any information used to train AI, including the personal data of people in the EU, falls under EU data protection rules, whether it was scraped from the public internet or obtained directly from users. This is why several LLM makers have already faced questions related to privacy compliance and GDPR enforcement, including OpenAI, the maker of GPT (and ChatGPT), and Meta, which develops the Llama models.
Elon Musk-owned X has also drawn GDPR complaints and the DPC's ire for using people's data for AI training, leading to court proceedings in which X undertook to limit its data processing but was not sanctioned. X could still face a GDPR fine, however, if the DPC concludes that its processing of user data to train its AI tool Grok breached the regulation.
The DPC’s DPIA investigation into Google’s GenAI is the latest regulatory action in this area.
“The statutory investigation concerns the issue of whether Google complied with its obligation to carry out an assessment pursuant to Article 35 of the General Data Protection Regulation (Data Protection Impact Assessment) before engaging in the processing of personal data of EU/EEA data subjects relevant to the development of its underlying AI model, Pathways Language Model 2 (PaLM 2),” the DPC said in a press release.
It notes that DPIAs can be “crucial in ensuring that the fundamental rights and freedoms of individuals are appropriately considered and protected where the processing of personal data may result in high risks.”
“This statutory investigation forms part of the wider efforts of the DPC, working in conjunction with its EU/EEA [European Economic Area] peer regulators, in regulating the processing of the personal data of EU/EEA data subjects in the development of AI models and systems,” the DPC added, noting ongoing efforts by the EU's network of GDPR enforcement authorities to reach consensus on how best to apply the privacy law to GenAI tools.
Google has been contacted for comment on the DPC's investigation.