Google says its new family of AI models has an interesting feature: the ability to “identify” emotions.
The PaliGemma 2 family of models, announced Thursday, can analyze images, allowing the AI to generate captions and answer questions about the people in a photo.
“PaliGemma 2 generates detailed, contextual captions for images that go beyond simple object identification to describe the action, emotion, and overall story of a scene,” Google said in a blog post shared with TechCrunch.
Google says that PaliGemma 2 is based on its Gemma set of open models, specifically the Gemma 2 series.
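For a sense of how a model like this is used in practice, the sketch below shows roughly how a developer might pull a PaliGemma-style checkpoint from Hugging Face and ask it to caption an image with the transformers library. The checkpoint name, task prefix, and image URL are illustrative assumptions rather than details from Google's announcement, and PaliGemma 2's exact interface may differ.

```python
# A minimal sketch, assuming PaliGemma 2 checkpoints are published on Hugging Face
# the same way the original PaliGemma checkpoints were. The model ID, task prefix,
# and image URL below are illustrative assumptions, not details from Google's post.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma2-3b-pt-224"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).eval()

# Any RGB photo works; this URL is a placeholder.
image = Image.open(
    requests.get("https://example.com/photo.jpg", stream=True).raw
).convert("RGB")

# PaliGemma-style models are prompted with a short task prefix, e.g. "caption en".
inputs = processor(text="caption en", images=image, return_tensors="pt")

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)

# Decode only the newly generated tokens (the caption), not the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(output[0][prompt_len:], skip_special_tokens=True))
```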
Emotion recognition doesn't work out of the box; PaliGemma 2 has to be fine-tuned for the task. Even so, the experts TechCrunch spoke with were concerned about the prospect of an openly available emotion detector.
“This is very concerning to me,” Sandra Wachter, professor of data ethics and AI at the Oxford Internet Institute, told TechCrunch. “I find it problematic to assume that we can ‘read’ people's emotions. It's like asking a Magic 8 Ball for advice.”
For years, startups and tech giants alike have been trying to build AI that can detect emotions for everything from sales training to accident prevention. Although some claim to have achieved it, the science rests on shaky empirical evidence.
Many emotion detectors are inspired by the early work of psychologist Paul Ekman, who theorized that humans share six basic emotions: anger, surprise, disgust, joy, fear, and sadness. Subsequent research, however, has cast doubt on Ekman's hypothesis, showing significant differences in how people from different backgrounds express their feelings.
Mike Cook, a researcher at Queen Mary University who specializes in AI, told TechCrunch: “Humans experience emotion in complex ways, which makes detecting it impossible in the general case. Of course, we think we can tell what other people are feeling just by looking at them, and plenty of people have tried over the years, including spy agencies and marketing companies. I'm sure it's absolutely possible to detect some signifiers, but it's not something we can ever completely ‘solve.’”
The upshot is that emotion detection systems tend to be unreliable and biased by the assumptions of their designers. In a 2020 MIT study, researchers showed that face-analyzing models can develop unintended preferences for certain expressions, such as smiling. More recent research suggests that emotional analysis models assign more negative emotions to Black people's faces than to white people's faces.
Google said it conducted “extensive testing” to assess PaliGemma 2's demographic bias and found “low levels of toxicity and profanity” compared to industry benchmarks. However, the company did not provide a complete list of benchmarks used, nor did it say what types of tests were performed.
The only benchmark Google has disclosed is FairFace, a collection of facial photos of tens of thousands of people. The company claims that PaliGemma 2 scored well on FairFace. But some researchers have criticized the benchmark as a measure of bias, noting that FairFace represents only a handful of racial groups.
“Interpreting emotions is a fairly subjective matter that extends beyond the use of visual cues and is heavily embedded in personal and cultural context,” said Heidy Khlaaf, chief AI scientist at the AI Now Institute, a nonprofit organization that studies the social impact of artificial intelligence. “AI aside, research has shown that we cannot infer emotions from facial features alone.”
Emotion detection systems have drawn the ire of regulators abroad, who have sought to limit the use of the technology in high-risk contexts. The AI Act, the main piece of AI legislation in the EU, prohibits schools and employers (but not law enforcement agencies) from deploying emotion detectors.
The biggest concern with open models like PaliGemma 2, which is available from a number of hosts including the AI development platform Hugging Face, is that they can be abused or misused to cause real-world harm.
“If this so-called emotion identification is built on pseudoscientific presumptions, there are significant implications in how this capability may be used to further, and falsely, discriminate against marginalized groups, such as in law enforcement, human resources, border control, and so on,” Khlaaf said.
Asked about the risks of releasing PaliGemma 2 to the public, a Google spokesperson said the company stands behind its tests for “expressive harm” as they relate to visual question answering and captioning. “We conducted robust evaluations of the PaliGemma 2 models with respect to ethics and safety, including child safety and content safety,” they added.
Wachter isn't convinced that's enough.
“Responsible innovation means that you think about the consequences from the first day you step into the lab and continue to do so throughout the product lifecycle,” she said. “There are countless potential problems [with models like this] that could lead to a dystopian future, where your emotions determine whether you get a job, get a loan, or get into college.”