Hey everyone, welcome to TechCrunch's regular AI newsletter.
This week in AI, new research shows that generative AI isn't actually all that harmful — at least not in an apocalyptic sense.
In a paper presented to the Association for Computational Linguistics' annual conference, researchers from the Universities of Bath and Darmstadt argue that models like Meta's Llama family cannot learn independently or acquire new skills without explicit instruction.
The researchers ran thousands of experiments to test the models' ability to complete tasks they'd never encountered before, such as answering questions about topics outside the scope of their training data. They found that while the models could superficially follow instructions, they were unable to master new skills on their own.
“Our study shows that fears that models will go off and do something totally unexpected, revolutionary and potentially dangerous are unfounded,” study co-author Harish Tayyar Madabushi, a computer scientist at the University of Bath, said in a statement. “The common view that this kind of AI is a threat to humanity both impedes the widespread adoption and development of these technologies and distracts from the real problems that need our attention.”
The study has limitations — the researchers didn't test the latest and most powerful models from vendors like OpenAI or Anthropic, and benchmarking models can be an inexact science — but it's not the first to find that today's generative AI techniques don't pose a threat to humanity, and that assuming otherwise risks leading to unfortunate policymaking.
In an op-ed for Scientific American last year, AI ethicist Alex Hanna and linguistics professor Emily Bender argued that corporate AI labs were misdirecting regulators' attention to fictional doomsday scenarios as a bureaucratic ploy. They pointed out that OpenAI CEO Sam Altman, who appeared at a congressional hearing in May 2023, suggested, without evidence, that generative AI tools could “go very wrong.”
“The public and regulators should not be fooled by such ploys,” Hanna and Bender write, “but rather turn to scholars and activists who have conducted peer review, resisted AI hype, and sought to understand its harmful effects here and now.”
As investors continue to pour billions of dollars into generative AI and the hype cycle nears its peak, their comments, and those of Tayyar Madabushi, are important points to keep in mind: There's a lot riding on the companies backing generative AI technology, and what's good for them and their backers may not be good for all of us.
Generative AI may not wipe out the human race, but it is already harming us in other ways: see the spread of non-consensual deepfake porn, false arrests via facial recognition, and hordes of underpaid data annotators. Ideally, policymakers will recognize this and share that view, or at least come around to it eventually. If not, humanity may well have something to fear.
News
Google Gemini and AI buzz: Google's annual Made By Google hardware event took place on Tuesday, where the company announced a slew of updates to its Gemini assistant, as well as new phones, earbuds, and smartwatches. Check out TechCrunch's roundup for the latest updates.
AI copyright lawsuit moves forward: A class action lawsuit brought by artists who claim Stability AI, Runway AI, and DeviantArt illegally trained AI on their copyrighted works can move forward, but only partially, a presiding judge ruled Monday. In a mixed ruling, some of the plaintiffs' claims were dismissed but others survived, meaning the case may eventually go to trial.
X and Grok issues: Elon Musk-owned social media platform X has been the target of a series of privacy complaints for using EU users' data to train AI models without users' consent. X has agreed to stop processing EU data for training Grok for the time being.
YouTube tests Gemini brainstorming: YouTube is testing an integration with Gemini to help creators brainstorm video ideas, titles, and thumbnails. The feature, called “Brainstorm with Gemini,” is currently available to select creators as part of a small, limited experiment.
OpenAI's GPT-4o does weird things: OpenAI's GPT-4o is the company's first model trained on voice as well as text and image data, so it sometimes behaves in strange ways, like mimicking the voice of the person talking to it or suddenly shouting in the middle of a conversation.
Research Paper of the Week
There are a number of companies out there offering tools that claim to be able to reliably detect text written by generative AI models, which would help, for example, to fight misinformation and plagiarism. But when I tested some of them a while back, the tools barely worked, and new research suggests that the situation hasn't improved much.
To measure the performance of AI text detectors, researchers at the University of Pennsylvania designed the Robust AI Detector (RAID), a dataset and leaderboard of over 10 million AI-generated and human-written recipes, news articles, and blog posts. The researchers found that the detectors they evaluated were “largely useless” (the researchers' words) — they only worked when applied to specific use cases or to text similar to the text they were trained on.
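To see how that kind of per-domain evaluation works, here's a minimal Python sketch; the naive keyword-based detector and the handful of samples are invented for illustration and aren't part of RAID, which spans more than 10 million documents across many generators and domains.

```python
# Hedged sketch: scoring an AI-text detector separately per domain,
# in the spirit of a benchmark like RAID. The detector and the samples
# below are toy stand-ins, not the benchmark's real data or models.
from collections import defaultdict

# (domain, text, is_ai_generated)
SAMPLES = [
    ("news",    "Officials said the measure passed after a lengthy debate.", False),
    ("news",    "In conclusion, the vote delves into a tapestry of outcomes.", True),
    ("recipes", "Whisk the eggs, then fold in the flour a little at a time.", False),
    ("recipes", "This dish delves into a delightful tapestry of flavors.", True),
    ("recipes", "Let's delve into grandma's stew, a tapestry of root vegetables.", False),
]

def naive_detector(text: str) -> bool:
    """Flag text as AI-generated if it leans on stock LLM phrasing."""
    tells = ("delve", "tapestry", "in conclusion")
    return any(t in text.lower() for t in tells)

def accuracy_by_domain(samples):
    """Score the detector separately for each domain of text."""
    hits, totals = defaultdict(int), defaultdict(int)
    for domain, text, label in samples:
        totals[domain] += 1
        hits[domain] += int(naive_detector(text) == label)
    return {d: hits[d] / totals[d] for d in totals}

if __name__ == "__main__":
    for domain, acc in accuracy_by_domain(SAMPLES).items():
        # The keyword heuristic misfires on the human-written recipe that
        # happens to use "delve", the kind of brittleness RAID surfaces.
        print(f"{domain}: {acc:.0%} accuracy")
```

Run against a realistic corpus, a per-domain breakdown like this is what exposes detectors that only succeed on text resembling whatever they were trained on.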
“If universities and schools were to rely on a narrow set of trained detectors to catch students' use of [generative AI] in writing assignments, they could falsely accuse students of cheating when they aren't,” Chris Callison-Burch, a co-author of the study and professor of computer and information science, said in a statement. “They could also miss students who are cheating by using [generative AI] to produce their homework.”
There doesn't seem to be a silver bullet when it comes to AI text detection – it's a hard problem to solve.
OpenAI itself has reportedly developed a new text detection tool for its AI models that's an improvement over the company's first attempt, but has declined to release it for fear that it could disproportionately affect non-English-speaking users and be rendered ineffective by minor changes to the text. (On the more charitable side, OpenAI is also said to be concerned about how its built-in AI text detector could affect the perception and use of its own products.)
Model of the Week
It looks like generative AI can be useful for more than just memes: MIT researchers are applying it to detect problems in complex systems like wind turbines.
A team at MIT's Computer Science and Artificial Intelligence Laboratory developed SigLLM, a framework that includes a component for converting time series data (repeated measurements taken over time) into text-based input that a generative AI model can process. A user can feed this prepared data to the model and ask it to identify anomalies, or use the model to forecast future time series data points as part of an anomaly detection pipeline.
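To make the idea concrete, here's a rough Python sketch of a pipeline in that spirit; the prompt wording, the `query_llm` stub (which just repeats the last reading rather than calling a real model), and the residual threshold are all assumptions for illustration, not the actual SigLLM implementation.

```python
import re

def serialize_series(values, decimals=1):
    """Render numeric readings as a compact comma-separated string for a text prompt."""
    return ",".join(f"{v:.{decimals}f}" for v in values)

def query_llm(prompt: str) -> str:
    """Placeholder for a generative model call: fakes a three-step forecast
    by repeating the last number found in the prompt."""
    numbers = re.findall(r"-?\d+\.\d+", prompt)
    return ",".join([numbers[-1]] * 3)

def detect_anomalies(history, observed, threshold=5.0):
    """Ask the 'model' to continue the series, then flag new readings whose
    deviation from the forecast exceeds the threshold."""
    prompt = (
        "The following are sensor readings from a wind turbine: "
        f"{serialize_series(history)}. Continue the sequence."
    )
    forecast = [float(x) for x in query_llm(prompt).split(",")]
    return [
        (i, obs)
        for i, (obs, pred) in enumerate(zip(observed, forecast))
        if abs(obs - pred) > threshold
    ]

if __name__ == "__main__":
    history = [10.2, 10.4, 10.1, 10.3, 10.2]
    observed = [10.3, 27.9, 10.2]  # the spike at index 1 should be flagged
    print(detect_anomalies(history, observed))  # -> [(1, 27.9)]
```

Swapping the stub for a real model call is where the serialization step matters: the series has to be rendered as text the model can read and continue.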
The framework didn't perform particularly well in the researchers' experiments, but if its performance could be improved, SigLLM could, for example, help technicians flag potential problems with heavy machinery and other equipment before they occur.
“Since this is just the first iteration, we didn't expect to get it right on the first try, but these results show that there's an opportunity here to leverage [generative AI models] for complex anomaly detection tasks,” Sarah Alnegheimish, a graduate student in electrical engineering and computer science and lead author of the SigLLM paper, said in a statement.
Grab Bag
OpenAI upgraded its AI-powered chatbot platform ChatGPT to a new base model this month, but hasn't published a changelog (well, barely any changelog at all).
There's a new GPT-4o model out in ChatGPT since last week. We hope you're all enjoying it, and check it out if you haven't. We think you'll like it 😃
— ChatGPT (@ChatGPTapp) August 12, 2024
So how are users supposed to know what changed, exactly? We have nothing to go on but anecdotal evidence from subjective testing.
I think Ethan Mollick, a professor at Wharton who studies AI, innovation, and startups, is right: writing release notes for generative AI models is hard because the models “feel” different from one interaction to the next, and the differences are largely a matter of vibes. But people are using ChatGPT, and paying for it, so don't they deserve to know what they're getting?
Perhaps the improvements are incremental and OpenAI believes it's unwise, for competitive reasons, to signal that. Less likely, the model has something to do with OpenAI's reported reasoning breakthroughs. In any case, transparency should be a priority when it comes to AI. Without transparency, there can be no trust, and OpenAI has already lost much of it.