Hey everyone, welcome to TechCrunch's regular AI newsletter.
This week in AI, generative AI has begun spamming academic publications, a disturbing new development in the field of disinformation.
In a post for Retraction Watch, a blog that tracks retractions of academic research, assistant professors of philosophy Tomasz Żuradzki and Leszek Wroński wrote about three journals published by Addleton Academic Publishers that appear to be composed entirely of AI-generated papers.
The journals publish papers that follow the same template, chock-full of buzzwords like “blockchain,” “metaverse,” “internet of things,” and “deep learning.” They list identical editorial boards (ten members of which are deceased) and an unassuming address that appears to be a residence in Queens, New York.
So what’s the problem, you may ask? Isn’t encountering AI-generated spam content just a cost of doing business on the internet these days?
Yes, but the fake journals show how easy it is to manipulate the systems used to evaluate researchers for promotion and hiring – and this could be a harbinger for knowledge workers in other industries.
At least according to one widely used rating system, CiteScore, these journals are ranked in the top 10 in the field of philosophical research. How is this possible? Because they frequently cross-cite each other (which CiteScore takes into account in its calculations). Żuradzki and Wroński found that of 541 citations in one of Addleton's journals, 208 pointed to other fake publications by the same publisher.
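To make the scale of the cross-citation problem concrete, here is the share of that journal's citations that stayed inside the publisher, using the figures Żuradzki and Wroński report (the calculation is illustrative arithmetic, not CiteScore's actual formula):

```python
# Figures reported by Żuradzki and Wroński for one Addleton journal.
total_citations = 541
intra_publisher_citations = 208

# Share of citations pointing back to the publisher's own fake journals.
share = intra_publisher_citations / total_citations
print(f"{share:.1%} of citations stayed within the publisher")  # 38.4%
```

Because CiteScore counts those citations like any others, a closed loop of fake journals citing one another can inflate its own ranking.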
“[These rankings] frequently serve as indicators of research quality for universities and funding bodies,” Żuradzki and Wroński write. “They play an important role in decisions regarding academic awards, hiring, and promotion, and can influence researchers' publication strategies.”
Some will argue that CiteScore itself is the problem, and clearly it is a flawed metric. That's fair. But it's equally true that generative AI, and the people misusing it, are disrupting systems that underpin our lives in unforeseen and potentially very harmful ways.
In one future, generative AI pushes us to rethink systems like CiteScore and redesign them to be fairer, more holistic, and more inclusive. The more dire alternative, and the one we're currently living in, is one in which generative AI continues to run wild, wreaking havoc and ruining professionals' lives.
Hopefully we can get back on track soon.
News
DeepMind Soundtrack Generator: DeepMind, Google's AI research lab, announced that it is developing AI technology to generate soundtracks for videos. DeepMind's AI combines a description of the soundtrack (e.g., “jellyfish pulsating under water, marine life, ocean”) with the video itself to create music, sound effects, and even dialogue that matches the characters and tone of the video.
Robot Driver: Researchers at the University of Tokyo have developed and trained a “musculoskeletal humanoid” called Musashi to drive a small electric car on a test track. Musashi is equipped with two cameras that act as human eyes, allowing it to “see” not only the road ahead, but also the scenery reflected in the car's side mirrors.
New AI Search Engine: Genspark is a new AI-powered search platform that uses generative AI to create custom summaries in response to search queries. The company has raised $60 million so far from investors including Lanchi Ventures, and its last funding round valued it at $260 million post-money, a respectable figure as Genspark competes with rivals like Perplexity.
How much does ChatGPT cost?: How much does ChatGPT, OpenAI's ever-expanding AI-powered chatbot platform, cost? This is a harder question to answer than you might think. To keep track of the different ChatGPT subscription options available to you, we've put together an up-to-date guide on ChatGPT pricing.
Research Paper of the Week
Autonomous cars face a variety of edge cases depending on where they are and what the situation is. If you're driving on a two-lane road and someone turns on their left turn signal, does that mean they're changing lanes? Or that you should pass? The answer may be different depending on whether you're on I-5 or the Autobahn.
Incredibly, a group of researchers from Nvidia, USC, UW, and Stanford University show in a paper just published at CVPR that having an AI read the local driver's manual can resolve many ambiguous or unusual situations.
The researchers' Large Language Driving Assistant (LLaDA) gives LLMs access to (and even fine-tunes them on) state, national, and local driving manuals. Local rules, customs, and signage are captured in those manuals, so unexpected situations (honking, flashed high beams, a flock of sheep in the road) map to appropriate actions (pull over, stop, honk back) when they occur.
Image credit: Nvidia
While this is by no means a complete end-to-end driving system, it represents an alternative path toward a “universal” driving system that can handle surprises, and perhaps a way for the rest of us to understand why everyone is honking when we visit unfamiliar places.
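The manual-lookup idea above can be sketched in a few lines. This is a hypothetical toy, not the paper's method: the manual snippets, the `advise` function, and the keyword-overlap matching are all invented for illustration, whereas the real LLaDA system fine-tunes an LLM on actual driving manuals.

```python
# Toy stand-in for a local driving manual: situation -> guidance.
# Entries are invented for illustration.
LOCAL_MANUAL = {
    "flock of sheep on road": "Stop and wait; livestock has right of way.",
    "driver flashes high beams": "On this road, flashing often signals you may pass.",
    "horn from behind": "Check mirrors; pull over if safe to let traffic through.",
}

def advise(situation: str) -> str:
    """Return the guidance whose key words best overlap the situation."""
    words = set(situation.lower().split())
    best, best_overlap = "No specific rule found; drive cautiously.", 0
    for entry, guidance in LOCAL_MANUAL.items():
        overlap = len(words & set(entry.split()))
        if overlap > best_overlap:
            best, best_overlap = guidance, overlap
    return best

print(advise("a flock of sheep is blocking the road"))
# -> Stop and wait; livestock has right of way.
```

Swapping the dictionary for region-specific manuals is what lets the same query produce different answers on I-5 versus the Autobahn.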
Model of the Week
On Monday, Runway, a company that develops generative AI tools for film and visual content creators, announced Gen-3 Alpha. Trained on a vast array of images and videos from both public and internal sources, Gen-3 can generate video clips from text descriptions and still images.
Runway says Gen-3 Alpha offers a “significant” increase in generation speed and fidelity over Runway's previous flagship video model, Gen-2, and gives users greater control over the structure, style and movement of the videos they create. Runway also says Gen-3 can be customized to target “specific artistic and narrative requirements” to create more “stylistically controlled” and consistent characters.
Gen-3 Alpha does have limitations, such as a maximum clip length of 10 seconds, but Runway co-founder Anastasis Germanidis promises it is just the first in a family of next-generation video models trained on Runway's upgraded infrastructure.
Gen-3 Alpha is just the latest of several video generation systems that have emerged in recent months — others include OpenAI's Sora, Luma's Dream Machine and Google's Veo — that together, if they can overcome copyright issues, threaten to upend the film and TV industry as we know it.
Grab Bag
Your next McDonald's order won't be accepted by AI.
McDonald's said this week that it would remove automated order-taking technology from more than 100 restaurants after nearly three years of testing it. The technology, developed with IBM and installed in its restaurant drive-thrus, made headlines last year for being prone to misinterpreting customers and making errors.
According to a recent article in The Takeout, AI appears to be losing its grip on fast-food chains, which not long ago were enthusiastic about the technology and its potential for increased efficiency (and reduced labor costs): Presto, a leader in AI-assisted drive-thru lanes, recently lost a major customer, Del Taco, and is facing mounting losses.
The problem is inaccuracy.
McDonald's CEO Chris Kempczinski told CNBC in June 2021 that the company's voice recognition technology was about 85% accurate, with roughly one in five orders still requiring help from a human staff member. The Takeout, meanwhile, reported that even the best version of Presto's system completes only about 30% of orders without human assistance.
While AI is disrupting certain sectors of the service economy, some jobs, such as those that require understanding diverse accents and dialects, appear resistant to automation, at least for now.