The Associated Press reports that software engineers, developers, and academic researchers have serious concerns about OpenAI's transcriptions from Whisper.
Generative AI's tendency to hallucinate (that is, to make things up) is widely discussed, so it's a little surprising that the problem extends to transcription, where the output is expected to closely follow the audio being transcribed.
Instead, researchers told the AP, Whisper introduces everything from racial commentary to imagined medical treatments into transcripts. That could be particularly disastrous if Whisper is employed in hospitals or other medical settings.
Researchers at the University of Michigan studying public meetings found hallucinations in 8 out of 10 audio transcriptions. A machine learning engineer studied over 100 hours of Whisper transcriptions and found hallucinations in more than half of them. Additionally, one developer reported finding hallucinations in nearly all of the 26,000 transcriptions he created with Whisper.
An OpenAI spokesperson said the company is “continually working to improve the accuracy of our models, including reducing hallucinations,” and noted that the company's usage policies prohibit using Whisper in “certain high-risk decision-making situations.”
“We would like to thank the researchers for sharing their findings,” the spokesperson added.