Noisy recordings of interviews and speeches are the bane of any audio engineer's existence. But a German startup hopes to solve this problem with a unique technological approach that uses generative AI to improve the clarity of audio in videos.
Today, AI-coustics emerged from stealth with €1.9 million in funding. According to co-founder and CEO Fabian Seipel, AI-coustics' technology goes beyond standard noise suppression and is designed to work with, and across, any device and speaker.
“Our core mission is to make every digital interaction – whether it's a conference call, a consumer device, or a casual social media video – as clear as a broadcast from a professional studio,” Seipel told TechCrunch in an interview.
Seipel, an audio engineer by training, co-founded AI-coustics in 2021 with Corvin Jädicke, a lecturer in machine learning at the Technical University of Berlin. The two met while studying audio technology there, where they were frequently frustrated by the poor audio quality of the online courses and tutorials they had to take.
“We have been driven by a personal mission to overcome the pervasive challenge of poor audio quality in digital communications,” Seipel said. “My hearing was slightly impaired by music production in my early twenties, and I've always struggled with online content and lectures. That's why we decided to work on the topic of audio quality and clarity in the first place.”
The market for AI-powered noise suppression and voice enhancement software is already very robust. AI-coustics' competitors include Insoundz, which uses generative AI to enhance streaming and pre-recorded audio clips, and Veed.io, a video editing suite with tools to remove background noise from clips.
But Seipel says AI-coustics takes a unique approach to developing the AI mechanisms that do the actual noise-reduction work.
The startup trains its models on audio samples recorded at its studio in Berlin. People are paid to record the samples (though Seipel wouldn't say how much), which are added to the dataset used to train AI-coustics' noise-reduction models.
“We have developed a unique approach that simulates audio artifacts and problems – such as noise, reverberation, compression, band-limited microphones, distortion, and clipping – during the training process,” Seipel said.
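AI-coustics hasn't published its training code, but the artifact simulation Seipel describes is a form of data augmentation: clean speech is synthetically degraded so a model can learn to map the damaged signal back to the clean one. The sketch below is a hypothetical illustration of two of the named degradations (additive noise at a random signal-to-noise ratio, plus hard clipping), not the company's actual pipeline.

```python
import numpy as np

def degrade(clean: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Synthetically damage a clean speech signal for enhancement training."""
    x = clean.astype(np.float64)

    # Additive background noise at a random SNR between 5 and 25 dB.
    snr_db = rng.uniform(5.0, 25.0)
    noise = rng.normal(0.0, 1.0, size=x.shape)
    signal_power = np.mean(x ** 2)
    noise_power = np.mean(noise ** 2)
    # Scale noise so signal_power / noise_power matches the target SNR.
    noise *= np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    x = x + noise

    # Hard clipping, as from an overdriven microphone preamp.
    limit = rng.uniform(0.5, 1.0) * np.max(np.abs(x))
    return np.clip(x, -limit, limit)

rng = np.random.default_rng(0)
# One second of a 440 Hz tone at 16 kHz stands in for clean speech here.
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noisy = degrade(clean, rng)
```

In a real pipeline, `(noisy, clean)` pairs like this would be fed to the enhancement model as input and target, respectively.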
Some might take issue with the one-time compensation scheme AI-coustics has set up for creators, given that the models the company is training could prove quite lucrative in the long run. (There's a healthy debate over whether the creators of training data for AI models deserve residuals for their contributions.) But perhaps the bigger, more pressing concern is bias.
It is well established that speech recognition algorithms can develop biases that ultimately harm users. A study published in the Proceedings of the National Academy of Sciences found that speech recognition from leading companies was twice as likely to incorrectly transcribe audio from Black speakers as from white speakers.
To combat this, Seipel said, AI-coustics is focused on recruiting “diverse” audio sample contributors. He added: “Scale and diversity are key to eliminating bias and making the technology compatible with all languages, speaker identities, ages, accents and genders.”
It wasn't the most scientific test, but I uploaded three video clips (an interview with an 18th-century farmer, a car-driving demonstration, and a protest over the Israeli-Palestinian conflict) to AI-coustics' platform to see how well it performed on each. AI-coustics delivered on its promise of increased clarity: to my ears, the processed clips had far less ambient background noise drowning out the speakers.
Here's the 18th-century farmer clip before:

And after:
Seipel said AI-coustics' technology could be used to enhance both real-time and recorded audio, and perhaps even be built into devices such as soundbars, smartphones, and headphones to automatically boost audio clarity. Currently, AI-coustics offers a web app and API for post-processing audio and video recordings, as well as an SDK that integrates AI-coustics' platform into existing workflows, apps, and hardware.
Seipel said AI-coustics makes money through a combination of subscriptions, on-demand pricing, and licensing, and currently has five enterprise customers and 20,000 users (though not all of them are paying). The roadmap for the coming months includes expanding the company's four-person team and improving the underlying speech-enhancement model.
“Prior to our initial investment, AI-coustics ran a fairly lean operation with a low burn rate to weather the challenges of the VC investment market,” Seipel said. “AI-coustics now has a considerable network of investors and mentors in Germany and the U.K. to seek advice from. A strong technology foundation and the ability to address different markets with the same database and core technology gives the company flexibility and the possibility of smaller pivots.”
When asked whether audio-mastering technology like AI-coustics' might take away jobs, as some experts fear, Seipel pointed to AI's potential to streamline the time-consuming work currently done by human audio engineers.
“Content production studios and broadcast managers can save time and money by using AI-coustics to automate parts of the audio production process while maintaining the highest audio quality,” he said. “Audio quality and clarity remain a vexing issue not just in content production and consumption but in nearly all consumer and professional devices, which stand to benefit as well.”
The funding came in the form of equity and debt tranches from Connect Ventures, Inovia Capital, FOV Ventures, and Ableton CFO Jan Bohl.