Lightspeed Ventures-backed audio platform Pocket FM announced it has partnered with voice cloning company ElevenLabs to use AI to quickly convert scripts and other text content into audio series.
Pocket FM, which raised $103 million in Series D funding in March, told TechCrunch at the time that it was already experimenting with a feature to convert text content into speech using ElevenLabs' technology. Now, the India-based company is expanding its partnership and plans to make the conversion tool available to all creators in the coming weeks.
In its testing phase, Pocket FM has already used ElevenLabs' AI technology to produce 30,000 hours of audio series. With the new rollout, the startup plans to triple its content library to over 100,000 hours of audio content this year. Pocket FM also said that in its experimental phase, the AI-powered tool has enabled it to reduce its audio production costs by 90%.
Image courtesy of Pocket FM
Pocket FM co-founder and CTO Prateek Dixit said on a conference call with TechCrunch that the partnership aims to make it easier for writers to turn their writing into audio series.
“We have over 250,000 writers (including those on our Pocket Novel writing platform) and this partnership will reduce the setup and audio recording costs for our writers,” he said.
“Even with the right recording tools and equipment set up, a writer can produce roughly 30 minutes of high-quality audio content per day. With AI tools, this output can increase tenfold,” he added.
Pocket FM has developed a tool that integrates ElevenLabs technology to provide 50 voices to writers who want to transform their content. ElevenLabs co-founder Mati Staniszewski said the company's tool understands the context of a sentence and automatically infers sentiment through the voice.
“Together with Pocket FM, we are rolling out a new model that understands the genre of the text and better understands sentiment,” Staniszewski said.
Dixit said the platform also plans to suggest suitable voices for writers in particular genres, based on data gathered from users' engagement with this type of content.
Pocket FM isn't the only audio series platform experimenting with AI-powered tools: Google-backed Kuku FM uses GPT-4, Claude, BandLab, and even ElevenLabs to help writers with different stages of production, including refining scripts, generating thumbnails, adding sound effects, and converting text to audio.
Kuku FM told TechCrunch it is also experimenting with visual generation tools such as Midjourney and Runway to create ads relevant to its content.
Quality of content and impact on artists
AI-powered tools promise to generate more content faster, but that doesn't mean the content is good. Pocket FM is refining its discovery algorithms and experimenting with user engagement to aid discovery and surface quality content.
“If an author publishes an audio series, we expose that content to a particular number of users and observe the engagement metrics. If these metrics are positive, we promote it further,” Dixit said.
Leveraging AI could help these platforms get faster results and expand their content libraries, but it will also reduce the role of narrators working on these platforms. The Association of Voice Artists of India (AVA) has expressed concerns over AI domination.
“Once AI takes over, we are finished. As voice actors, we need some kind of regulation in place to ensure our livelihood is protected,” the association's executive director Amarinder Singh Sodhi told India's Scroll magazine.
Sodhi also told Scroll about an incident in which a voice-over artist was called into a studio to record samples that were then used to train an AI without the artist's knowledge or consent.
“On an emotional level, it's scary. Using AI essentially dilutes the human experience of storytelling. You lose the emotional connect,” Aditya Mattoo, a Delhi-based narrator, told TechCrunch.
He added that giving access to premium audio to people who don't have the flair and skills to produce quality content will result in the market being flooded with low-quality content.
When asked about the impact of AI voice generation on voice artists, Pocket FM did not respond directly. However, Dixit said that in experiments, reactions to AI-generated content are “comparable to human narration production.” Notably, the company is also working on technology to incorporate multiple voices into a single audio output.
Pocket FM and Kuku FM currently do not label their content to indicate whether AI has been used in the production process.