ElevenLabs, a startup that provides AI voice cloning and text-to-speech APIs, on Monday launched the ability to build conversational AI bots.
The company announced that users can now build complete conversational agents on the ElevenLabs developer platform, with customizable variables such as tone of voice and response length.
ElevenLabs has primarily provided voice and AI tools for text-to-speech services. Sam Sklar, the company's head of growth, told TechCrunch that many customers were already using those tools to create conversational AI agents. The hardest parts, however, were integrating a knowledge base and handling customer interruptions, so the company decided to build a complete pipeline for conversational bots.
Users can start building conversational agents by logging into their ElevenLabs account and selecting a template or creating a new project. They define the agent's persona by choosing its primary language, first message, and system prompt. They must also choose a large language model (Gemini, GPT, or Claude), the temperature of the response (which determines how creative the response should be), and a limit on token usage.
Users can also adjust other settings such as audio, latency, stability, authentication criteria, and the maximum length of a conversation with the AI agent.
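To make the settings above concrete, here is a minimal Python sketch of how such an agent configuration could be modeled and sanity-checked. Everything in it is hypothetical and for illustration only: the `AgentConfig` class, its field names, the default values, and the assumed 0 to 1 temperature range are not taken from the ElevenLabs SDK or API.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical names throughout: this is NOT the ElevenLabs SDK or API schema.
# The three model families are the ones named in the article.
SUPPORTED_LLM_FAMILIES = ("gemini", "gpt", "claude")


@dataclass
class AgentConfig:
    language: str                        # agent's primary language, e.g. "en"
    first_message: str                   # what the agent says first
    system_prompt: str                   # persona and instructions
    llm_family: str                      # one of SUPPORTED_LLM_FAMILIES
    temperature: float = 0.5             # assumed range 0.0 to 1.0
    max_tokens: int = 512                # assumed per-response token limit
    max_conversation_seconds: int = 600  # assumed conversation time cap

    def validate(self) -> list[str]:
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if self.llm_family not in SUPPORTED_LLM_FAMILIES:
            problems.append(f"unsupported LLM family: {self.llm_family!r}")
        if not 0.0 <= self.temperature <= 1.0:
            problems.append("temperature must be between 0.0 and 1.0")
        if self.max_tokens <= 0:
            problems.append("max_tokens must be positive")
        if self.max_conversation_seconds <= 0:
            problems.append("max_conversation_seconds must be positive")
        if not self.first_message.strip():
            problems.append("first_message must not be empty")
        return problems


config = AgentConfig(
    language="en",
    first_message="Hi, how can I help you today?",
    system_prompt="You are a friendly support agent. Keep answers short.",
    llm_family="claude",
    temperature=0.3,
)
print(config.validate())
```

A lower temperature, as in this example, would typically suit a support agent that should answer consistently, while a higher value allows more varied responses.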
Users can enhance their conversational bots by adding their own knowledge bases, such as files, URLs, and text blocks. They can also integrate their own custom LLM with the bot. The ElevenLabs SDK is compatible with Python, JavaScript, React, and Swift, and the company offers a WebSocket API for further customization.
Companies can also define criteria for collecting specific data items (such as the name or email of the customer speaking with an agent), along with natural language metrics to define success or failure of a call.
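As a rough illustration of the data-collection and success criteria described above, the following Python sketch models them as plain data. The class names, fields, and the rule that a call succeeds only when every criterion is met are assumptions made for this example, not the platform's actual schema or scoring logic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Hypothetical structures: not the ElevenLabs schema.
@dataclass
class DataField:
    name: str         # e.g. "customer_email"
    description: str  # natural-language hint describing what to extract


@dataclass
class SuccessCriterion:
    name: str         # short identifier for the criterion
    description: str  # natural-language statement of what success means


@dataclass
class CallAnalysisSpec:
    collect: list[DataField] = field(default_factory=list)
    criteria: list[SuccessCriterion] = field(default_factory=list)


def call_succeeded(spec: CallAnalysisSpec, outcomes: dict[str, bool]) -> bool:
    """Assumed rule: a call succeeds only if every criterion was judged met.

    A criterion with no recorded outcome is treated as not met.
    """
    return all(outcomes.get(c.name, False) for c in spec.criteria)


spec = CallAnalysisSpec(
    collect=[
        DataField("customer_name", "Name of the customer on the call"),
        DataField("customer_email", "Email address the customer provides"),
    ],
    criteria=[
        SuccessCriterion("issue_resolved", "The customer's issue was resolved"),
        SuccessCriterion("polite_tone", "The agent remained polite throughout"),
    ],
)

print(call_succeeded(spec, {"issue_resolved": True, "polite_tone": True}))
print(call_succeeded(spec, {"issue_resolved": True}))
```

In this sketch the per-criterion outcomes are supplied by the caller; how a real system would judge each natural-language criterion against a transcript is outside its scope.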
ElevenLabs uses its existing pipeline for the text-to-speech portion, but it had to develop speech-to-text capabilities for the new conversational AI product. The company does not currently offer speech-to-text as a standalone API, but it may do so in the future. That would put it in competition with the speech-to-text APIs from Google, Microsoft, and Amazon, as well as specialized offerings such as OpenAI's Whisper, AssemblyAI, Deepgram, Speechmatics, and Gladia.
The company, which is seeking new funding at a valuation of more than $3 billion, also competes with other voice AI startups such as Vapi and Retell, which are building conversational agents as well. More notably, it competes with OpenAI's real-time conversational API. However, ElevenLabs believes its customization options and ability to switch models give it an advantage over OpenAI.