AI Company Sesame has released a basic model that promotes Maya, an impressive and realistic voice assistant.
Models with a size of 1 billion parameters (“parameters” that refer to individual components of the model) are under the Apache 2.0 license. This means it can be used commercially with little restrictions. Called the CSM-1B, this model generates “RVQ audio codes” from text and audio inputs, as explained in SESAME on the AI Dev platform.
RVQ refers to “residual vector quantization,” a technique for encoding audio into discrete tokens called code. RVQ is used in many recent AI audio technologies, such as Google's SoundStream and Meta's Encodec.
The CSM-1B uses a model from the Meta Lama family and combines it with the backbone and audio “decoder” components. The tweaked variant of CSM Powers Maya says Sesame.
“The open sourced model here is a base-generated model,” Sesami wrote in CSM-1B's embracing face and in his GitHub repository. “It can produce a variety of voices, but it hasn't been fine-tuned with a specific voice […] This model has the ability of a language other than English due to data contamination of training data, but that probably won't work. ”
The data that Sesame used to train CSM-1B is unknown. The company didn't say it.
It is noteworthy that the model does not have any actual protection measures. Sesame has an honor system that encourages developers and users to use models to mimic people's voices without their consent, to create misleading content such as fake news, and to engage in “harmful” or “malicious” activities.
I hugged my face and tried the demo, but it took less than a minute to clone the voice. From there it was easy to generate speeches for my heart's desires, including controversial topics such as elections and Russian propaganda.
Consumer reports recently warned that many of the market's popular AI-powered voice cloning tools lack “meaning” protection measures to prevent fraud and abuse.
Co-founded by Oculus co-creator Brendan Iribe, Sesame went viral in late February for Assistant Tech. Miles, other assistants at Maya and Sesame, can breathe, speak and speak.
Sesame has raised private capital from Andreessen Horowitz, Spark Capital and Matrix Partners. In addition to building voice assistant technology, the company says it is prototyping for AI glasses “designed to be worn all day” with custom models.