A few months ago, OpenAI captivated the tech world with Sora, a generative AI model that transforms scene descriptions into original videos without the need for cameras or film crews. But access to Sora has been tightly restricted so far, and the company appears to be targeting deep-pocketed creators like Hollywood directors rather than hobbyists or small marketers.
Alex Mashrabov, former head of generative AI at Snap, saw an opportunity. So he launched Higgsfield AI, an AI-powered video creation and editing platform designed for more customized and personalized applications.
Diffuse, Higgsfield's first app, is built on a custom text-to-video model. It lets you generate videos from scratch, or take a selfie and generate a clip featuring that person.
“Our target audience is creators of all types,” Mashrabov said in an interview with TechCrunch.
Mashrabov joined Snap through his previous startup, AI Factory, which Snap acquired for $166 million in 2020. During his time at Snap, Mashrabov helped build products such as AR effects and filters for Snapchat, including Cameo, and even Snapchat's controversial My AI chatbot.
Higgsfield, which Mashrabov launched a few months ago in collaboration with Yerzat Dulat, an AI researcher who specializes in generated video, offers a curated set of pre-generated clips, reference media (i.e. images and videos), and a prompt editor in which users can describe the characters, actions, and scenes they want to depict. Diffuse allows users to insert themselves directly into AI-generated scenes, or to imitate things captured in other videos (such as dance moves) with a digital likeness of themselves.
“Our model supports very realistic movement and expression,” says Mashrabov. “We are pioneering a consumer-facing ‘world model’ that will allow us to build best-in-class video generation and editing with a greater level of control.”
Higgsfield isn't the only generative video startup going head-to-head with OpenAI. Runway, one of the first on the scene, is among them, and its tools continue to improve. There's also Haiper, a company backed by two DeepMind alumni and more than $13 million in venture cash.
Mashrabov argues that Diffuse will stand out because of its mobile-first and social-forward go-to-market strategy.
“By prioritizing iOS and Android apps over desktop workflows, we empower creators to create engaging social media content anytime, anywhere,” said Mashrabov. “Certainly, building on mobile allows us to prioritize ease of use and consumer-friendly features from day one.”
Higgsfield is also running lean. According to Mashrabov, the generative model behind the platform was developed within nine months by a team of 16 people and trained on a cluster of 32 GPUs. (Thirty-two GPUs may sound like a lot, but it's not, considering OpenAI uses tens of thousands.) And Higgsfield has raised only $8 million to date, the majority of which came from a recent seed funding tranche led by Menlo Ventures.
To stay ahead of competitors, Higgsfield is building an improved video editor that allows users to change characters and objects in videos, as well as a more powerful video editor specifically for social media use cases. The company plans to put its seed cash toward training the video generation model. In fact, Mashrabov sees social media, and social media marketing in particular, as Higgsfield's primary money-making niche.
Although Diffuse is currently free to use, Mashrabov envisions a future where marketers pay some type of fee or subscription for premium features and high-volume or large-scale campaigns.
“We believe Higgsfield unlocks an incredible level of realism and content creation use cases for social media marketers,” he said. “We constantly hear from CMOs and creative directors that they need to optimize content production budgets and shorten schedules while delivering impactful content. So our video generation AI solution helps them achieve that. We believe it will become a core solution.”
Of course, Higgsfield is not immune to the broader challenges facing generative AI startups.
It's well established that generative AI models like those that power Diffuse can “regurgitate” training data. Why does that matter? Well, if models are trained on copyrighted content without permission or some kind of licensing agreement, users of those models can unknowingly produce copyright-infringing works and be exposed to lawsuits.
Mashrabov would not reveal the source of Higgsfield's training data (other than to say it comes from “multiple publicly available” locations), nor would he say whether Higgsfield will retain user data to train future models. He did note that Diffuse users can request deletion of their data at any time through the app.
As the wild spread of deepfakes on social media in recent months has shown, digital “cloning” platforms like Higgsfield are also ripe for abuse.
Similarly, Higgsfield could make it easier to steal creators' content. For example, you can simply upload a video of someone else's choreography and generate a video of yourself performing the same choreography.
When I asked Mashrabov what safeguards Higgsfield has in place to prevent abuse, he declined to go into detail, but claimed the platform uses a combination of automated and manual moderation.
“We have decided to roll out the product in stages and test it in select markets first so that we can monitor for potential exploits and evolve the product as needed,” added Mashrabov.
We'll have to wait and see how well it actually works.