A relatively new startup called EvolutionaryScale has secured a large amount of funding to build an AI model that can generate new proteins for scientific research.
EvolutionaryScale announced Tuesday that it had raised $142 million in a seed round led by former GitHub CEO Nat Friedman, Daniel Gross, and Lux Capital, with participation from Amazon and Nvidia's corporate venture arm, NVentures. The startup also released ESM3, an AI model that it describes as a “state-of-the-art model” of biology that can create proteins for use in drug discovery and materials science.
“ESM3 marks a step towards a future of biology where AI gives us the tools to design from first principles, just as we design structures, machines, microchips, and write computer programs,” Alexander Rives, co-founder and chief scientist at EvolutionaryScale, said in a statement.
Rives, along with Tom Sercu and Sal Candido, began developing generative AI models for protein discovery while at Meta's AI research lab FAIR in 2019. After their team was disbanded, Rives, Sercu and Candido left Meta to continue the work they had started.
Characterizing proteins can shed light on disease mechanisms, including how to slow or reverse disease progression, while creating proteins can lead to entirely new classes of drugs, tools and therapeutics. However, the current process of designing proteins in the lab is costly in terms of both computational and human resources.
To design a protein, you need to come up with a structure that might play a role in the body or in a product, and then find a protein sequence (the sequence of amino acids that make up the protein) that could “fold” into that structure. The protein must fold correctly into its three-dimensional shape to perform its intended function.
Trained on a dataset of 2.78 billion proteins, ESM3 can “infer” the sequence, structure and function of proteins, which enables it to generate new ones, not unlike Google DeepMind's AlphaFold, Rives says. EvolutionaryScale has made the full 98-billion-parameter model available for non-commercial use through its cloud developer platform, Forge, and has also released a scaled-down version of the model for offline use.
EvolutionaryScale claims to have used ESM3 to generate new variants of the green fluorescent protein (GFP), which is responsible for the glow in jellyfish and the glowing colors in corals. A preprint paper on the company's website details the work.
Fluorescent protein “esmGFP” created with EvolutionaryScale's ESM3. Image courtesy of EvolutionaryScale.
“We've been working on this research for a long time, and we're excited to share it with the scientific community and see how they use it,” Rives said.
Of course, EvolutionaryScale is no charity. The company, which has about 20 employees, told TechCrunch it plans to make money through a combination of partnerships, royalties and revenue sharing. EvolutionaryScale might work with pharmaceutical companies to integrate ESM3 into their workflows, for example, or share revenue with researchers for any breakthrough discoveries commercialized using ESM3.
To this end, EvolutionaryScale announced that it will soon make ESM3 and its derivatives available to select AWS customers through the cloud provider's SageMaker AI development platform, Bedrock AI platform, and HealthOmics service. ESM3 will also be available to select customers who use Nvidia's NIM microservices, which are supported by Nvidia enterprise software licenses.
EvolutionaryScale says that both AWS and Nvidia customers will be able to fine-tune ESM3 using their own data.
It may still be a while before EvolutionaryScale turns a profit; a company presentation obtained by Forbes last August repeatedly stressed that it could be 10 years before generative AI models are useful for designing treatments. The company also needs to fend off competition from DeepMind spinoff Isomorphic Labs, which already has deals with big pharma, as well as Insitro, the publicly traded Recursion, and Inceptive.
EvolutionaryScale's big bet is to expand model training to include data other than proteins and create general-purpose AI models for biotech applications.
“The incredible pace of progress in new AI is driven by ever-larger models, ever-larger datasets, and increasing computational power,” said an EvolutionaryScale spokesperson. “The same is true in biology. Over the past five years of research, the ESM team has studied scaling in biology. We find that as language models scale, we advance our understanding of underlying principles of biology and discover biological structure and function.”
It all sounds very ambitious to this reporter, but having deep-pocketed investors will certainly help.