As generative AI permeates more and more industries, companies that make the chips that run the models are reaping huge profits. Nvidia, which is estimated to control 70% to 95% of the AI chip market, has particularly strong influence. Cloud providers from Meta to Microsoft, wary of falling behind in generative AI, are spending billions of dollars on Nvidia GPUs.
Generative AI vendors have understandable reasons to be dissatisfied with the status quo: Their success depends in large part on the whims of the major chipmakers. So they, along with opportunistic venture capitalists, are on the hunt for promising startups to challenge the AI chip incumbents.
Etched is one of many alternative chip companies vying for position in this space, but it's also one of the most intriguing. Just two years old, Etched was founded by Harvard dropouts Gavin Uberti (ex-OctoML and ex-Xnor.ai) and Chris Zhu, who, along with Robert Wachen and former Cypress Semiconductor CTO Mark Ross, set out to build a chip that could do one thing and one thing only: run AI models.
This isn't unusual: Many startups and large tech companies have developed (or are developing) chips that only run AI models, also known as inference chips — Meta has MTIA, Amazon has Graviton and Inferentia, etc. — but Etched's chip is unique in that it only runs one type of model: Transformers.
Proposed by a team of Google researchers in 2017, Transformers are by far the dominant generative AI model architecture to date.
Transformers underpin OpenAI's video generation model Sora, text generation models such as Anthropic's Claude and Google's Gemini, and generative art tools such as the latest version of Stable Diffusion.
“In 2022, we bet that Transformers will rule the world,” Etched CEO Uberti told TechCrunch in an interview. “We've reached a point in the evolution of AI where specialized chips that perform better than general-purpose GPUs are essential, and tech decision makers around the world know this.”
Etched's chip, “Sohu,” is an ASIC (application-specific integrated circuit), a chip customized for a specific application, in this case running Transformers. Manufactured using TSMC's 4nm process, Sohu delivers significantly better inference performance than GPUs and other general-purpose AI chips while consuming less power, Uberti claims.
“Sohu is orders of magnitude faster and cheaper than Nvidia's next-generation Blackwell GB200 GPUs when running text, image and video transformers,” Uberti said. “One Sohu server is the equivalent of 160 H100 GPUs. … Sohu will be a more affordable, efficient and environmentally friendly option for business leaders who need specialized chips.”
How does Sohu achieve all this? There are a few ways, but the most obvious and intuitive is a streamlined inference hardware and software pipeline. Because Sohu does not run non-Transformer models, the Etched team was able to strip out hardware components that Transformers don't need, while also trimming the software overhead traditionally required to deploy and run non-Transformer models.
Etched chart comparing hardware performance running Meta's open model Llama 70B. Image credit: Etched
Etched arrives at a tipping point in the race for generative AI infrastructure: Beyond cost concerns, the GPUs and other hardware components currently required to run large models are dangerously power-hungry.
Goldman Sachs predicts that AI will increase data center electricity demand by 160% by 2030, significantly increasing greenhouse gas emissions. Meanwhile, researchers at the University of California, Riverside estimate that global AI use could cause data centers to consume 1.1 trillion to 1.7 trillion gallons of fresh water by 2027, impacting local resources (many data centers use water to cool servers).
Uberti optimistically (or hyperbolically, depending on how you interpret it) pitches Sohu as a solution to the industry's consumption problems.
“The bottom line is that our future customers will have no choice but to switch to Sohu,” Uberti said. “Companies are betting on Etched because speed and cost are essential for the AI products they're building.”
But assuming Etched achieves its goal of bringing Sohu to the mass market within the next few months, can it succeed with so many other companies following close behind?
Etched doesn't have any direct competitors right now, but AI chip startup Perceive recently previewed a processor with hardware acceleration for Transformers, and Groq is also investing heavily in Transformer-specific optimizations for its ASICs.
Competition aside, what if Transformers one day fall out of favor? Uberti says that in that case, Etched would naturally design a new chip. That's reasonable, but it's a pretty drastic fallback given how long it took to develop Sohu.
None of these concerns have deterred investors from pumping huge amounts of money into Etched.
Today, Etched announced it has closed a $120 million Series A funding round co-led by Primary Venture Partners and Positive Sum Ventures. The round, which brings Etched's total funding to $125.36 million, included participation from big-name angel investors such as Peter Thiel (Uberti, Zhu, and Wachen are Thiel Fellowship alumni), GitHub CEO Thomas Dohmke, Cruise (and Bot Company) co-founder Kyle Vogt, and Quora co-founder Charlie Cheever.
These investors likely believe there's ample opportunity for Etched to expand its server sales business. And there may be: Uberti claims that unnamed customers have reserved “tens of millions of dollars” worth of hardware so far. Uberti suggested that the upcoming release of Sohu Developer Cloud, which lets customers preview Sohu through an online interactive playground, should drive further sales.
But it seems too early to tell whether this will be enough to propel Etched and its 35-person team into the future the company's co-founders envision. The AI chip field can be unforgiving even in the best of times: Just look at the high-profile near-failures of AI chip startups like Mythic and Graphcore and the related plunge in funding for AI chip ventures in 2023.
But Uberti has a strong pitch: “Video generation, audio-to-audio conversion, robotics, and other future AI use cases wouldn't be possible without faster chips like Sohu. The entire future of AI technology will be shaped by whether our infrastructure can scale.”