Photonic computing startup Lightmatter has raised $400 million to blow past one of the bottlenecks of modern data centers. The company's optical interconnect layer allows hundreds of GPUs to work in sync, streamlining the costly and complex job of training and running AI models.
The growth of AI, and its correspondingly immense computing requirements, is supercharging the data center industry, but it's not as simple as plugging in another thousand GPUs. As high-performance computing experts have known for years, it doesn't matter how fast each node of your supercomputer is if those nodes sit idle half the time waiting for data.
The interconnect layer is what turns a rack of CPUs and GPUs into, effectively, one giant machine, so it follows that the faster the interconnect, the faster the data center. And it looks like Lightmatter is set to build the fastest interconnect layer out there, using the photonic chips it has been developing since 2018.
“Hyperscalers know that if they want a computer with a million nodes, they can't do it with Cisco switches. Once you leave the rack, you go from high-density interconnect to basically a cup on a string,” the company's CEO and founder, Nick Harris, told TechCrunch. (You can see a short talk by him summarizing the issue here.)
The state of the art, he said, is NVLink, and in particular the NVL72 platform, which puts 72 NVIDIA Blackwell units in a single rack, capable of up to 1.4 exaflops at FP4 precision. But no rack is an island, and all that compute has to be squeezed out through a 7-terabit “scale-up” network. Sounds like a lot, and it is, but the inability to network these units faster, to one another and to other racks, is one of the main barriers to improving performance.
“With a million GPUs, you need multiple layers of switches, and that adds a huge latency burden,” Harris says. “You have to go from electricity to light to electricity to light… the amount of power you use and the time you spend waiting is huge. And it gets dramatically worse as the cluster gets bigger.”
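To see why a million-GPU cluster forces "multiple layers of switches," here is a rough sketch using a folded-Clos (fat-tree) topology, the standard way big clusters are wired. The radix of 64 ports per switch is an illustrative assumption, not a figure from the article; the model simply says one tier connects `radix` hosts and each extra tier multiplies reach by `radix / 2`.

```python
def tiers_needed(num_gpus: int, radix: int = 64) -> int:
    """Smallest number of switch tiers that can connect num_gpus hosts
    in a simplified folded-Clos model (radix ports per switch)."""
    tiers, capacity = 1, radix
    while capacity < num_gpus:
        tiers += 1
        capacity *= radix // 2  # each added tier multiplies reach by radix/2
    return tiers

print(tiers_needed(72))         # → 2 (a rack-scale cluster needs few hops)
print(tiers_needed(1_000_000))  # → 4 (every extra tier adds latency and power)
```

Each tier in this model is another round of electrical-optical-electrical conversion on the path between two GPUs, which is exactly the power-and-latency tax Harris is describing.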
So what does Lightmatter bring to the table? Fiber. Lots of it, routed purely through optical interfaces. With up to 1.6 terabits per fiber (using multiple colors) and up to 256 fibers per chip… well, let's just say 72 GPUs at 7 terabits starts to sound pretty old-fashioned.
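Taking the article's numbers at face value, the multiplication is worth doing: the per-chip aggregate below is simple arithmetic on the quoted ceilings, not a published spec, and real deployments would land below it.

```python
# Back-of-envelope comparison of the bandwidth figures quoted in the article.
TBPS_PER_FIBER = 1.6       # up to 1.6 Tbps per fiber (multiple wavelengths)
FIBERS_PER_CHIP = 256      # up to 256 fibers per chip
NVL72_SCALE_UP_TBPS = 7    # NVL72 rack-level "scale-up" network, per the article

aggregate_tbps = TBPS_PER_FIBER * FIBERS_PER_CHIP  # theoretical per-chip ceiling
print(f"Theoretical per-chip aggregate: {aggregate_tbps:.1f} Tbps")
print(f"vs. NVL72 scale-up network: {aggregate_tbps / NVL72_SCALE_UP_TBPS:.0f}x")
```

Even discounted heavily, a ceiling in the hundreds of terabits per chip is why the 7-terabit figure "starts to sound pretty old-fashioned."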
“Photonics is happening much faster than people thought. People have been struggling for years to make it work, but we're getting there,” Harris said. “After seven years of absolutely murderous hardship,” he added.
The photonic interconnect currently available from Lightmatter does 30 terabits, and on-rack optical cabling allows 1,024 GPUs to work in sync in a purpose-built rack of its own design. In case you're wondering why the two numbers don't scale by similar factors, it's because much of what would otherwise need to be networked to another rack can be done on-rack in a 1,000-GPU cluster. (And 100 terabits is on the way anyway.)
Image credit: Lightmatter
This market is huge, Harris noted, with every major data center company, from Microsoft and Amazon to new entrants like xAI and OpenAI, demonstrating an unbridled appetite for computing. “We're connecting buildings! I wonder how long we can keep that up,” he said.
Many of these hyperscalers are already customers, though Harris declined to name them. “Think of Lightmatter a bit like a foundry, like TSMC,” he said. “We don't pick favorites or attach our name to anyone else's brand. We just give them a roadmap and a platform, and help them grow the pie.”
But, he slyly added, “no one quadruples their valuation without leveraging this technology.” That is likely an allusion to OpenAI's recent funding round, which valued that company at $157 billion, but the remark could just as easily apply to his own company.
The $400 million Series D values the company at $4.4 billion, a similar multiple to its mid-2023 valuation, “making us by far the largest photonics company. So that's cool!” Harris said. The round was led by T. Rowe Price Associates, with participation from existing investors Fidelity Management and Research Company and GV.
What's next? Beyond the interconnect, the company is developing new substrates for its chips that can use light to perform even more intimate networking tasks.
Harris speculated that, apart from interconnect, power per chip will be a major differentiator going forward. “In 10 years, everyone will have a wafer-scale chip. That's the only way to get more performance per chip,” he says. Cerebras is of course already working on this, but whether the true value of that advancement can be captured at this stage of the technology is an open question.
But Harris sees the chip industry hitting a wall, and he's preparing for the next step. “Ten years from now, interconnect will be Moore's Law,” he said.