Ampere and Qualcomm aren't the most obvious partners. Both, after all, offer Arm-based chips for running data center servers (though Qualcomm's biggest market remains mobile). But as the two companies announced today, they are joining forces to deliver an AI-focused server that uses Ampere's CPUs and Qualcomm's Cloud AI 100 Ultra AI inference chips to run models rather than train them.
Like other chipmakers, Ampere is looking to profit from the AI boom. The company has always focused on fast, power-efficient server chips, however, so while it can use Arm IP to add some AI-specific features to its designs, that isn't a core competency. That's why Ampere decided to work with Qualcomm (and with SuperMicro, which integrates the two solutions), says Ampere's chief product officer, Jeff Wittich.
“The idea here is to show off the great performance of Ampere CPUs running AI inference solely on the CPU, but also how you can scale out to even larger models (multi-hundred-billion-parameter models, for example). Just like any other workload, AI isn't one-size-fits-all,” Wittich told TechCrunch. “We've been working with Qualcomm on this solution, combining our super-efficient Ampere CPUs, which handle a lot of the general-purpose tasks that run alongside inference, with Qualcomm's very efficient cards. It's a server-level solution.”
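For illustration only, here is a minimal sketch of what CPU-only inference of the kind Wittich describes can look like in practice, using ONNX Runtime's CPU execution provider; the model file name and input shape are assumptions made for the example, not anything Ampere or Qualcomm ship:

```python
# Hedged sketch (not Ampere/Qualcomm code): run a small model's inference
# entirely on the CPU via ONNX Runtime's CPU execution provider.
import numpy as np
import onnxruntime as ort

# "small_model.onnx" is a placeholder; any CPU-friendly ONNX model works here.
session = ort.InferenceSession(
    "small_model.onnx", providers=["CPUExecutionProvider"]
)

input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example image-shaped input

outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```

Larger models that exceed what the CPU cores can serve comfortably would instead be handed off to the Cloud AI 100 Ultra cards in the joint server.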
As for the partnership with Qualcomm, Wittich said Ampere wanted to bring together the best solutions for each part of the stack.
“We've had a really good working relationship with Qualcomm here,” he said. “This is something we've been working on for a while, and I think we share a lot of very similar interests. That's why I think this is really compelling. They're building really efficient solutions in different parts of the market, and we're building really efficient solutions on the server CPU side.”
The Qualcomm partnership is part of Ampere's annual roadmap update. Also on that roadmap is the new 256-core AmpereOne chip, built on a modern 3nm process. These new chips aren't generally available yet, but Wittich says they are ready at the fab and expected to launch later this year.
Beyond the extra cores, the distinctive feature of this new generation of AmpereOne chips is 12-channel DDR5 RAM, which lets Ampere's data center customers better tailor memory access to their users' needs.
The selling point here isn't just performance, though, but also the power consumption and cost of running these chips in data centers. That's especially true for AI inference, where Ampere likes to compare its performance against Nvidia's A10 GPU.
It's worth noting that Ampere has no intention of discontinuing its existing chips in favor of these new ones. Wittich emphasized that there are still plenty of use cases for the older chips.
Ampere also announced another partnership today. The company is working with NETINT to build a joint solution that combines Ampere's CPUs with NETINT's video processing chips. The new server will be able to transcode 360 live video channels in parallel while subtitling 40 streams using OpenAI's Whisper speech-to-text model.
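As a rough sketch of the subtitling half of that workload, transcribing a single audio segment with the open-source Whisper package might look like the following; the file name and model size are assumptions for illustration, and a production pipeline would run many such jobs in parallel alongside the video transcoding:

```python
# Hedged sketch: subtitle one audio segment with OpenAI's open-source Whisper.
# "stream_segment.wav" and the "base" model size are illustrative assumptions.
import whisper

model = whisper.load_model("base", device="cpu")  # small model keeps CPU load modest

result = model.transcribe("stream_segment.wav")
for segment in result["segments"]:
    # Each segment carries start/end timestamps, which is what subtitling needs.
    print(f"[{segment['start']:.1f}s -> {segment['end']:.1f}s] {segment['text']}")
```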
“We started down this path six years ago because it was clear it was the right path,” Ampere CEO Renee James said in today's announcement. “Low power used to be synonymous with low performance. Ampere has proven that isn't true, delivering performance that exceeds that of legacy CPUs.”