Google Cloud Next will be held in Las Vegas this week, and that means it's time for many new instance types and accelerators to come to Google Cloud Platform. In addition to its new custom Arm-based Axion chip, most of this year's announcements are about AI accelerators, whether built by Google or Nvidia.
Just a few weeks ago, Nvidia announced its Blackwell platform, but don't expect Google to offer those machines any time soon. Support for the high-performance Nvidia HGX B200 for AI and HPC workloads and the GB200 NVL72 for large language model (LLM) training is expected to arrive in early 2025. One interesting takeaway from Google's announcement: the GB200 servers will be water-cooled.
That may sound like a somewhat premature announcement, but Nvidia has said its Blackwell chips won't be publicly available until the last quarter of this year.
Before Blackwell
Today, for developers who need more power to train their LLMs, Google also announced A3 Mega instances. Developed in collaboration with Nvidia, these instances feature the industry-standard H100 GPUs but pair them with a new networking system that can deliver up to twice the bandwidth per GPU.
Another new A3 instance is A3 Confidential, which Google says lets customers “better protect the confidentiality and integrity of sensitive data and AI workloads during training and inference.” The company has long offered confidential computing services that encrypt data in use, and here, once confidential computing is enabled, data transfers between Intel's CPUs and Nvidia's H100 GPUs are encrypted. Google says no code changes are required.
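For developers who want a sense of what opting in looks like, here is a minimal sketch of requesting a Confidential VM through the google-cloud-compute Python client. The project, zone and machine-type names are placeholders (Google hasn't published the exact A3 Confidential machine types), but the ConfidentialInstanceConfig field is the standard switch for confidential computing on Compute Engine:

```python
from google.cloud import compute_v1

PROJECT = "my-project"   # placeholder project ID
ZONE = "us-central1-a"   # placeholder zone

# Boot disk based on a public Debian image.
boot_disk = compute_v1.AttachedDisk(
    boot=True,
    auto_delete=True,
    initialize_params=compute_v1.AttachedDiskInitializeParams(
        source_image="projects/debian-cloud/global/images/family/debian-12"
    ),
)

instance = compute_v1.Instance(
    name="a3-confidential-demo",
    # Hypothetical machine type; the announcement didn't include
    # the exact A3 Confidential machine-type names.
    machine_type=f"zones/{ZONE}/machineTypes/a3-highgpu-8g",
    disks=[boot_disk],
    network_interfaces=[
        compute_v1.NetworkInterface(network="global/networks/default")
    ],
    # Confidential VMs can't live-migrate, so maintenance must terminate them.
    scheduling=compute_v1.Scheduling(on_host_maintenance="TERMINATE"),
    # This flag is what turns on confidential computing for the VM.
    confidential_instance_config=compute_v1.ConfidentialInstanceConfig(
        enable_confidential_compute=True
    ),
)

operation = compute_v1.InstancesClient().insert(
    project=PROJECT, zone=ZONE, instance_resource=instance
)
operation.result()  # block until the create operation completes
```

The notable part is how little it is: one config object on the instance, which squares with Google's claim that no application code changes are required.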
As for Google's own chips, the company on Tuesday made the most powerful of its AI accelerators, the Cloud TPU v5p processor, generally available. These chips promise a 2x improvement in floating point operations per second and a 3x improvement in memory bandwidth over the previous TPU v4 generation.
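Nothing about targeting a v5p is exotic from the framework side. As a quick sketch, this is the standard way a developer on a Cloud TPU VM might confirm that JAX sees the TPU cores and is dispatching work to them (none of it is specific to v5p):

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM, JAX should enumerate the attached TPU cores;
# elsewhere this falls back to CPU (or GPU) devices.
print(jax.devices())

# A toy matrix multiply; JAX dispatches it to the first available accelerator.
x = jnp.ones((4096, 4096), dtype=jnp.bfloat16)
y = (x @ x).block_until_ready()
print(y.shape, y.dtype)
```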
All of these high-speed chips require an underlying architecture to support them. So in addition to new chips, Google on Tuesday also announced new AI-optimized storage options. Google says Hyperdisk ML, currently in preview, is its next-generation block storage service that can improve model load times by up to 3.7x.
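Hyperdisk volumes are provisioned like any other Compute Engine block device, so a rough sketch of creating one with the Python client might look like the following. Note that "hyperdisk-ml" as the disk-type string is an assumption modeled on Google's naming for existing Hyperdisk tiers, and details could change while the service is in preview:

```python
from google.cloud import compute_v1

PROJECT = "my-project"   # placeholder project ID
ZONE = "us-central1-a"   # placeholder zone

disk = compute_v1.Disk(
    name="model-weights",
    size_gb=500,
    # Assumed disk-type name for Hyperdisk ML, by analogy with
    # existing types like "hyperdisk-balanced"; still in preview.
    type_=f"zones/{ZONE}/diskTypes/hyperdisk-ml",
)

operation = compute_v1.DisksClient().insert(
    project=PROJECT, zone=ZONE, disk_resource=disk
)
operation.result()  # block until the disk is created
```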
Google Cloud is also launching a number of more traditional instances powered by Intel's fourth- and fifth-generation Xeon processors. The new general-purpose C4 and N4 instances, for example, feature the fifth-generation Emerald Rapids Xeons, with C4 focused on performance and N4 on price. The new C4 instances are currently in private preview, while the N4 machines are generally available starting today.
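Since N4 is generally available today, one practical first step is simply enumerating the new machine shapes in a zone. Here's a short sketch with the same Python client; the project and zone are placeholders, and the "c4-" and "n4-" prefixes follow Google's usual machine-type naming convention:

```python
from google.cloud import compute_v1

PROJECT = "my-project"   # placeholder project ID
ZONE = "us-central1-a"   # placeholder zone

client = compute_v1.MachineTypesClient()

# List every machine type in the zone and keep only the new C4/N4 shapes.
for mt in client.list(project=PROJECT, zone=ZONE):
    if mt.name.startswith(("c4-", "n4-")):
        print(f"{mt.name}: {mt.guest_cpus} vCPUs, {mt.memory_mb} MB RAM")
```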
Also in preview are the C3 bare-metal machines, powered by the older fourth-generation Intel Xeons, the X4 memory-optimized bare-metal instances and the Z3, Google Cloud's first storage-optimized virtual machine, which promises to deliver “the highest IOPS for storage-optimized instances among the major clouds.”