CoreWeave Cloud Provider: Availability of Nvidia GB200 NVL72 Instances
Instances feature rack-level NVLink connectivity and Nvidia Quantum-2 InfiniBand networking delivering 400Gb/s of bandwidth per GPU through a rail-optimized topology for clusters of up to 110,000 GPUs
This is a Press Release edited by StorageNewsletter.com on March 12, 2025 at 2:31 pm

CoreWeave announced it is among the first cloud providers to make Nvidia GB200 NVL72-based instances generally available.
CoreWeave’s GB200 NVL72-powered cluster is built on the Nvidia GB200 Grace Blackwell Superchip – bringing performance and scalability to the next level, empowering customers to rapidly train, deploy, and scale the world’s most complex AI models.
“Today’s milestone further solidifies our leadership position and ability to deliver cutting-edge technology faster and more efficiently,” said Brian Venturo, co-founder and chief strategy officer, CoreWeave. “Today’s launch is another in our series of firsts, and represents a force multiplier for businesses to drive innovation while maintaining efficiency at scale. CoreWeave’s portfolio of cloud services – such as CoreWeave Kubernetes Service, Slurm on Kubernetes (SUNK), and our Observability platform – is purpose-built to make it easier for our customers to run, manage, and scale AI workloads on cutting-edge hardware. We’re eager to see how companies take their AI deployments to the next level with Nvidia GB200 NVL72-based instances on CoreWeave.”
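As a rough illustration of how workloads reach that hardware through a Kubernetes-based service such as CoreWeave Kubernetes Service, the sketch below submits a small GPU job with the official kubernetes Python client. It is not CoreWeave's documented interface: the node-selector label, GPU count, and container image are illustrative assumptions.

```python
# Hypothetical sketch: submitting a GPU job to a Kubernetes cluster (for example
# one provisioned through CoreWeave Kubernetes Service) with the official
# `kubernetes` Python client. The node-selector label, GPU count, and container
# image are illustrative assumptions, not CoreWeave's documented API.
from kubernetes import client, config

config.load_kube_config()  # use the kubeconfig supplied for the cluster

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="gb200-smoke-test"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                node_selector={"gpu.nvidia.com/class": "GB200"},  # assumed label
                containers=[
                    client.V1Container(
                        name="check-gpus",
                        image="nvcr.io/nvidia/pytorch:25.01-py3",  # illustrative image
                        command=["nvidia-smi"],
                        resources=client.V1ResourceRequirements(
                            limits={"nvidia.com/gpu": "4"}  # GPUs requested per pod
                        ),
                    )
                ],
            )
        )
    ),
)

# Create the job; the scheduler places it on a matching GB200 node.
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```

On SUNK, the equivalent request would typically be expressed as a Slurm batch script instead of a Job manifest, with Kubernetes handling placement underneath.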
The promise of next-generation AI powered by foundational and reasoning models is enormous. Scalability of cutting-edge models can be vastly constrained by server limitations – especially when it comes to memory capacity and communication speeds between GPUs. The CoreWeave GB200 NVL72 instances feature rack-level NVLink connectivity and Nvidia Quantum-2 InfiniBand networking delivering 400Gb/s of bandwidth per GPU through a rail-optimized topology for clusters of up to 110,000 GPUs. Leveraging Nvidia Quantum-2’s SHARP In-Network Computing technology, collective communication can be further optimized, resulting in ultra-low latency and accelerated training speeds. CoreWeave’s purpose-built, no-compromises approach to AI workloads, integrated with Nvidia’s world-class architecture, enables companies to harness the full power of the superchip efficiently, in a highly performant and reliable environment.
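To make the collective-communication point concrete, the minimal sketch below runs an all-reduce across GPUs with PyTorch's NCCL backend, the kind of operation that NVLink, InfiniBand, and SHARP in-network reduction accelerate. It is a generic example rather than CoreWeave-specific code, and the note about enabling SHARP through NCCL's CollNet path is an assumption about typical cluster configuration.

```python
# Generic sketch (not CoreWeave-specific): a minimal all-reduce over PyTorch's
# NCCL backend, the collective that NVLink/InfiniBand fabrics and SHARP
# in-network reduction accelerate. Launch with, e.g.:
#   torchrun --nproc_per_node=4 allreduce_check.py
# Enabling SHARP is assumed to go through NCCL's CollNet path (e.g. setting
# NCCL_COLLNET_ENABLE=1 in the environment); this script does not configure it.
import os

import torch
import torch.distributed as dist


def main() -> None:
    dist.init_process_group(backend="nccl")  # NCCL rides NVLink / InfiniBand
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    rank = dist.get_rank()
    world = dist.get_world_size()

    # Each rank contributes (rank + 1); after all_reduce every GPU holds the sum.
    x = torch.full((1024,), float(rank + 1), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    expected = world * (world + 1) / 2
    if rank == 0:
        print(f"all_reduce result {x[0].item()} (expected {expected})")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```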
Specifically:
- Up to 30X faster real-time large language model (LLM) inference compared to the previous GPU generation.
- Up to 25X lower total cost of ownership (TCO) and 25X less energy for real-time inference.
- Up to 4X faster LLM training compared to the previous generation.
This latest development furthers CoreWeave’s journey as an industry leader in AI infrastructure. Last August, the company was among the first to offer Nvidia H200 GPUs to train the fastest GPT-3 LLM workloads. In November, it was one of the first to demo Nvidia GB200 systems in action. Earlier this month, the company announced it will deliver one of the first Nvidia GB200 Grace Blackwell Superchip-enabled AI supercomputers to IBM for training its next generation of Granite models.
“Partnering with CoreWeave to access cutting-edge AI compute, including IBM Spectrum Scale Storage, to train our IBM Granite models demonstrates our commitment to advancing a hybrid cloud strategy for AI,” said Priya Nagpurkar, VP, hybrid cloud and AI platform research, IBM Corp. “As we continue to develop hybrid cloud and AI solutions, we are committed to delivering best-in-class innovations to our enterprise clients, from purpose-built Granite models, to advanced hybrid cloud platform and compute capabilities.”
“Scaling for inference and training is one of the largest challenges for organizations developing next generation AI workloads,” said Ian Buck, VP, Hyperscale and HPC, Nvidia Corp. “Nvidia is collaborating with CoreWeave to enable fast, efficient generative and agentic AI with the Nvidia GB200 Grace Blackwell Superchip to empower organizations of all sizes to push the boundaries of AI, reinvent their businesses and provide groundbreaking customer experiences.”