
Supermicro Liquid-Cooled SuperClusters for AI Data Centers

Powered by Nvidia GB200 NVL72 and HGX B200 systems to deliver energy-efficient exascale computing

Supermicro, Inc. is accelerating the industry’s transition to liquid-cooled data centers with the Nvidia Blackwell platform, delivering a new paradigm of energy efficiency to meet the rapidly growing energy demand of new AI infrastructure.

The company’s end-to-end liquid-cooling solutions are powered by the Nvidia GB200 NVL72 platform for exascale computing in a single rack and have started sampling to select customers ahead of full-scale production in late 4Q24. In addition, the firm’s recently announced X14 and H14 4U liquid-cooled systems and 10U air-cooled systems are production-ready for the Nvidia HGX B200 8-GPU system.

“We’re driving the future of sustainable AI computing, and our liquid-cooled AI solutions are rapidly being adopted by some of the most ambitious AI infrastructure projects in the world with over 2,000 liquid-cooled racks shipped since June 2024,” said Charles Liang, president and CEO, Supermicro. “Supermicro’s end-to-end liquid-cooling solution, with the Nvidia Blackwell platform, unlocks the computational power, cost-effectiveness, and energy-efficiency of the next generation of GPUs, such as those that are part of the Nvidia GB200 NVL72, an exascale computer contained in a single rack. Supermicro’s extensive experience in deploying liquid-cooled AI infrastructure, along with comprehensive on-site services, management software, and global manufacturing capacity, provides customers a distinct advantage in transforming data centers with the most powerful and sustainable AI solutions.”

SuperCluster 9-rack configuration: 32x Nvidia HGX H100/H200 8-GPU 8U air-cooled systems

The company’s liquid-cooled SuperClusters for Nvidia GB200 NVL72 platform-based systems feature advanced in-rack or in-row coolant distribution units (CDUs) and custom cold plates designed for the compute tray, which houses 2 Nvidia GB200 Grace Blackwell Superchips in a 1U form factor. Combined with the company’s end-to-end liquid-cooling solution, Supermicro’s Nvidia GB200 NVL72 delivers exascale AI computing capabilities in a single rack.

The rack solution incorporates 72 Nvidia Blackwell GPUs and 36 Nvidia Grace CPUs, interconnected by Nvidia’s 5th Gen NVLink network. The Nvidia NVLink Switch system facilitates 130TB/s of total GPU communication with extremely low latency, enhancing performance for AI and HPC workloads. In addition, the firm supports the recently announced Nvidia GB200 NVL2 platform, a 2U air-cooled system featuring 2 tightly coupled Nvidia Blackwell GPUs and 2 Nvidia Grace CPUs, which is suited to diverse workloads such as LLM inference, RAG, data processing, and HPC applications.
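As a rough illustration of the rack-level figures above, the quoted aggregate NVLink bandwidth can be divided across the GPUs in the rack. This is a back-of-the-envelope sketch, not a Supermicro or Nvidia tool: the constant and function names are hypothetical, and it assumes the 130TB/s total is shared evenly across all 72 GPUs.

```python
# Back-of-the-envelope sketch; names are illustrative, not from any vendor API.
TOTAL_NVLINK_TBPS = 130.0  # total GPU communication bandwidth quoted for the rack, in TB/s
GPUS_PER_RACK = 72         # Blackwell GPUs in one GB200 NVL72 rack


def per_gpu_bandwidth_tbps(total_tbps: float, gpus: int) -> float:
    """Assume the aggregate bandwidth is split evenly across all GPUs."""
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    return total_tbps / gpus


per_gpu = per_gpu_bandwidth_tbps(TOTAL_NVLINK_TBPS, GPUS_PER_RACK)
print(f"{per_gpu:.2f} TB/s per GPU")  # 1.81 TB/s per GPU
```

Under that even-split assumption, each GPU gets roughly 1.8TB/s of NVLink bandwidth.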


The company’s 4U liquid-cooled systems and 10U air-cooled systems support the Nvidia HGX B200 8-GPU system and are ready for production. The newly developed cold plates and the 250kW-capacity in-rack coolant distribution unit maximize the performance and efficiency of the 8-GPU systems, providing 64x 1,000W Nvidia Blackwell GPUs and 16x 500W CPUs in a single 48U rack. Up to 4 of the 10U air-cooled systems can be installed and fully integrated in a rack, the same density as the previous generation, while providing up to 15x the inference performance and 3x the training performance.
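To put the liquid-cooled rack figures in context, the rated GPU and CPU power can be summed and compared with the CDU capacity. The sketch below is a simplified estimate, not Supermicro's sizing method: the names are hypothetical, it treats rated processor power as heat load, and it ignores memory, networking, storage, and power-conversion losses.

```python
# Illustrative estimate only; names are hypothetical and the model is simplified.
GPU_COUNT, GPU_WATTS = 64, 1_000  # 64x 1,000W Blackwell GPUs per 48U rack
CPU_COUNT, CPU_WATTS = 16, 500    # 16x 500W CPUs per 48U rack
CDU_CAPACITY_KW = 250             # in-rack coolant distribution unit capacity


def processor_load_kw(gpu_count: int, gpu_watts: int,
                      cpu_count: int, cpu_watts: int) -> float:
    """Sum the rated GPU and CPU power and convert watts to kilowatts."""
    return (gpu_count * gpu_watts + cpu_count * cpu_watts) / 1_000


load_kw = processor_load_kw(GPU_COUNT, GPU_WATTS, CPU_COUNT, CPU_WATTS)
share = load_kw / CDU_CAPACITY_KW
print(f"Processor load: {load_kw:.0f} kW ({share:.0%} of CDU capacity)")
# Processor load: 72 kW (29% of CDU capacity)
```

Because the estimate counts only GPUs and CPUs, the real rack load is higher and the actual headroom against the 250kW CDU is smaller than this figure suggests.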

SuperCloud Composer software, Supermicro’s data center management platform, provides tools to monitor vital information on liquid-cooled systems and racks, coolant distribution units, and cooling towers, including pressure, humidity, and pump and valve conditions. SuperCloud Composer’s Liquid Cooling Consult Module (LCCM) optimizes operational cost and manages the integrity of liquid-cooled data centers.

To scale infrastructure for multi-trillion-parameter AI models, Supermicro is at the forefront of adopting networking innovations for both InfiniBand and Ethernet, including Nvidia BlueField-3 SuperNICs and ConnectX-7 at 400Gb, as well as ConnectX-8, Spectrum-4, and Quantum-3 to enable 800Gb networking for the Nvidia Blackwell platform. Nvidia Spectrum-X Ethernet with Supermicro’s 4U liquid-cooled and 8U air-cooled Nvidia HGX H100 and H200 system clusters now powers one of the largest AI deployments to date.

From proof-of-concept (PoC) to full-scale deployment, the company is a one-stop shop, providing all necessary technologies, liquid cooling, networking solutions, and onsite installation services. The firm delivers an in-house-designed liquid-cooling ecosystem, encompassing custom-designed cold plates optimized for various GPUs, CPUs, and memory modules, along with multiple CDU form factors and capacities, manifolds, hoses, connectors, cooling towers, and monitoring and management software. This end-to-end solution integrates into rack-level configurations, boosting system efficiency, mitigating thermal throttling, and reducing both the TCO and the environmental impact of data center operations for the era of AI.
