WekaIO Received Certification for Nvidia DGX BasePOD Reference Architecture Built on Nvidia DGX H100 Systems and Weka Data Platform
Rack-dense architecture delivers throughput starting at 600GB/s and 22 million IO/s in 8 rack units to optimize Nvidia DGX H100 systems.
This is a Press Release edited by StorageNewsletter.com on January 18, 2024, at 2:01 pm.
WekaIO, Inc. received certification for an Nvidia DGX BasePOD reference architecture built on Nvidia DGX H100 systems and the Weka Data Platform.
This rack-dense architecture delivers massive data storage throughput starting at 600GB/s and 22 million IO/s in 8 rack units to optimize the DGX H100 systems.
The Weka Data Platform provides the critical data infrastructure foundation required to support next-generation, performance-intensive workloads such as generative AI model training and inference at scale. Its advanced, software-based architecture transforms stagnant storage silos into dynamic data pipelines that feed data-starved GPUs more efficiently and power AI workloads on-premises, in the cloud, at the edge, or in hybrid and multicloud environments.
With its modular design, the DGX BasePOD offers the flexibility needed to scale resources with evolving computational needs, providing cost efficiency and streamlined management. Integrating Nvidia H100 Tensor Core GPUs and Nvidia InfiniBand networking delivers enhanced performance across diverse AI workloads, enabling faster model training and deployment. Designed with a focus on agility, efficiency, and performance, it is a robust solution for organizations that want to optimize their data infrastructure investments while pushing the boundaries of AI innovation.
The Weka with Nvidia DGX BasePOD reference architecture solution efficiently delivers the performance enterprise customers need to accelerate AI adoption and achieve faster time to insights, discoveries, and outcomes.
Key benefits of Weka with Nvidia DGX BasePOD reference architecture include:
- Performance for demanding AI workloads: Delivers 10x the bandwidth and 6x more IO/s than the previous Weka with Nvidia DGX BasePOD configuration based on Nvidia DGX A100 systems.
- Best-of-breed compute: Nvidia DGX H100 systems featuring Intel Xeon processors and Nvidia ConnectX-7 NICs, networked with Nvidia Quantum-2 InfiniBand and Nvidia Spectrum Ethernet switches.
- Optimal efficiency: The rack-dense configuration delivers the performance needed to support up to 16 DGX H100 systems in a space- and energy-efficient footprint, and is expected to support larger clusters of 32 or more DGX H100 systems.
- Excellent linear scaling: In Nvidia’s validation testing across a range of demanding AI/ML workloads, Weka’s integration with the DGX BasePOD architecture lets organizations start small and then quickly and independently scale compute and storage resources, from a single DGX system to multi-rack configurations, with predictable performance that flexibly meets workload requirements.
- Turnkey choice and flexibility: Enterprise customers can use Weka Data Platform software with DGX BasePOD, powered by the latest DGX systems, to drive technologies and gain a time-to-market advantage.
Weka’s DGX BasePOD certification advances its journey to DGX SuperPOD certification. The company was among the 1st to implement, qualify, and use Nvidia GPUDirect Storage (GDS), was one of the 1st DGX BasePOD-certified data stores in 2021, and assisted Nvidia with scaling its networking architectures to expand its enterprise customer footprint.
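For readers unfamiliar with GDS: it lets an application read file data straight into GPU memory through Nvidia's cuFile API, bypassing a CPU bounce buffer, which is how a parallel filesystem can feed GPUs directly. The sketch below is a minimal illustration of that read path only, not Weka- or BasePOD-specific code; the file path and transfer size are placeholders, and error checking is omitted for brevity.

```c
// Minimal GPUDirect Storage read sketch using Nvidia's cuFile API.
// Build (assuming a GDS-enabled CUDA install): nvcc gds_read.c -lcufile -o gds_read
#define _GNU_SOURCE          // for O_DIRECT
#include <fcntl.h>
#include <unistd.h>
#include <cuda_runtime.h>
#include <cufile.h>

int main(void) {
    const char *path = "/mnt/weka/dataset.bin";  // placeholder path on a GDS-capable filesystem
    const size_t size = 1 << 20;                 // 1 MiB transfer, chosen only for illustration

    cuFileDriverOpen();                           // initialize the GDS driver

    int fd = open(path, O_RDONLY | O_DIRECT);     // GDS reads require an O_DIRECT file descriptor
    CUfileDescr_t descr = {0};
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);        // register the file with cuFile

    void *devBuf;
    cudaMalloc(&devBuf, size);                    // destination buffer in GPU memory
    cuFileBufRegister(devBuf, size, 0);           // optional: register the GPU buffer for reuse

    // Read 'size' bytes from file offset 0 directly into GPU memory,
    // skipping the usual host-memory bounce buffer.
    ssize_t n = cuFileRead(handle, devBuf, size, 0, 0);
    (void)n;                                      // error handling omitted in this sketch

    cuFileBufDeregister(devBuf);
    cuFileHandleDeregister(handle);
    cudaFree(devBuf);
    close(fd);
    cuFileDriverClose();
    return 0;
}
```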
“Weka is proud to have achieved this important milestone with Nvidia. With our DGX BasePOD certification completed, our DGX SuperPOD certification is now in progress,” said Nilesh Patel, CPO, Weka. “With that will come an exciting new deployment option for Weka Data Platform customers. Watch this space.”
“Enterprises everywhere are embracing AI to enrich customer experiences and drive better business outcomes,” said Tony Paikeday, senior director, AI systems, Nvidia Corp. “With the Nvidia DGX BasePOD certification, Weka can help enterprises streamline their AI initiatives with optimized, high-performance infrastructure solutions that deliver data-fueled insights sooner.”
Resource:
Scaling Deep Learning with Weka and Nvidia DGX A100 BasePOD reference architecture