Panmnesia Breaks Limitation of GPU Memory Capacity with World’s First 2-Digit Nanosecond Latency CXL Controller
Revealed CXL controller featuring CXL ASIC IP, which exhibits shorter round-trip latency than any other reported value.
This is a Press Release edited by StorageNewsletter.com on July 1, 2024 at 2:01 pm
- Panmnesia, an IP provider, revealed a CXL controller featuring its own CXL ASIC IP, which exhibits shorter round-trip latency than any other reported value.
- Firm’s low-latency CXL IP would be a key component for the ideal memory expansion.
- Panmnesia introduces the use case of the CXL IP: a storage-based expansion system that can overcome the limitation of GPU memory capacity.
Panmnesia, Inc. revealed the world’s first 2-digit nanosecond latency CXL controller, featuring its own CXL ASIC IP. The latency is shorter than any other reported value.
Panmnesia’s CXL IP could be applied to various system devices to realize cost-efficient memory expansion with high performance. As a usage example of the CXL IP, the company introduced a storage expansion solution named CXL-GPU that can supply terabytes of memory to the GPU. The firm will unveil this technology at USENIX federated conferences and ACM HotStorage, held in Santa Clara, CA, in July.
CXL: Key for cost-efficient memory expansion
The memory requirement in data centers increases as large-scale applications such as large language models (LLM) become more common in our lives. To expand memory capacity cost efficiently while maintaining reasonable performance, many big tech companies are focusing on an interconnect technology called Compute Express Link (CXL).
With CXL, it is possible to construct a scalable, integrated memory space by connecting multiple system devices. Since memory resources or computational resources can be added independently based on demand, CXL enables memory expansion with a minimized total cost of ownership (TCO). Furthermore, users do not need to manually manage the integrated memory space, as the CXL controller handles a set of memory management operations (e.g., cache coherence management) in a hardware-automated manner.
Panmnesia’s CXL controller with 2-digit nanosecond latency: key component for efficient CXL system
Although CXL appears to be a promising solution for efficient memory expansion, one challenge remains to meet customers’ expectations: latency. Customers expect memory expansion without a significant performance drop compared to local memory access, making it crucial to minimize the additional latency caused by memory expansion.
Recently, the company developed a memory expansion technology that meets customers’ demand for low latency. The firm has developed its CXL controller IP, which executes all of CXL’s communication operations with low latency, and has completed the silicon manufacturing process for it. A CXL controller featuring the company’s CXL IP exhibits 2-digit nanosecond round-trip latency, which is shorter than any other reported value.
A Panmnesia representative stated: “It was able to achieve such low latency by fully optimizing the controller’s operations across all relevant layers, including the physical layer, link layer, and transaction layer.”
The CXL controller IP developed by Panmnesia can be applied to a range of system devices, such as CPUs, switches, accelerators, and memory expanders, to automate and accelerate a set of memory management operations. By doing this, customers will be able to realize their ideal memory expansion, which reduces TCO with minimal performance degradation.
CXL-GPU: Application of Panmnesia’s high-speed CXL controller IP
While latency optimization is a common requirement across various industries, it is particularly important for companies providing AI services, such as LLMs and recommendation systems. This is because latency directly impacts user satisfaction and revenue generation of AI services.
As an example usage of the low-latency CXL controller IP, the company introduces ‘CXL-GPU,’ a GPU storage expansion solution that can increase the memory capacity of the GPU while optimizing the latency of AI services. This solution constructs a terabyte-scale memory space by connecting multiple storage devices via CXL and integrates it into the GPU memory space. Data and model parameters for large-scale AI services can be stored in this integrated memory space, allowing the GPU to access and process this data for AI services. The company’s low-latency CXL controllers are deployed in both the GPU and the storage devices, where they handle memory management operations.
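To make the idea of an integrated memory space concrete, the following is a minimal sketch and not Panmnesia's implementation: it models several devices laid out back to back in one flat address range and resolves an address to the device that backs it. The class names, device labels, and capacities are all hypothetical, chosen only for illustration; in the solution described above, this kind of address resolution is performed by the CXL controllers in hardware rather than in software.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str  # hypothetical device label, not a real product name
    size: int  # capacity in bytes (illustrative value)


class UnifiedMemorySpace:
    """Toy model: devices placed back to back in one flat address range."""

    def __init__(self, regions):
        self.regions = list(regions)
        if any(r.size <= 0 for r in self.regions):
            raise ValueError("every region needs a positive size")
        # bases[i] is the first unified address owned by regions[i].
        self.bases = []
        base = 0
        for r in self.regions:
            self.bases.append(base)
            base += r.size
        self.total = base

    def resolve(self, addr):
        """Return (device name, offset inside that device) for an address."""
        if not 0 <= addr < self.total:
            raise ValueError(f"address {addr:#x} is outside the unified space")
        # Find the last region whose base address is <= addr.
        i = bisect_right(self.bases, addr) - 1
        return self.regions[i].name, addr - self.bases[i]


GiB = 2**30
# Assumed layout: local GPU memory first, then two CXL-attached storage devices.
space = UnifiedMemorySpace([
    Region("gpu_local", 16 * GiB),
    Region("cxl_storage_0", 512 * GiB),
    Region("cxl_storage_1", 512 * GiB),
])
print(space.total // GiB)
print(space.resolve(16 * GiB + 4096))
```

In this sketch, adding capacity means appending another region to the list, which mirrors the selective addition of memory resources that the release describes. The sketch does not model latency, caching, or coherence.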
Unlike the traditional method of purchasing multiple high-end GPUs to secure memory capacity, the firm’s CXL-based storage expansion solution minimizes server construction costs for AI services, since it enables the selective addition of memory resources based on demand. Furthermore, the firm’s low-latency CXL controller minimizes the overhead caused by memory expansion, thereby maintaining reasonable performance.
Panmnesia has built a hardware prototype of CXL-GPU by integrating its CXL controller IP with a custom GPU, which was developed based on an open-source framework. Evaluation results on this prototype show that the CXL-GPU solution outperforms a traditional GPU memory expansion solution by 3.23x.
“The GPU storage expansion solution based on our CXL controller IP would help AI-service providers to significantly reduce their server construction cost,” said Dr. Myoungsoo Jung, CEO. “While we introduced the solution to expand GPU memory capacity, our low-latency CXL controller IP is also a key component for efficient memory expansion in cloud computing and HPC.”
CXL-GPU, Panmnesia’s GPU storage expansion solution using their CXL controller IP, will be unveiled at USENIX federated conferences and ACM HotStorage, held in Santa Clara, CA, in July.
(1) CXL IP is a product that implements CXL interface functions into circuit blocks so that various system devices can adopt CXL technology.
Resources:
Technical report of CXL-GPU
CXL IP product information (Design & Reuse)