R&D: Building Stream-based Page Cache to Accelerate File Scanning on Fast Storage Devices
Authors propose StreamCache, new page cache management system for file scanning on fast storage devices
This is a Press Release edited by StorageNewsletter.com on March 28, 2025 at 2:00 pm
ACM Transactions on Storage has published an article written by Zhiyue Li, Tsinghua University, Beijing, China, and Guangyan Zhang, Dept. of Computer Science and Technology, Tsinghua University, Beijing, China.
Abstract: “Buffered I/O via page cache is used for file scanning in many cases as page cache can provide buffering, data aggregation, I/O alignment, and prefetching transparently. However, our study indicates that employing page cache for file scanning on fast storage devices presents two performance issues: it offers limited I/O bandwidth that does not align with the performance of fast storage devices, and the intensive background writeback onto fast storage devices can significantly interfere with foreground I/O requests.”
“In this paper, we propose StreamCache, a new page cache management system for file scanning on fast storage devices. StreamCache exploits three techniques to achieve high I/O performance. First, it uses a two-layer memory management method to accelerate page allocation by leveraging CPU cache locality. Second, it uses a stream-based page reclaiming method to lower the interference to foreground I/O requests. Third, it uses a lightweight stream tracking method to record the states of cached pages at the granularity of sequential streams to support stream-based page reclaiming.”
“We implement StreamCache in XFS. Experimental results show that compared with existing methods, StreamCache can increase the I/O bandwidth of scientific applications by 44%, and reduce the checkpoint/restart time of large language models by 15.7% on average.”
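The idea of tracking cached pages per sequential stream rather than per page can be illustrated with a small user-space sketch. This is not the authors' implementation, which lives inside XFS in the kernel. The names `Stream`, `StreamTracker`, `record`, and `reclaim_oldest`, the 4 KiB page size, and the oldest-first reclaim policy are all assumptions made for illustration only.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class Stream:
    """One sequential run of cached pages, as a half-open page range."""
    start: int           # first cached page number
    end: int             # one past the last cached page number
    dirty: bool = False  # whether any page in the run awaits writeback


class StreamTracker:
    """Hypothetical tracker: one record per sequential stream, not per page."""

    def __init__(self):
        self.streams = []  # oldest stream first

    def record(self, offset, length, dirty=False):
        """Note an access of `length` bytes at byte `offset`."""
        first = offset // PAGE_SIZE
        last = (offset + length + PAGE_SIZE - 1) // PAGE_SIZE
        for s in self.streams:
            # An access that starts inside or right after a run extends it.
            if s.start <= first <= s.end:
                s.end = max(s.end, last)
                s.dirty = s.dirty or dirty
                return s
        s = Stream(first, last, dirty)
        self.streams.append(s)
        return s

    def reclaim_oldest(self):
        """Drop the oldest stream as a unit; return the pages freed."""
        if not self.streams:
            return 0
        s = self.streams.pop(0)
        return s.end - s.start


tracker = StreamTracker()
for i in range(4):
    tracker.record(i * 8192, 8192)               # one sequential scan
tracker.record(10 * 1024 * 1024, 4096, dirty=True)  # unrelated access
```

In this sketch, four back-to-back 8 KiB reads collapse into a single record covering eight pages, so a reclaim decision touches one entry instead of eight. That compactness is the property the abstract attributes to tracking at stream granularity.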