New Storage Software Company Leil Storage Developing SaunaFS, Distributed File System Inspired by Google
New open source software targeting secondary storage
By Philippe Nicolas | April 8, 2024 at 2:02 pm

For their own usage, hyperscalers developed their own file systems because they couldn't find commercial or other solutions on the market able to address their needs in terms of scalability, throughput, and resiliency at a very low cost.
After developing and running multiple giant Google File System (GFS) clusters for several years, Google created its successor, the Colossus File System, with essentially distributed metadata servers storing metadata in BigTable, a smaller chunk size of 1MB instead of 64MB, and the ability to scale 100x beyond the largest GFS clusters. You have probably read the original GFS paper published in 2003 and also the page dedicated to Colossus.
Facebook unveiled its Tectonic File System, going beyond f4 and Haystack.
HDFS, aka Hadoop Distributed File System, was also inspired by the Google File System according to Doug Cutting, then at Yahoo!, and moved from the Apache Nutch project to the Hadoop project in 2006.
We can also list at least 10 products or projects from around the world that have roots in these designs, and especially in the Google File System.
We select a few of them below:
- We have to speak about Ceph, a project started by Sage Weil in 2005 that became a reference in the domain for on-premises deployments, later for OpenStack, and also in the cloud. The file system layer was released a very long time after the initial product and has now existed for a few years. Red Hat, acquired by IBM in 2019, is a key active contributor to Ceph.
- GlusterFS was also launched in 2005 by Anand Babu Periasamy, more recently CEO and founder of MinIO, and Anand Avati, with the idea of democratizing scalable storage through an open source approach. The product gained significant adoption and the company behind it was finally acquired by Red Hat in 2011.
- The same year, the Fraunhofer Center for HPC in Germany started FhGFS, and in 2014 the institute spun off ThinkParQ as a separate entity to continue the effort on the product, which was renamed BeeGFS.
- In 2007, the Zuse Institute Berlin, with funds from the European Commission, started XtreemFS; the project appears to have been inactive for a few years now according to GitHub. In 2013, 2 key former developers, Björn Kolbeck and Felix Hupfeld, launched a commercial file system company named Quobyte.
- SeaweedFS is another project, inspired by Facebook's Haystack, that appears to have been launched around 2011.
- We have to mention Lustre, initially designed at CMU in 1999 with a first release in 2003. The product is well recognized and adopted in HPC.
- RozoFS, developed by Rozo Systems in 2010 and now an asset owned by Hammerspace, also uses an asymmetric distributed file system design with a single metadata server, protected by DRBD, and a set of data servers. Each client fragments data and sends the fragments to distinct servers thanks to the Mojette Transform, a specific erasure coding technique developed at the University of Nantes, France. In addition, Ganesha and Samba can be layered on each client to offer a scalable NAS solution, as is the case with many of the products covered in this article.
- Two other projects, MooseFS and then LizardFS, were launched in Poland more than 10 years ago. MooseFS came first, with an initial release in 2008, taking its model from Ceph, Lustre and the Google File System. LizardFS was forked from it and seems to be stopped now; MooseFS continues and the team will update it in a few months.
- JuiceFS was launched in 2017 by Juicedata and exists today in 2 editions, community and enterprise.
- And it's worth mentioning pNFS, with its asymmetric model of metadata and data servers as well, and of course the parallelism between clients and the data storage layer (data servers).
The last interesting project is SaunaFS, started by the Estonian company Leil Storage. It uses this model to address secondary storage challenges with Reed-Solomon erasure coding and a MAID-oriented solution.
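To give a sense of why erasure coding matters for secondary storage economics, here is a minimal sketch of the capacity arithmetic. The RS(8,4) parameters below are a hypothetical example for illustration, not SaunaFS's actual configuration, which is not stated here.

```python
# Illustrative Reed-Solomon capacity arithmetic (hypothetical RS(8,4)
# layout -- SaunaFS's actual shard parameters are not given in this article).

def storage_factor(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per logical byte written."""
    return (data_shards + parity_shards) / data_shards

# RS(8,4): each stripe is split into 8 data shards plus 4 parity shards,
# so up to 4 shards (disks or servers) can be lost without data loss.
rs_factor = storage_factor(8, 4)           # 1.5x raw capacity
replication_factor = storage_factor(1, 2)  # 3x for 3-way replication

print(f"RS(8,4): {rs_factor}x raw capacity, tolerates 4 failures")
print(f"3-way replication: {replication_factor}x, tolerates 2 failures")
```

The comparison shows the trade-off that makes erasure coding attractive at scale: half the raw capacity of triple replication while surviving twice as many simultaneous failures, at the cost of encode/decode computation.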
Like the previously listed examples inspired by Google, SaunaFS uses metadata servers (Master, Shadows, Metaloggers), data servers (Chunkservers), and clients (supporting Linux, Windows, macOS and NFS).
The architecture uses a chunk-based design, with file fragments stored in 64MB chunks written as 64KB blocks, each block coupled with a 4-byte CRC.
Beyond its own POSIX client, SaunaFS leverages Ganesha for NFSv4, Samba for SMB, and MinIO for S3 when needed.
What is common across many of these file storage products, demonstrated and confirmed at scale, is that the only model that can sustain such high workloads is an asymmetric architecture with some form of parallelism between the computing consumer layer and the data servers. Hyperscalers illustrate this trend perfectly and triggered plenty of other open source projects.
SaunaFS CEO: Alexander Ragel