IBM S3 Tiering to Tape with NooBaa Part 1 – Introduction
Object storage solutions that provide S3 capabilities are becoming increasingly relevant for new use cases around hybrid cloud storage and storage for data and AI.
This is a Press Release edited by StorageNewsletter.com on March 6, 2024 at 3:02 pm
By Nils Haustein, Jan-Frode Myklebust, Guy Margalit, Khanh V Ngo, IBM Corp.
Modern data lakehouses use object storage as the backend storage for structured and unstructured data. Backup applications use object storage to back up and offload data. One common requirement is to tier objects to tape. This requires an object storage architecture that provides scalable, high-performance storage in the front end and can tier aged objects to tape.
In this series of blog articles, we introduce a new and modern S3 object storage service that can be installed on IBM Storage Scale. The S3 object storage service is provided by open-source software called NooBaa. NooBaa, in combination with IBM Storage Scale and IBM Storage Archive Enterprise Edition, allows transparent tiering of objects and buckets to tape.
This series consists of multiple parts, which will be linked here once published.
- Part 1 (this article): Explains what NooBaa is and highlights the architecture of the solution integrating NooBaa with IBM Storage Scale.
- Part 2: Demonstrates how to install and configure NooBaa on Storage Scale file systems and use NooBaa services to PUT and GET objects and buckets using S3 clients.
- Part 3: Provides a brief introduction to Storage Scale information lifecycle management, which in combination with IBM Storage Archive can tier data to tape. It also demonstrates how S3 buckets and objects can be tiered to tape while providing seamless access to data.
- Part 4: Highlights capabilities that can be used to improve usability for buckets and objects stored on relatively slow tape devices.
What is NooBaa?
NooBaa is a customizable and dynamic data gateway that provides S3 data services over any storage resource, including S3, GCS, Azure Blob, and file systems [1]. It provides S3 endpoints to users and applications and allows full control over data placement with dynamic policies per bucket or account.
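Because the gateway exposes standard S3 endpoints, a generic S3 client should be able to talk to it. The following is a minimal sketch, assuming the third-party boto3 library; the helper names `make_client` and `put_then_get` are hypothetical and do not come from the NooBaa documentation.

```python
def make_client(endpoint_url, access_key, secret_key):
    """Create an S3 client that targets a custom (non-AWS) S3 endpoint."""
    # boto3 is a third-party package; it is imported here so that
    # put_then_get() below can be used with any S3-like client object.
    import boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def put_then_get(s3, bucket, key, payload):
    """PUT payload as an object, then GET it back and return the bytes."""
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    response = s3.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()
```

With placeholder values, a call could look like `put_then_get(make_client("https://s3.example.com", "ACCESS_KEY", "SECRET_KEY"), "demo-bucket", "hello.txt", b"hello")`, assuming the bucket already exists and the account is allowed to write to it.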
NooBaa grew up in the ‘container world’ and is an integral part of Red Hat OpenShift Data Foundation (ODF). It is open source [2], and NooBaa-core can be provided as a standalone software package for Red Hat Enterprise Linux.
The NooBaa-core standalone software package can be deployed on a Storage Scale cluster, where it provides the S3 endpoints while the buckets and objects are stored in Storage Scale file systems. This is the foundation of modern S3 object storage services on IBM Storage Scale. Furthermore, by leveraging the integration of Storage Scale with IBM Storage Archive, S3 buckets and objects stored in the file system can be tiered to tape.
Disclaimer
When deploying open-source NooBaa-core standalone, there are a few things to consider:
- It is open-source and can be used by anybody respecting the associated open-source license.
- As open-source software, it is a work in progress with varying stability and functions.
- Problems discovered with it can be reported as issues in the repository on GitHub [2].
There are plans to integrate NooBaa into Storage Scale as the new, modernized S3 object stack. Initially, this integration will not support tiering to tape. However, you can use NooBaa as an open-source solution to provide S3 object storage with tiering to tape.
Architecture with NooBaa on Storage Scale
Storage Scale is a modern data platform providing a set of storage services, including but not limited to:
- Clustered file systems with no single point of failure
- Parallel, high-performance data access on disk
- Active–active stretched cluster architecture with synchronous replication and site failure tolerance
- Data protection through snapshots, backup, and asynchronous replication across large distances
- Data lifecycle management across different tiers of storage powered by an intelligent policy engine
Figure: Architecture of the solution for tiering S3 objects to tape
The open-source software NooBaa is installed on one or more Storage Scale cluster nodes and provides the S3 object storage endpoints to S3 users and applications. It is configured in namespace file system (NSFS) mode, which stores buckets and objects in file systems. The file systems are provided by Storage Scale, so objects and buckets reside on disk managed by Storage Scale. The Storage Scale policy engine, in combination with IBM Storage Archive, is used to tier objects and buckets to tape. Storage Archive manages the tape resources and writes the objects and buckets to tape in LTFS format.
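To make this data path more concrete, here is a minimal Python sketch of the idea. It does not show NooBaa or Storage Scale internals. It assumes that each bucket is a directory and each object is a file beneath it, and it mimics the kind of age-based selection a policy engine might perform. The function names, the use of modification time, and the age threshold are illustrative assumptions; real tiering is driven by the Storage Scale policy engine together with Storage Archive.

```python
import time
from pathlib import Path
from typing import Iterator, Optional


def object_path(bucket_dir: Path, key: str) -> Path:
    """Map an object key to a file path below the bucket directory.

    Assumed layout: key 'a/b/c.txt' becomes <bucket_dir>/a/b/c.txt.
    """
    parts = key.split("/")
    # Reject empty segments and '.'/'..' so a key cannot escape the bucket.
    if any(part in ("", ".", "..") for part in parts):
        raise ValueError(f"unsafe object key: {key!r}")
    return bucket_dir.joinpath(*parts)


def tiering_candidates(
    bucket_dir: Path, min_age_days: float, now: Optional[float] = None
) -> Iterator[str]:
    """Yield keys of objects last modified at least min_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds per day
    for path in sorted(bucket_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime <= cutoff:
            yield path.relative_to(bucket_dir).as_posix()
```

For example, `list(tiering_candidates(Path("/path/to/bucket"), 30))` would return the keys of all objects in that hypothetical bucket directory that have not been modified for 30 days or more.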
The NooBaa configuration files and object data are stored in distinct shared directories of Storage Scale file systems that are accessible by all NooBaa nodes. The NooBaa configuration files include account and bucket configuration as well as NooBaa service customization. The configuration directory can be in the same file system where the buckets and objects are stored or in a different file system. It is recommended to use a different file system for the NooBaa configuration files.
In the next article in this series, we demonstrate how easy it is to install, configure, and use NooBaa S3 object storage services on Storage Scale. If you want to try it out, you need a Storage Scale cluster (a single node is sufficient). We recommend Storage Scale version 5.1.8 or later on RHEL 8 or 9.
[1] NooBaa documentation
[2] NooBaa-core open source repository on GitHub