IBM General Parallel File System V3.5
Introduces File Placement Optimizer for Linux.
This is a Press Release edited by StorageNewsletter.com on December 18, 2012 at 2:57 pm
The General Parallel File System (GPFS) from IBM Corp. is a cluster file system designed for high-performance, parallel file access and management.
It delivers reliability, scalability, performance, and multicluster support, with automated failure recovery and decentralized data management that simplifies administration. It provides several essential services for managing growing quantities of structured and unstructured data.
The new File Placement Optimizer (FPO) feature, available with GPFS V3.5 for Linux, extends GPFS for a new class of data-intensive applications, commonly referred to as big data applications. These applications process massive amounts of data, with a focus on semantically transforming that data. This class of applications is massively parallel and well suited to programming frameworks such as MapReduce, which allow users to perform large-scale data analysis while the application execution layer handles the system architecture, data partitioning, and task scheduling.
A new license type is introduced for FPO nodes in a GPFS cluster that share data with other FPO nodes but do not use GPFS in any other way that would require a GPFS Server license. Each node in a GPFS cluster that uses the GPFS FPO function requires either a GPFS Server license or the new GPFS FPO license.
GPFS FPO is an extension of GPFS that is designed to support this emerging class of workloads by applying five optimizations:
- Locality awareness to allow compute jobs to be scheduled on nodes where the data resides
- Metablocks that allow large and small block sizes to co-exist in the same file system to meet the needs of different types of applications
- Write affinity that allows applications to dictate the layout of files on different nodes in order to maximize both write and read bandwidth
- Pipelined replication to maximize use of network bandwidth for replication
- Distributed recovery to minimize the effect of failures on ongoing computation
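As an illustration of the first optimization, locality awareness, the following is a minimal sketch of how a job scheduler might prefer nodes that already hold a replica of the data. It is not GPFS code and does not use any GPFS interface: the function name schedule_tasks, the block-to-node map, and the per-node slot counts are all hypothetical, and the greedy strategy is only one simple way such a placement could be made.

```python
def schedule_tasks(block_locations, node_slots):
    """Greedy, locality-aware task placement (illustrative only).

    block_locations: dict mapping block id -> list of nodes holding a replica
    node_slots:      dict mapping node name -> number of free task slots
    Returns a dict mapping block id -> (chosen node, True if data is local).
    """
    free = dict(node_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    for block, replicas in block_locations.items():
        # First choice: a node that stores the block and still has a free slot.
        local = [n for n in replicas if free.get(n, 0) > 0]
        if local:
            node = max(local, key=lambda n: free[n])
            is_local = True
        else:
            # Fallback: any node with a free slot; the data is read remotely.
            candidates = [n for n, slots in free.items() if slots > 0]
            if not candidates:
                raise RuntimeError("no free task slots left")
            node = max(candidates, key=lambda n: free[n])
            is_local = False
        free[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


if __name__ == "__main__":
    # Hypothetical three-node cluster with made-up block placement.
    blocks = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n3"]}
    slots = {"n1": 2, "n2": 1, "n3": 1}
    for block, (node, is_local) in schedule_tasks(blocks, slots).items():
        print(block, node, "local" if is_local else "remote")
```

With enough free slots on the nodes that hold the data, every task runs where its block resides. When a data-holding node is full, the sketch falls back to a remote node, which is the case that locality-aware scheduling tries to keep rare.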
Planned availability date
December 14, 2012