Efficient Management of Legal and Compliance Requests on D2D Storage
By Jim McGann, VP Information Discovery, Index Engines
This is a Press Release edited by StorageNewsletter.com on December 28, 2011 at 2:33 pm. This is a white paper written by Jim McGann, VP Information Discovery, Index Engines, Inc.
D2D storage systems are designed to cycle through backups according to disaster recovery rotation policies. Old backup images are cycled out as new backups occur and take their place.
However, the cycle is often interrupted by legal hold and compliance requests that require backup data to be retained for a specified period of time. As a result, large volumes of user data are retained, and backup storage devices quickly become full and difficult to manage.
Legal, compliance and regulatory requests typically involve preserving specific users' email within a defined date range. Managing these requests requires securing individual mailboxes and expiring them based on policy. However, when backup data is used to satisfy these requests, individual mailboxes cannot be accessed, so far more data is preserved than is required. As a result, D2D backup storage quickly reaches capacity, requiring unplanned expansion and expensive upgrades.
Intelligent Access to Backup Data
The challenge in working with backup data is that little is known about its contents. Backup catalogs provide some detail; however, when you are required to provide access to, and preservation of, specific user mailboxes, the only solution is restoration of the entire email database. As a result, preservation requests commonly involve holding and securing the entire backup image or the complete email database. This process preserves significantly more data than is required, quickly filling up storage devices.
Direct backup image access, a process patented by Index Engines, delivers the level of knowledge required to streamline access to and management of backup data on disk or tape. Using direct access, backup images are scanned and indexed regardless of the format (TSM, NetBackup, NetWorker, etc.). The contents are fully processed, including all email databases (EDB, PST, NSF, etc.) and user content, so specific user email or files can be searched for and found. When a preservation order is issued, finding individual content or entire mailboxes is a simple search. All content and metadata can be queried, including date ranges, to/from/cc/bcc, owners, locations and more. This allows hold requests to be specific, narrowing the data to be preserved to a small subset of the backup image.
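To make the idea of a narrowed hold request concrete, here is a minimal sketch of a metadata query over an index of emails found in a backup image. It is an illustration only and not the Index Engines product or its API: the `IndexedEmail` record, the `find_for_hold` function, and the sample addresses are all hypothetical, and a real index would be built by scanning the backup image rather than by hand.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexedEmail:
    # Hypothetical record for one email discovered while indexing a backup image.
    mailbox: str        # owner of the mailbox the message was found in
    sender: str         # "from" address
    recipients: tuple   # to/cc/bcc addresses combined
    sent: date
    size_bytes: int


def find_for_hold(index, custodians, start, end):
    """Return emails owned by, sent by, or sent to a custodian within [start, end]."""
    wanted = {c.lower() for c in custodians}
    hits = []
    for msg in index:
        if not (start <= msg.sent <= end):
            continue  # outside the requested date range
        parties = {msg.mailbox.lower(), msg.sender.lower()}
        parties.update(r.lower() for r in msg.recipients)
        if parties & wanted:
            hits.append(msg)
    return hits


# Made-up sample index standing in for the contents of one backup image.
index = [
    IndexedEmail("alice@example.com", "alice@example.com", ("bob@example.com",), date(2010, 3, 1), 40_000),
    IndexedEmail("bob@example.com", "carol@example.com", ("bob@example.com",), date(2010, 6, 15), 25_000),
    IndexedEmail("carol@example.com", "carol@example.com", ("dave@example.com",), date(2010, 4, 2), 10_000),
    IndexedEmail("alice@example.com", "alice@example.com", ("dave@example.com",), date(2011, 1, 5), 5_000),
]

hits = find_for_hold(index, ["alice@example.com"], date(2010, 1, 1), date(2010, 12, 31))
held_bytes = sum(m.size_bytes for m in hits)
total_bytes = sum(m.size_bytes for m in index)
print(f"holding {len(hits)} of {len(index)} messages, {held_bytes} of {total_bytes} bytes")
```

In this toy data set the hold covers one message rather than the whole image, which is the effect the text describes: the request is scoped by custodian and date range instead of by backup image.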
Once the data is found, direct access technology extracts it from the backup image, without the need for the original software. So if you have an email in an Exchange database backed up using NetBackup, you will not need access to NetBackup or MS Exchange in order to extract user data from tape. This same process can be applied to legacy backups that may no longer be in production.
Using this new process, legal hold and preservation requests can be narrowed down to a set of relevant data, and once the preservation request is lifted, the data can be expired. The end result is lower long-term storage costs and D2D backup systems that are not filled with legal data that can never be expired.
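The hold-then-expire lifecycle described above can be sketched as a small bookkeeping structure. This is a hypothetical model, not the vendor's implementation: the `LegalArchive` class, its method names, and the hold and item identifiers are assumptions made for illustration. The one design point it shows is that an item covered by several holds should only become eligible for expiration when the last hold is lifted.

```python
from collections import defaultdict


class LegalArchive:
    """Hypothetical archive that tracks which holds cover each extracted item."""

    def __init__(self):
        # item id -> set of hold ids currently covering that item
        self._holds_by_item = defaultdict(set)

    def preserve(self, hold_id, item_ids):
        """Place the given extracted items under a hold."""
        for item_id in item_ids:
            self._holds_by_item[item_id].add(hold_id)

    def release(self, hold_id):
        """Lift a hold and return the items that no remaining hold covers."""
        expired = []
        for item_id in list(self._holds_by_item):
            holds = self._holds_by_item[item_id]
            holds.discard(hold_id)
            if not holds:
                expired.append(item_id)
                del self._holds_by_item[item_id]
        return sorted(expired)

    def held_items(self):
        return sorted(self._holds_by_item)


archive = LegalArchive()
archive.preserve("case-A", ["msg-1", "msg-2"])
archive.preserve("case-B", ["msg-2", "msg-3"])

expired_after_a = archive.release("case-A")  # msg-2 is still needed for case-B
remaining = archive.held_items()
print(expired_after_a, remaining)
```

Because the held items live in a separate archive in this sketch, releasing a hold never touches the backup rotation itself.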
Policy Based Backup Data Management
Detailed knowledge of backup data is valuable for supporting legal preservation and hold requests. With deep knowledge of the content, any file or email can be found and produced. Responding to these requests then means saving only the required data and not extraneous content. Data is extracted in its native format: Word documents, Excel spreadsheets, and email in msg or eml formats. This content can be stored in a legal archive rather than on the backup storage device, which leaves the backup device to manage backups and not legal holds.
Index Engines, Inc.'s ability to directly search and access data in backup images (tape and disk) can also be applied to managing ongoing corporate policies. An organization's retention policies for user data can be defined as requests and processed against the current backup content. The resulting data is then extracted from the backup image and archived for long-term preservation. Leveraging the backup process to populate an archive that supports corporate policies streamlines information management. Corporate legal can define and modify policies, and the automated search and extraction process will use backup data to satisfy these requests. Legal hold and preservation according to policy becomes an automated process based on backup procedures.
Streamline Backup
An additional benefit of direct access to, and knowledge of, backup data is the ability to streamline the backup process. Profiling the data that is backed up delivers knowledge of files and email that supports decisions to simplify the backup process. Reports include which users are backing up the largest volume of data; the age of the content, including what percentage of files have not been accessed or modified in x years; the volume of data by server or file type; and the percentage of duplicate content.
Based on this knowledge, network servers can be cleaned and managed efficiently. Unnecessary user data can be archived offline. Redundant content can be eliminated. A large percentage of what is backed up on a daily basis can be better managed or eliminated, based on knowledge of the content. Streamlining the backup process will save time, money and long-term storage costs.
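As a rough illustration of the profiling reports described above, the sketch below computes volume per user, the percentage of stale files, and the percentage of duplicate content from a list of file records. Everything here is assumed for the example: the `BackedUpFile` record, the `profile` function, the use of last-modified dates only (the text also mentions access dates), the approximation of a year as 365 days, and the treatment of files with identical content hashes as duplicates.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class BackedUpFile:
    # Hypothetical record for one file found in a backup image.
    owner: str
    size_bytes: int
    last_modified: date
    content_hash: str  # identical hashes are treated as duplicate content


def profile(files, today, stale_years):
    """Summarize volume per owner, stale percentage, and duplicate percentage."""
    bytes_by_owner = Counter()
    for f in files:
        bytes_by_owner[f.owner] += f.size_bytes

    # Approximation: one year is counted as 365 days.
    cutoff = today - timedelta(days=365 * stale_years)
    stale = sum(1 for f in files if f.last_modified < cutoff)

    # Every copy beyond the first with the same hash counts as a duplicate.
    hash_counts = Counter(f.content_hash for f in files)
    duplicates = sum(n - 1 for n in hash_counts.values())

    total = len(files)
    return {
        "bytes_by_owner": bytes_by_owner.most_common(),  # largest first
        "stale_pct": 100.0 * stale / total if total else 0.0,
        "duplicate_pct": 100.0 * duplicates / total if total else 0.0,
    }


# Made-up sample data.
files = [
    BackedUpFile("alice", 500, date(2005, 1, 1), "h1"),
    BackedUpFile("alice", 300, date(2011, 6, 1), "h2"),
    BackedUpFile("bob", 200, date(2004, 5, 5), "h1"),
    BackedUpFile("bob", 100, date(2011, 11, 1), "h3"),
]

report = profile(files, today=date(2011, 12, 28), stale_years=3)
print(report)
```

A report of this kind is what would drive the cleanup decisions: the largest owners and the stale and duplicate percentages indicate which data could be archived offline or removed before the next backup cycle.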
Conclusion
As user data moves into backup images, it becomes difficult to manage. This challenge has resulted in increased storage costs and extended backup windows, due to insufficient knowledge of the content. With detailed knowledge, data can be managed more effectively. Legal hold and preservation requests can be satisfied quickly and efficiently. Data on hold can be moved to an archive and then expired when appropriate. Backup storage would no longer be used to manage legal hold requests inefficiently.
Smarter access to and management of user data will help reduce storage costs. Only direct indexing of the backup content delivers the freedom to make these decisions. Saving only what is required and expiring what is not will save storage and improve responsiveness to the legal and compliance teams. Using direct backup image access from Index Engines, the life of your D2D backup system can be extended and support for legal and compliance policies can be simplified.