
False Perceptions About Source-Based De-Dupe During Backup – Synerway

Other block mode methods recommended

We have received this article from Synerway (groupe Resadia), which has long not been an advocate of de-dupe for backup.

The marketing and sales teams of some major American backup providers have latched on to deduplication as the new universal cure for all ills.

Deduplication is the must-have component of any backup solution.

Far too often, one of the first questions a potential client asks a backup vendor is, “Does your product do deduplication?” And it’s hard luck if the answer is, “No, we don’t”. Or the tender states clearly that deduplication is mandatory and that any candidate without it will be eliminated!

But since when has deduplication been a prerequisite for backing up data?

Are the real objectives of an IT department searching for a backup solution not the following?

  • Guarantee backups and restorations in all circumstances
  • A solution which is reliable and which requires the least possible administration
  • Anticipate growing data volumes
  • The lowest possible TCO
  • An RPO and RTO as close as possible to zero

There are two apparent advantages to a source-based deduplication block-mode backup solution:

  • The first is storage economy and therefore disk space economy. This is true when compared to a file-mode backup solution but not at all when compared to a non-deduplicated block-mode solution. What’s more, any advantage needs to be tempered given that disks are becoming ever cheaper and more efficient.
  • The second advantage comes from reduced bandwidth requirements since only deduplicated modified blocks are transferred over the WAN.

But yet again, we need to water down any over-enthusiasm because of the amount of round-trip activity between backup servers and protected servers due to the complex management of modified block signatures. Remember that signatures are essential to restore any data.
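To make the round trips concrete, here is a minimal sketch of a source-based deduplicating backup. It is an illustration under stated assumptions, not any vendor's implementation: the fixed 4 KiB block size, the use of SHA-256 as the signature, and the names BackupServer, missing, upload and backup are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the protected data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupServer:
    """Hypothetical backup server: a block store plus a signature catalogue."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}           # signature -> block contents
        self.catalogue: dict[str, list[str]] = {}   # backup id -> ordered signatures
        self.round_trips = 0                        # exchanges with the protected server

    def missing(self, signatures: list[str]) -> set[str]:
        """First exchange: tell the source which signatures are unknown."""
        self.round_trips += 1
        return {s for s in signatures if s not in self.store}

    def upload(self, backup_id: str, signatures: list[str],
               blocks: dict[str, bytes]) -> None:
        """Second exchange: receive new blocks and record the signature list."""
        self.round_trips += 1
        self.store.update(blocks)
        self.catalogue[backup_id] = signatures

    def restore(self, backup_id: str) -> bytes:
        """Restoration needs the catalogue to know which blocks, in which order."""
        return b"".join(self.store[s] for s in self.catalogue[backup_id])


def backup(server: BackupServer, backup_id: str, data: bytes) -> int:
    """Run one backup from the source side; return the block bytes sent."""
    blocks = split_blocks(data)
    signatures = [hashlib.sha256(b).hexdigest() for b in blocks]  # CPU cost at the source
    wanted = server.missing(signatures)
    payload = {s: b for s, b in zip(signatures, blocks) if s in wanted}
    server.upload(backup_id, signatures, payload)
    return sum(len(b) for b in payload.values())
```

In this sketch every backup costs at least two exchanges and a full pass of signature calculation on the protected server, even when only one block has changed, and the restore path cannot work without the signature list held in the catalogue.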

But these two apparent advantages come with several distinct drawbacks:

  • Signature management uses a lot of backup server resources. This prevents backups from starting when servers are very busy and can mean that backups can only be run during the night.
  • The backup server database which we call the catalogue becomes potentially a weak point because it is essential to restore data from the small blocks of deduplicated data.
  • The backup catalogue grows very rapidly because of all the signatures that are attached to each modified block. And because the catalogue grows with each backup, the backup server’s performance will also decline over time.
  • For the same reason, the reliability of the catalogue will also deteriorate with increased risk of “catalogue rupture” with the irremediable loss of data already backed up on disk.
  • The externalisation of backed-up data to external media such as LTO tape, which may be essential to keep backup histories over several years, becomes very laborious, highly risky or even impossible. How can we reasonably guarantee restoration to a coherent state from blocks of deduplicated data and thousands and thousands of signatures?
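The catalogue-related drawbacks above can be illustrated with a small sketch. It assumes a simplified model in which the catalogue stores one ordered list of block signatures per backup; the 4 KiB block size, SHA-256 signatures and the function names record_backup, catalogue_entries and restore are hypothetical and chosen only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def record_backup(backup_id: str, data: bytes,
                  catalogue: dict[str, list[str]],
                  store: dict[str, bytes]) -> None:
    """Store each unique block once and record the backup's signature list."""
    signatures = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        signature = hashlib.sha256(block).hexdigest()
        store.setdefault(signature, block)   # block data is deduplicated
        signatures.append(signature)         # catalogue entries are not
    catalogue[backup_id] = signatures


def catalogue_entries(catalogue: dict[str, list[str]]) -> int:
    """Total number of signature entries the catalogue has to manage."""
    return sum(len(signatures) for signatures in catalogue.values())


def restore(backup_id: str,
            catalogue: dict[str, list[str]],
            store: dict[str, bytes]) -> bytes:
    """Rebuild a backup; impossible if its catalogue entry is lost."""
    if backup_id not in catalogue:
        raise LookupError(f"no catalogue entry for backup {backup_id!r}")
    return b"".join(store[s] for s in catalogue[backup_id])
```

Under these assumptions, three backups of an unchanged eight-block volume keep only eight blocks on disk but leave 24 signature entries in the catalogue. If the catalogue is lost, the blocks are still on disk, yet nothing records which blocks belong to which backup or in what order.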

To conclude, there are many preconceived ideas around source-based data deduplication:

  • Deduplication accelerates backup: false. Signature calculations slow down backups.
  • Deduplication reduces resources required by other servers: false. The opposite is true; more CPU is required to protect machines (disk searches, signature and block creation and management).
  • Deduplication greatly reduces bandwidth requirements: false. Deduplication requires many round trips of data checking between protected servers and backup servers, which can in some cases actually increase bandwidth use.
  • Deduplication greatly reduces backed up storage: false. Compared to standard block mode, the reduction of data for a protected server is typically less than 5%.
  • Deduplication guarantees the best RPO: false. The heavy processing requirements of deduplication require backups to be spaced out.

This is why Synerway recommends other block mode methods such as the CDP-DiskSafe agent which uses disk mirroring technology. Protected disks are mirrored to an appliance and changed blocks flushed over in real time or periodically. No catalogue, very low CPU, full externalisation capacity and no data loss.
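The general idea of mirroring with changed-block flushing can be sketched as follows. This is not the CDP-DiskSafe implementation, whose internals are not described here; the class MirroredDisk, its methods and the 4 KiB block size are hypothetical and only illustrate tracking changed blocks by position rather than by signature.

```python
BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


class MirroredDisk:
    """Hypothetical protected disk with a block-for-block mirror."""

    def __init__(self, num_blocks: int) -> None:
        self.primary = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.mirror = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.dirty: set[int] = set()  # indices of blocks changed since last flush

    def write(self, index: int, data: bytes) -> None:
        """Write one block locally and remember that it changed."""
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self.primary[index] = data
        self.dirty.add(index)

    def flush(self) -> int:
        """Copy changed blocks to the mirror; return the bytes transferred."""
        sent = 0
        for index in sorted(self.dirty):
            self.mirror[index] = self.primary[index]
            sent += BLOCK_SIZE
        self.dirty.clear()
        return sent
```

In this sketch a block written several times between two flushes is transferred once, no hashing is performed, and the mirror is restorable by position alone, without any signature catalogue. Calling flush after every write approximates real-time mirroring; calling it on a timer approximates periodic flushing.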
