Most people readily associate compression with deduplication. After all, the two share two goals:
1. To reduce the bandwidth used between the client and the backup server
2. To reduce the storage space used once the data resides on the backup server’s storage volume
Deduplication and compression work alongside each other toward the same goals, but they are independent of one another. Compression does not affect your deduplication ratios, and deduplication does not affect your compression ratios. Backup providers will often combine both ratios into an overall “benefit ratio” for your convenience.
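As a rough illustration of how the two ratios might be combined, the sketch below multiplies them. Multiplication is an assumption that follows from treating the two ratios as independent; it is not a documented formula from any particular backup provider, and the function name `combined_ratio` is hypothetical.

```python
def combined_ratio(dedup_ratio: float, compression_ratio: float) -> float:
    """Overall protected-to-stored ratio, ASSUMING the two ratios are
    independent and therefore multiply (an assumption, not a vendor formula)."""
    return dedup_ratio * compression_ratio


# Worked example: 1000 GB protected, 4:1 deduplication, 2:1 compression.
protected_gb = 1000.0
after_dedup_gb = protected_gb / 4.0      # 250 GB of unique blocks
stored_gb = after_dedup_gb / 2.0         # 125 GB after compression
overall = combined_ratio(4.0, 2.0)       # 8.0, i.e. 8:1
```

Under that assumption, 1000 GB of protected data would occupy 125 GB on the storage volume, which matches the 8:1 overall ratio.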
Encryption and deduplication, on the other hand, share an even tighter bond, one that directly affects your deduplication ratio of protected to stored data. To understand this bond, one must first understand how deduplication works in general.
Deduplication works by computing a hash (or checksum) of each block of data being backed up, before encryption and compression are performed. To check whether a block already exists on the backup server, a lookup is done to see whether its hash already exists on the storage server. If it does, the block does not need to be re-transferred or re-stored to disk; a reference to the existing data is created instead.
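The lookup described above can be sketched in a few lines. This is a minimal in-memory model, not ROBOBAK's implementation: the `BlockStore` class, the `backup` function, the tiny 4-byte block size, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List, Tuple


class BlockStore:
    """Hypothetical storage server: keeps one copy of each distinct block."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}      # hash -> block data
        self.ref_counts: Dict[str, int] = {}    # hash -> number of references

    def put(self, block: bytes) -> Tuple[str, bool]:
        """Store a block; return (hash, True) if it was new, (hash, False) if
        only a reference was added. A real client would send the hash first
        and transfer the data only when the hash is unknown."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.blocks:
            self.ref_counts[digest] += 1
            return digest, False
        self.blocks[digest] = block
        self.ref_counts[digest] = 1
        return digest, True


def backup(data: bytes, store: BlockStore, block_size: int = 4) -> Tuple[List[str], int]:
    """Split data into fixed-size blocks and store each one.
    Returns the list of block hashes and the number of bytes actually sent."""
    manifest: List[str] = []
    transferred = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest, is_new = store.put(block)
        manifest.append(digest)
        if is_new:
            transferred += len(block)
    return manifest, transferred


store = BlockStore()
data = b"AAAABBBBAAAACCCC"          # the block "AAAA" appears twice
manifest, sent = backup(data, store)
restored = b"".join(store.blocks[d] for d in manifest)
```

Here 16 bytes are protected but only 12 are transferred and stored, because the second `AAAA` block becomes a reference to the first. The list of hashes is enough to rebuild the original data.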
Each company that backs up has access to only the files it backs up; however, some companies require extra security. These companies want to hold the encryption key themselves and never transmit it from their computers to the backup server. The files on the backup server’s storage disk must then be encrypted with an encryption key that only that particular customer knows. Since the storage server does not know the encryption key, it cannot deduplicate those blocks of data across customers.
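To see why customer-held keys defeat cross-customer deduplication, consider the sketch below. It uses a toy XOR stream cipher built from SHA-256 purely for illustration; it is NOT secure and is not the cipher any backup product uses. The function `toy_encrypt` and the key values are hypothetical.

```python
import hashlib


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Toy XOR stream cipher for illustration only; NOT secure.
    Applying it twice with the same key returns the original block."""
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(block, stream))


# Two customers back up the very same block under their own private keys.
block = b"identical payroll spreadsheet block"
cipher_a = toy_encrypt(block, b"customer-A-secret")
cipher_b = toy_encrypt(block, b"customer-B-secret")

# All the server can hash is what it receives: the ciphertext.
server_view_a = hashlib.sha256(cipher_a).hexdigest()
server_view_b = hashlib.sha256(cipher_b).hexdigest()
same_on_server = server_view_a == server_view_b   # False: no match to deduplicate
```

The plaintext is identical, but the server never sees it, so nothing it can compare reveals that the two blocks are duplicates.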
ROBOBAK by default allows vault-wide deduplication, which means that your deduplication ratios of protected vs. stored data can be huge. We accomplish this by encrypting each block of data being backed up with its own unique, randomly generated encryption key, and then associating each user's access with the unique keys of the blocks they back up.
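One way to picture the key bookkeeping is the sketch below. It is a guess at the general shape of such a scheme, not a description of ROBOBAK's internals: the `Vault` class, its two dictionaries, and the 32-byte key length are all assumptions, and the encryption step itself is left out.

```python
import hashlib
import secrets
from typing import Dict, Set


class Vault:
    """Hypothetical vault: one random key per distinct block, plus a record
    of which users have been granted access to which blocks."""

    def __init__(self) -> None:
        self.block_keys: Dict[str, bytes] = {}   # block hash -> per-block key
        self.grants: Dict[str, Set[str]] = {}    # user -> block hashes they may read

    def backup_block(self, user: str, block: bytes) -> bool:
        """Record a block for a user; return True only if the block was new."""
        digest = hashlib.sha256(block).hexdigest()
        is_new = digest not in self.block_keys
        if is_new:
            # New block: generate a fresh random key for it (assumed 32 bytes).
            self.block_keys[digest] = secrets.token_bytes(32)
        # New or not, this user gains access to the block's key.
        self.grants.setdefault(user, set()).add(digest)
        return is_new

    def key_for(self, user: str, digest: str) -> bytes:
        """Hand out a block key only to users who backed that block up."""
        if digest not in self.grants.get(user, set()):
            raise PermissionError("user has no access to this block")
        return self.block_keys[digest]


vault = Vault()
shared = b"block that two customers both have"
first_was_new = vault.backup_block("alice", shared)    # True: stored once
second_was_new = vault.backup_block("bob", shared)     # False: deduplicated
digest = hashlib.sha256(shared).hexdigest()
```

Because the key belongs to the block rather than to a customer, the second backup of the same block adds only a grant, which is what makes deduplication across the whole vault possible in this model.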
ROBOBAK also recognizes that some companies require that the encryption key used to encrypt the files being backed up be known only to them. This gives a customer extra peace of mind because, by definition, only they can restore from these files: the backup server does not even know what the encryption key is.
In the backup client's preferences, a user can change the default to use a customer and user encryption key known only to them. When these encryption keys are used, though, deduplication ratios will not be vault-wide. ROBOBAK also lets you allow or disallow subsets of this functionality and set defaults via its free customization kit utility.