When most people think of cloud storage, they imagine an infinite horizon of bits and bytes, a vast, limitless space in which to store their data. But cloud storage is not always the panacea that people imagine. For applications with heavy disk I/O or large data transfers, cloud storage can become a significant cost and a limiting factor.

There is, however, a way to reduce heavy I/O and unnecessary bandwidth use with cloud storage: data de-duplication, the process of eliminating duplicate data before it is transferred. The technology works by comparing the data or files at the source against what has already been sent and identifying what has changed. By employing data de-duplication, companies can significantly reduce the cost of application storage in the cloud in two ways: first, by transferring only files that have changed, which saves on bandwidth costs; second, by storing only what has changed, which reduces I/O on the storage cluster.
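As a rough illustration of the file-level comparison described above, the sketch below decides which files need to be sent by comparing a hash of each local file with the hash recorded for the copy already in the cloud. It is a minimal sketch under stated assumptions, not any vendor's implementation: the function names, the in-memory dictionaries, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from typing import Dict, List


def file_digest(data: bytes) -> str:
    """Return a SHA-256 fingerprint of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def files_to_transfer(local_files: Dict[str, bytes],
                      remote_digests: Dict[str, str]) -> List[str]:
    """List the files whose contents differ from what is already stored.

    local_files    maps file name -> current contents (hypothetical input).
    remote_digests maps file name -> digest of the stored copy (hypothetical).
    """
    changed = []
    for name, data in local_files.items():
        # A file is sent if it is new or its fingerprint no longer matches.
        if remote_digests.get(name) != file_digest(data):
            changed.append(name)
    return sorted(changed)


if __name__ == "__main__":
    local = {"a.txt": b"hello", "b.txt": b"world"}
    remote = {"a.txt": file_digest(b"hello"), "b.txt": file_digest(b"old")}
    print(files_to_transfer(local, remote))
```

Only files whose fingerprint differs are transferred, which is where the bandwidth and storage I/O savings come from.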

What is Source Based Data De-Duplication?


Yet not all data de-duplication solutions are created equal. In fact, in many cases the implementation of the technology is far from efficient. Transferring only the files that have changed brings some benefit, but when those files are very large, such as geographic survey images that may be hundreds of gigabytes in size, file-level de-duplication gains little, because even a small change forces the entire file to be re-sent.

That’s what makes a solution such as ROBOBAK so powerful. Rather than transferring a whole file whenever any part of it changes, ROBOBAK focuses on the actual blocks of data, merging the incremental blocks back into a file only at restore time. For example, suppose a 1MB file needs to be transferred because it has changed. ROBOBAK would divide the file into four ~250KB blocks and look for changes within each block, transferring only the blocks that have changed rather than the whole file. This is more efficient than file-level de-duplication and yields further savings on bandwidth and disk I/O. ![alt text][1]
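The 1MB example can be sketched in a few lines of code. This is a simplified illustration using fixed-size blocks and SHA-256 hashes, and it is an assumption on our part rather than a description of how ROBOBAK is actually implemented; the function names and the 256 KiB block size are hypothetical.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 256 * 1024  # assumed block size: a 1 MiB file yields four blocks


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Map block index -> block contents for every block that differs."""
    old_digests = block_digests(old, block_size)
    changed = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Send the block if it is new or no longer matches the stored one.
        if index >= len(old_digests) or old_digests[index] != digest:
            changed[index] = block
    return changed


def apply_blocks(old: bytes, changed: Dict[int, bytes], new_length: int,
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file at restore time from the old file plus changed blocks."""
    blocks = []
    for index, offset in enumerate(range(0, new_length, block_size)):
        if index in changed:
            blocks.append(changed[index])
        else:
            blocks.append(old[offset:offset + block_size])
    return b"".join(blocks)


if __name__ == "__main__":
    old = bytes(1024 * 1024)            # 1 MiB of zeros
    new = bytearray(old)
    new[300 * 1024] = 1                 # change one byte inside the second block
    delta = changed_blocks(old, bytes(new))
    print(sorted(delta), sum(len(b) for b in delta.values()))
    assert apply_blocks(old, delta, len(new)) == bytes(new)
```

In this run a single changed byte causes only one of the four blocks to be transferred, roughly a quarter of the file, and the full file is reassembled only when it is restored.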

Data de-duplication is a natural fit for companies that want to use the cloud for data storage, because it is designed to save bandwidth and storage by transferring and storing only the data that has changed. But the savings are only as good as the solution, and none is better than ROBOBAK: block-level data de-duplication technology for your business's cloud-based backup strategy.

