Data Deduplication

What Does Data Deduplication Mean?

Data deduplication is a data compression technique in which redundant or repeated copies of data are removed from a system. It is commonly used in data backup and network data transfer mechanisms, where it allows a single unique instance of each piece of data to be stored within a database or information system (IS).


Data deduplication is also known as intelligent compression, single instance storage, commonality factoring or data reduction.

Techopedia Explains Data Deduplication

Data deduplication works by analyzing incoming data segments and comparing them with previously stored data. If a segment is already present, the deduplication algorithm discards the new copy and creates a reference to the existing one. For example, if a document file is backed up after being changed, only the changes are stored alongside the previously saved file. If nothing has changed, the newer file is discarded and a reference to the stored copy is created. Similarly, a data deduplication algorithm can scan outgoing data on a network connection for duplicates, which are removed to reduce the amount of data sent and speed up the transfer.
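The segment comparison described above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the class name DedupStore, the fixed four-byte segment size and the use of SHA-256 digests to recognize duplicate segments are all assumptions chosen to keep the example short. Real systems typically use much larger segments and may split data in other ways.

```python
import hashlib


class DedupStore:
    """Toy segment store (hypothetical): each unique segment is kept once."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size  # assumed fixed-size segments
        self.segments = {}  # digest -> segment bytes (stored once)
        self.files = {}     # file name -> ordered list of digests (references)

    def put(self, name, data):
        """Store data under name; return how many new segments were written."""
        refs = []
        new_segments = 0
        for i in range(0, len(data), self.segment_size):
            segment = data[i:i + self.segment_size]
            digest = hashlib.sha256(segment).hexdigest()
            if digest not in self.segments:
                # Segment not seen before: store the actual bytes.
                self.segments[digest] = segment
                new_segments += 1
            # Seen or not, the file only records a reference.
            refs.append(digest)
        self.files[name] = refs
        return new_segments

    def get(self, name):
        """Rebuild a file by following its references."""
        return b"".join(self.segments[d] for d in self.files[name])


store = DedupStore(segment_size=4)
first = store.put("report_v1", b"AAAABBBBCCCC")   # 3 new segments stored
second = store.put("report_v2", b"AAAABBBBDDDD")  # only the changed segment is new
third = store.put("report_v3", b"AAAABBBBDDDD")   # identical backup: nothing new
```

In this sketch the second backup adds only the one segment that changed, and the third, identical backup adds none; both are recorded purely as references to segments that are already stored.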


Margaret Rouse

Margaret Rouse is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical, business audience. Over the past twenty years her explanations have appeared on TechTarget websites and she's been cited as an authority in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine and Discovery Magazine. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages. If you have a suggestion for a new definition or how to improve a technical explanation, please email Margaret or contact her…