Data deduplication is a data compression technique that removes redundant or repeated copies of data from a system. It is used in data backup and network data transfer mechanisms, and it allows a database or information system (IS) to store only one unique instance of each piece of data.
Data deduplication is also known as intelligent compression, single instance storage, commonality factoring or data reduction.
Data deduplication works by analyzing incoming data segments and comparing them with previously stored data. If the data is already present, the deduplication algorithm discards the new copy and creates a reference to the stored one. For example, if a document file is backed up with changes, the previously stored file is kept and only the applied changes are added as new data segments. However, if there is no difference, the newer data file is discarded and a reference to the existing copy is created. Similarly, a data deduplication algorithm can scan outgoing data on a network connection for duplicates, which are removed to increase data transfer speed.
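The compare-then-reference behavior described above can be illustrated with a minimal Python sketch. It is not how any particular product works: the class name `DedupStore`, the fixed chunk size, and the use of SHA-256 digests to detect duplicates are all assumptions chosen for illustration. Real systems often use variable-size chunking and persistent indexes.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps each unique chunk only once.

    Hypothetical example: fixed-size chunks and SHA-256 digests are
    illustrative assumptions, not a description of a real product.
    """

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (the single stored instance)
        self.files = {}   # file name -> ordered list of digests (references)

    def put(self, name, data):
        """Store data under name; return how many new chunks were written."""
        refs = []
        new_chunks = 0
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Unseen segment: store it once.
                self.chunks[digest] = chunk
                new_chunks += 1
            # Seen or not, the file only records a reference to the chunk.
            refs.append(digest)
        self.files[name] = refs
        return new_chunks

    def get(self, name):
        """Rebuild the original data by following the stored references."""
        return b"".join(self.chunks[d] for d in self.files[name])


store = DedupStore(chunk_size=4)
print(store.put("backup_v1", b"AAAABBBBCCCC"))  # 3: every chunk is new
print(store.put("backup_v2", b"AAAABBBBDDDD"))  # 1: only the changed chunk is stored
print(store.put("backup_v3", b"AAAABBBBDDDD"))  # 0: identical data, references only
print(len(store.chunks))                        # 4 unique chunks for 3 backups
```

In this sketch, the second backup adds only the one segment that changed, and the third backup, which is identical to the second, adds nothing but references. Each backup can still be rebuilt in full with `get`.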