Huffman coding is a lossless data compression algorithm. It assigns each symbol in a data set a variable-length binary code based on how often that symbol occurs: frequent symbols get short codes, and rare symbols get longer ones. The scheme starts by sorting the symbols in order of their frequency, then builds a binary "Huffman tree" from the bottom up. At each step, the two entries with the lowest frequencies are removed from the sorted list and joined under a new branch node whose frequency is their sum; that sum then replaces the two removed entries in the list. The process repeats until only one node remains, which becomes the root of the tree. Each symbol's code is then read by walking from the root down to that symbol's leaf, appending a 0 for every left branch taken and a 1 for every right branch. This is a method of reducing data into shorter binary sequences and is common in file and video compression.
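The merging process described above can be sketched in a few lines of Python. This is a minimal illustration, not a production encoder: the function name `huffman_codes` and the choice to label left branches "0" and right branches "1" are conventions assumed for the example.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table for the characters in `text`.

    Repeatedly merges the two lowest-frequency subtrees, then reads
    each code by walking the finished tree: "0" for a left branch,
    "1" for a right branch.
    """
    freq = Counter(text)
    # Heap entries are (frequency, tiebreaker, tree); a tree is either
    # a single character (a leaf) or a (left, right) pair (a branch).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {heap[0][2]: "0"}
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # lowest frequency
        f2, _, right = heapq.heappop(heap)  # second lowest
        # The sum replaces the two merged entries in the "list".
        heapq.heappush(heap, (f1 + f2, tiebreak, (left, right)))
        tiebreak += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # left branch
            walk(node[1], prefix + "1")  # right branch
        else:
            codes[node] = prefix         # leaf: record the code
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("abracadabra")
# 'a' occurs most often (5 times), so its code is never longer
# than any other symbol's code.
assert all(len(codes["a"]) <= len(c) for c in codes.values())
```

Because every symbol sits at a leaf, no code is a prefix of another, which is what lets the decoder split a compressed bit stream back into symbols without any separators.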
Data compression has a history that predates physical computing. Morse code, for example, compresses information by assigning shorter codes to characters that are statistically common in the English language (such as the letters "e" and "t"). Huffman coding came about as the result of a class project at MIT by David Huffman, then a student there.
In 1951, Huffman was taking a class taught by Robert Fano, who (with the help of the engineer and mathematician Claude Shannon) had invented an earlier efficiency scheme known as Shannon-Fano coding. When Fano gave his class the choice between writing a term paper and taking a final exam, Huffman chose the term paper, whose assigned problem was to find an efficient binary coding method. The result was Huffman coding, which by the 1970s had become a prominent digital encoding algorithm.