Huffman Coding
What is huffman coding?
- Huffman coding is a method of lossless compression primarily used on text based data (documents)
- A huffman coding tree is used to compress the data whilst keeping all the data so that it can be uncompressed back to its original state
Example (step 1)
- A text file contains the following characters
Text file | |||||||||||||||
I | L | O | V | E | C | O | M | P | U | T | E | R | S |
- The text file contain 16 characters including spaces
- A frequency analysis shows there is some repetition in the characters, for example there are 2 x E's
Frequency | ||||||||||||
I | space | L | O | V | E | C | M | P | U | T | R | S |
1 | 2 | 1 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |