# Huffman Coding

http://www.cmlab.csie.ntu.edu.tw/~zho/

Information Theory and Coding Technique – Programming Homework I

Objective

Design the optimal Huffman table for the given data sources.
Implement an efficient Huffman encoder and decoder.

Requirement

• Build a Huffman table, and write each Symbol, Probability, and Codeword to a text file R96944043_HuffT.txt line by line:

      Symbol  Probability  Codeword
      0       0.222        0101
      ...     ...          ...
• Huffman encoder: an executable Win32 program that shows the compression ratio.

>R96944043_HEnc.exe [HuffmanTableFile] [InputFile] [OutputFile]

R96944043:Compression_Ratio: 1.05 %
• Huffman decoder: an executable Win32 program that shows the decompression speed.

>R96944043_HDec.exe [HuffmanTableFile] [InputFile] [OutputFile]

R96944043:Dec_Speed: 15 ms

Report

Basic idea: assign shorter codes to the more frequently occurring symbols.
Build the Huffman tree:

1. Order the symbols according to their probabilities.
2. Apply a contraction to the two symbols with the smallest probabilities: replace them with a hypothetical symbol whose probability is the sum of those two smallest probabilities.
3. Repeat from step 1 until the set contains only one member.
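The contraction steps above can be sketched with a priority queue. This is an illustrative Python sketch (not the actual homework implementation, which is a Win32 executable):

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict[int, str]:
    """Build a Huffman code table from raw bytes via repeated contraction."""
    freq = Counter(data)
    # Heap entry: (weight, unique tie-breaker, {symbol: codeword-so-far}).
    heap = [(n, sym, {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate source: a single symbol still needs one bit
        _, sym, _ = heap[0]
        return {sym: "0"}
    while len(heap) > 1:
        # Contract the two entries with the smallest weights.
        w1, t1, c1 = heapq.heappop(heap)
        w2, t2, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]
```

Symbols contracted earlier end up deeper in the tree, so rarer symbols receive longer codewords.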

Program: HuffmanTable.exe

The data sources are gray-scale image files, so the values they contain range from 0 to 255. To keep the Huffman table general, I use one byte as the symbol, so there are 256 codewords.
The table-building program reads a HuffmanTable.ini file to locate the data sources, calculates the probability of each symbol, builds the Huffman tree, and derives each codeword and its length.
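The average codeword length reported below is the probability-weighted code length. A small sketch (the helper name is mine; it assumes a `codes` map like the one the table builder writes out):

```python
from collections import Counter

def average_codeword_length(data: bytes, codes: dict[int, str]) -> float:
    """Expected bits per symbol: sum over symbols of p(s) * len(code(s))."""
    freq = Counter(data)
    total = len(data)
    return sum(n / total * len(codes[s]) for s, n in freq.items())
```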

Test Result:

| Test | Data source | Average Codeword Length | Codeword Number |
|------|-------------|-------------------------|-----------------|
| 1 | test1.bmp (469 KB) | 7.363922 | 256 |
| 2 | test2.bmp (469 KB) | 7.586709 | 256 |
| 3 | test1.bmp & test2.bmp | 7.698559 | 256 |

Program: R96944043_HEnc.exe

Test Result:

| Huffman Table (built from data source) | Input | Output | Compression Ratio |
|----------------------------------------|-------|--------|-------------------|
| test1.bmp | test1.bmp | test1_R96944043.huf | 108.637627 % |
| test1.bmp | test2.bmp | test2_R96944043.huf | 88.018486 % |
| test2.bmp | test1.bmp | test1_R96944043.huf | 96.541534 % |
| test2.bmp | test2.bmp | test2_R96944043.huf | 105.447510 % |
| test1.bmp & test2.bmp | test1.bmp | test1_R96944043.huf | 105.084961 % |
| test1.bmp & test2.bmp | test2.bmp | test2_R96944043.huf | 102.771591 % |

Because only one Huffman table can be used, I chose the table from test 3 (built from both images) as the final table.
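For illustration, the bit-packing and the ratio convention used in the table above (original size over encoded size, so values above 100 % mean actual compression) can be sketched as follows. The function names are my own, not the homework binaries, and a real encoder would also have to record the symbol count or padding length so the decoder knows where to stop:

```python
def huffman_encode(data: bytes, codes: dict[int, str]) -> bytes:
    """Concatenate codewords and pack them into bytes (zero-padded at the end)."""
    bits = "".join(codes[b] for b in data)
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))

def compression_ratio(original: bytes, encoded: bytes) -> float:
    """Ratio above 100 % means the encoded output is smaller than the input."""
    return 100.0 * len(original) / len(encoded)
```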

Program: R96944043_HDec.exe

I implement the algorithm of the paper "Memory efficient and high-speed search Huffman coding" to speed up decoding.

• Build a single-side growing Huffman tree (SGH-tree).
• For every n levels of the tree, build a lookup table (LUT); each LUT has 2^n rows.
• Fill each LUT row with the index of the corresponding tree node.
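A minimal sketch of the lookup-table idea, simplified from the SGH-tree scheme in the paper: the decoder consumes n bits at a time, and each LUT entry either yields a symbol together with how many bits its codeword actually used at that level, or points to a deeper LUT for longer codewords. The nested-dict representation and function names are illustrative assumptions, and a prefix-free code is assumed:

```python
def build_lut(codes: dict[int, str], n: int = 4) -> dict:
    """Nested lookup tables: each level resolves n code bits at once."""
    table: dict = {}
    for sym, code in codes.items():
        t = table
        while len(code) > n:              # descend one level per n code bits
            t = t.setdefault(code[:n], {})
            code = code[n:]
        pad = n - len(code)               # replicate short codes over the
        for i in range(2 ** pad):         # don't-care low bits of the chunk
            suffix = format(i, f"0{pad}b") if pad else ""
            t[code + suffix] = (sym, len(code))
    return table

def lut_decode(bits: str, table: dict, nsymbols: int, n: int = 4) -> list[int]:
    """Decode by n-bit LUT hops instead of bit-by-bit tree walking."""
    out, pos = [], 0
    while len(out) < nsymbols:
        t, p = table, pos
        while True:
            chunk = bits[p:p + n].ljust(n, "0")   # pad the tail with zeros
            entry = t[chunk]
            if isinstance(entry, dict):           # codeword continues deeper
                t, p = entry, p + n
            else:
                sym, used = entry                 # bits consumed at this level
                out.append(sym)
                pos = p + used
                break
    return out
```

Resolving n bits per step is what makes the 8-level LUT faster than the 4-level one in the measurements below, at the cost of larger tables.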

Test Result:

| Input | Output | Speed (SGH) | Speed (LUT 4 Level) | Speed (LUT 8 Level) |
|-------|--------|-------------|---------------------|---------------------|
| test1_R96944043.huf | test1.dec.bmp | 12.451862 ms | 7.612420 ms | 5.357385 ms |
| test2_R96944043.huf | test2.dec.bmp | 14.964751 ms | 8.689652 ms | 6.182909 ms |
| test1_R96944043.huf | test1.dec.bmp | 14.731202 ms | 8.691328 ms | 6.568712 ms |
| test2_R96944043.huf | test2.dec.bmp | 13.040205 ms | 8.123938 ms | 5.858007 ms |
| test1_R96944043.huf | test1.dec.bmp | 12.751062 ms | 7.680864 ms | 5.530312 ms |
| test2_R96944043.huf | test2.dec.bmp | 13.211456 ms | 8.298820 ms | 6.009144 ms |

I chose the 8-level LUT for my Huffman decoder, since it gives the best speed.

References

• R. Hashemian, "Memory efficient and high-speed search Huffman coding," IEEE Transactions on Communications, vol. 43, no. 10, pp. 2576-2581, October 1995.