Huffman was a guy who realized that letters do not occur statisticly randomly; 'e' is used a lot more than 'j'. He figured you could build a balanced tree of letters (or anything else that occurs non-randomly, like words) and using that tree, represent letters in a minimal number of bits.
The original Huffman went like this:
1) count the occurances of the letters in your text to be compressed
2) build a binary tree such that the shortest paths in the tree go to the most referred-to letters
3) encode the tree as bits, where a 0 means go LEFT on the tree and 1 means go RIGHT.
Later variants allowed the tree to be built dynamically as you see the charactors, as in processing a stream of bytes.
Huffman is used as the post-processing in many compressors, like gzip (which first removes blocks of common letters and then Huffmanizes the result) and jpeg (which converts pixels into cosine values and Huffmanizes the resulting (nearly the same) values).
Google will get you to some sample code easily. Now that you know the basics, the code should be pretty readable.
thanks a lot !!!!
but i have a problem with the code i have counted the frequency but i am not able to build the heap which i have to build to finally make a tree.
help me here!!!!
No one has posted to this discussion for at least three months. Please let old threads die and do not reply to them unless you feel you have something new and valuable to contribute that absolutely must be added to make the discussion complete. Otherwise, please start a new thread in this forum instead.
This thread is currently closed and is not accepting any new replies.