c++huffman-codeprefix-tree

How to generate Huffman prefix code?


I'm working on a c++ program and my group and I cannot figure out the logic on how to generate the prefix code using an inOrder traversal. We have the PrefixCode Tree built and we want to generate the codes from that.

Since we are having issues with the logic I don't know if I need to provide any code. If you need any let me know.

Thanks.


Solution

  • As commented, I am not sure of understanding. Buy I will try ...

    First, I assume that you have already built correctly the tree.

    Second, each leaf contains a symbol to be be encoded. The code of each symbol is defined by the path from the root to the leaf. Initially, when you are at the root, the code is empty. If you go to the left, then you concatenate a 0 to the code. Symmetrically, if you go to the right, then you concatenate 1.

    Third, I assume that you want the list of pairs <symbol, code>. symbol is a character and code is a string containing the prefix code. The result is the list of all the symbols represented by the tree along with their codes.

    Now, an algorithm for producing that list would use a stack for storing the prefix from the root to the current node. Each time when you arrive to a leaf, then the code is into the stack, but inversed. Such algorithm could be as follows:

    void codes(Node * root, stack<char> & s)
    {
      if (is_leaf(root))
        {
          auto symbol = // symbol stored in the node
          string code = // the reversed stack content; that is the code of current leaf
          cout << symbol << " = " << code << endl;
        }
    
      s.push('0');
      codes(LLINK(root), s);
      s.pop();
    
      s.push('1');
      codes(RLINK(root), s);
      s.pop();
    }
    

    I propose a stack because is the "natural" data structure for storing the partial path from the root to a node. However, maybe it would be better a vector<char>. In this case you should replace push() by push_back() and pop() by pop_back(). But the advantage would be that you can build the code with a reverse iterator.

    Also, instead of instruction cout ..., you could use a list or a vector and insert in it the pair. After, if you wish, you could sort it. Alternatively, you could use a "naturally sorted" data structure such a binary search tree and you avoid to sort. All this of course if you interest is to produce a sorted by symbol output.

    I hope it has been helpful