javahuffman-codehuffman-tree

Huffman Coding Java


I have this question:

Consider a DMS with seven possible symbols Xi, i = 1, 2, ... , 7 and the corresponding probabilities p1 = 0.37, p2 = 0.33, p3 = 0.16, p4 = 0.07, p5 = 0.04, p6 = 0.02, and p7 = 0.01? We first arrange the probabilities in the decreasing order and then construct the Huffman tree as in Fig. 3.3.table answer

I am to write it in Java. I wrote the solution below for it.

However, it's not giving me the same answer as the table in the question. The output is:

1          0.37         1.4344          0
2          0.33         1.5995          11
3          0.16         2.6439          101
4          0.07         3.8365          1000      
5          0.04         4.6439          10011
6          0.02         5.6439          100101
7          0.01         6.6439          100100

However it should be this:

1          0.37         1.4344          0
2          0.33         1.5995          10
3          0.16         2.6439          110
4          0.07         3.8365          1110
5          0.04         4.6439          11110
6          0.02         5.6439          111110    
7          0.01         6.6439          111111

Why is it doing this? And how can I fix it?

Here is the code:

import java.util.*;

class HuffmanNode {
  double probability;
  int symbol;
  HuffmanNode left, right;

  HuffmanNode(double prob, int sym) {
    this.probability = prob;
    this.symbol = sym;
    left = right = null;
  }
}

class HuffmanCoding {
  public static void main(String[] args) {
    double[] probabilities = { 0.37, 0.33, 0.16, 0.07, 0.04, 0.02, 0.01 };
    int[] symbols = { 1, 2, 3, 4, 5, 6, 7 };

    HuffmanNode root = buildHuffmanTree(probabilities, symbols);
    Map<Integer, String> huffmanCodes = new HashMap<>();
    generateHuffmanCodes(root, "", huffmanCodes);
    printTable(huffmanCodes, probabilities);
  }

  private static HuffmanNode buildHuffmanTree(double[] probs, int[] syms) {
    PriorityQueue<HuffmanNode> pq = new PriorityQueue<>(Comparator.comparingDouble(node -> node.probability));

    for (int i = 0; i < probs.length; i++) {
      pq.offer(new HuffmanNode(probs[i], syms[i]));
    }

    while (pq.size() > 1) {
      HuffmanNode left = pq.poll();
      HuffmanNode right = pq.poll();
      HuffmanNode newNode = new HuffmanNode(left.probability + right.probability, -1);
      newNode.left = left;
      newNode.right = right;
      pq.offer(newNode);
    }

    return pq.poll();
  }

  private static void generateHuffmanCodes(HuffmanNode node, String code, Map<Integer, String> codes) {
    if (node.left == null && node.right == null) {
      codes.put(node.symbol, code);
      return;
    }
    if (node.left != null) {
      generateHuffmanCodes(node.left, code + "0", codes);
    }
    if (node.right != null) {
      generateHuffmanCodes(node.right, code + "1", codes);
    }
  }

  private static void printTable(Map<Integer, String> codes, double[] probabilities) {
    System.out.printf("%-10s %-12s %-15s %-10s%n", "Symbol", "Probability", "Self-information", "Codeword");
    System.out.println("-".repeat(55));

    for (int i = 0; i < probabilities.length; i++) {
      double selfInfo = -Math.log(probabilities[i]) / Math.log(2);
      System.out.printf("%-10d %-12.2f %-15.4f %-10s%n", i + 1, probabilities[i], selfInfo, codes.get(i + 1));
    }
  }
}

Solution

  • There is nothing wrong with your answer. The choice to assign a 0 or 1 bit to each left or right branch is arbitrary. Your output and the table in the question are both correct, and are both optimal prefix codes for the provided probabilities. Since there are six branches, there are 64 possible codes that could be produced, all of which are valid.

    Though there is no need to, it is possible to produce the codes in the table image by replacing the first two lines of code after the while (pq.size() > 1) { with these:

        HuffmanNode right = pq.poll();
        HuffmanNode left = pq.poll();
        if (left.symbol == -1 && right.symbol != -1) {
            HuffmanNode temp = left;
            left = right;
            right = temp;
        }
    

    This reverses the order of probabilities on each branch, and forces leaves to the right branch.