I have a tree with the most frequent letters at the top of the tree and I am trying to find the path to a node so I can turn a message into binary. In this program if I go left, I add 0 to path and, if I go right, I add 1 to the path until I find the node. But I have to go straight to the desired node which is not possible. The only thing I could think of is removing the last character or path if a node has no children, but it does not work if a node has grandchildren. Can someone help me on how to approach this? Thanks!
// global variables
String path;
int mark;
// encodes a message to binary
String encoder(char data) {
path = "";
mark = 0;
findPath(root, data);
return path;
}
// finds the path to a node
void findPath(TNode node, char data) {
if(node.data == data) {
mark = 1;
return;
}
if(mark==0 && node.left != null) {
path += 0;
findPath(node.left, data);
}
if(mark==0 && node.right != null) {
path += 1;
findPath(node.right, data);
}
if(mark==0 && node.left == null || node.right == null) {
path = path.substring(0, path.length() - 1);
}
}
This is not how Huffman coding works. Not at all. You assign a code to every symbol, and symbols can have different lengths.
An algorithm to determine the code: Take two symbols with the lowest frequency, say an and b. Combine them into one symbol x with higher frequency obviously, and when we are finished the code for a is the code for x plus a zero, and the code for b is the code for x plus a one. Then you look again for the two symbols with lowest frequency and combine them and so on. When you are down to two symbols you give them codes 0 and 1 and find all the other codes for symbols.
Example: a, b, c, d with frequencies 3, 47, 2 and 109. Combine a and c to x with frequency 5. Combine x and b to y with frequency 52. Then a = code 0, y = code 1. x = code 10 and b = code 11. Then a = code 100 and c = code 101.