Monday, June 29, 2015

Trie data structure:
A trie is also referred to as a prefix tree or radix tree and unlike B-Trees which store the keys in the node, a trie uses the position of the node to construct the prefix. All the descendants of a node have a common prefix associated with that node. The prefix starts out empty at the root and gets appended as we descend down the children. The term 'trie' comes from retrieval. A trie comes in very useful for auto-complete and spell-checker.
A trie stores words with the letters on the edges and as we walk down, we read one letter at a time. Words can end or be a prefix of others. To do a lookup,  we just walk down the tree letter by letter and then see if the node has an ending $ to the node. If we get to a null pointer, we know that the word is not there. To efficiently save the prefixes, paths that don't branch into a single edge are compressed and only split when needed.
To lookup the words, we use radix sort.
 void RadixSort(Array A, int digits)
{
for (int i = 1; i <= digits; i++)
      InsertionSort(A, i);
}

void InsertionSort(A, d) {
for (int i = 1; i <= n; i++)
{
var x = A[i];
int j = i -1;
while (j >= 0 && A[j] > x)
{
// like arranging a hand of playing cards.
A[j + 1] = A[j];
j = j - 1;
}
 A[j + 1] = x;
}
}

We now look at inserting words into a trie. When we add a word to a vertex, we will add word to the corresponding branch of vertex cutting the leftmost character of the word. If the needed branch does not exist, we will have to create it.

class TrieNode:
         def addword(self, word):
                if len(word) == 0:
                    self.isWord = True
                    return
                 key = word[0]
                 word = word [1:]
                 if self.next.has_key[key]:
                     self.next[key].add_item(input)
                 else:
                      node = TrieNode()
                      self.next[key] = node
                      node.add_item(string)

         def dfs (self, sofar = None):
                # we print in two cases
                if self.next.keys() == []:
                    print 'Match:'+sofar
                       # this could be modified to look for similar words by backtracking to the parent and looking for siblings
                if self.isWord:
                    print 'Match:' + sofar
                for key in self.next.keys():
                    self.next[key].dfs(sofar)

        def autocomplete(self, word, sofar=""):
                if len(word)  == 0:
                    if self.isWord:
                        print 'Match:', sofar
                    for key in self.next.keys():
                        self.next[key].dfs(sofar)     
               else:
                    key = word[0]
                    word = word[1:]
                    if self.next.has_key(key):
                        sofar = sofar + key
                        self.next[key].autocomplete(word, sofar)
                 
    

courtesy:sarathlakshman.com

No comments:

Post a Comment