Trie data structure:
A trie is also referred to as a prefix tree or radix tree and unlike B-Trees which store the keys in the node, a trie uses the position of the node to construct the prefix. All the descendants of a node have a common prefix associated with that node. The prefix starts out empty at the root and gets appended as we descend down the children. The term 'trie' comes from retrieval. A trie comes in very useful for auto-complete and spell-checker.
A trie stores words with the letters on the edges and as we walk down, we read one letter at a time. Words can end or be a prefix of others. To do a lookup, we just walk down the tree letter by letter and then see if the node has an ending $ to the node. If we get to a null pointer, we know that the word is not there. To efficiently save the prefixes, paths that don't branch into a single edge are compressed and only split when needed.
To lookup the words, we use radix sort.
void RadixSort(Array A, int digits)
{
for (int i = 1; i <= digits; i++)
InsertionSort(A, i);
}
void InsertionSort(A, d) {
for (int i = 1; i <= n; i++)
{
var x = A[i];
int j = i -1;
while (j >= 0 && A[j] > x)
{
// like arranging a hand of playing cards.
A[j + 1] = A[j];
j = j - 1;
}
A[j + 1] = x;
}
}
We now look at inserting words into a trie. When we add a word to a vertex, we will add word to the corresponding branch of vertex cutting the leftmost character of the word. If the needed branch does not exist, we will have to create it.
class TrieNode:
def addword(self, word):
if len(word) == 0:
self.isWord = True
return
key = word[0]
word = word [1:]
if self.next.has_key[key]:
self.next[key].add_item(input)
else:
node = TrieNode()
self.next[key] = node
node.add_item(string)
def dfs (self, sofar = None):
# we print in two cases
if self.next.keys() == []:
print 'Match:'+sofar
# this could be modified to look for similar words by backtracking to the parent and looking for siblings
if self.isWord:
print 'Match:' + sofar
for key in self.next.keys():
self.next[key].dfs(sofar)
def autocomplete(self, word, sofar=""):
if len(word) == 0:
if self.isWord:
print 'Match:', sofar
for key in self.next.keys():
self.next[key].dfs(sofar)
else:
key = word[0]
word = word[1:]
if self.next.has_key(key):
sofar = sofar + key
self.next[key].autocomplete(word, sofar)
courtesy:sarathlakshman.com
A trie is also referred to as a prefix tree or radix tree and unlike B-Trees which store the keys in the node, a trie uses the position of the node to construct the prefix. All the descendants of a node have a common prefix associated with that node. The prefix starts out empty at the root and gets appended as we descend down the children. The term 'trie' comes from retrieval. A trie comes in very useful for auto-complete and spell-checker.
A trie stores words with the letters on the edges and as we walk down, we read one letter at a time. Words can end or be a prefix of others. To do a lookup, we just walk down the tree letter by letter and then see if the node has an ending $ to the node. If we get to a null pointer, we know that the word is not there. To efficiently save the prefixes, paths that don't branch into a single edge are compressed and only split when needed.
To lookup the words, we use radix sort.
void RadixSort(Array A, int digits)
{
for (int i = 1; i <= digits; i++)
InsertionSort(A, i);
}
void InsertionSort(A, d) {
for (int i = 1; i <= n; i++)
{
var x = A[i];
int j = i -1;
while (j >= 0 && A[j] > x)
{
// like arranging a hand of playing cards.
A[j + 1] = A[j];
j = j - 1;
}
A[j + 1] = x;
}
}
We now look at inserting words into a trie. When we add a word to a vertex, we will add word to the corresponding branch of vertex cutting the leftmost character of the word. If the needed branch does not exist, we will have to create it.
class TrieNode:
def addword(self, word):
if len(word) == 0:
self.isWord = True
return
key = word[0]
word = word [1:]
if self.next.has_key[key]:
self.next[key].add_item(input)
else:
node = TrieNode()
self.next[key] = node
node.add_item(string)
def dfs (self, sofar = None):
# we print in two cases
if self.next.keys() == []:
print 'Match:'+sofar
# this could be modified to look for similar words by backtracking to the parent and looking for siblings
if self.isWord:
print 'Match:' + sofar
for key in self.next.keys():
self.next[key].dfs(sofar)
def autocomplete(self, word, sofar=""):
if len(word) == 0:
if self.isWord:
print 'Match:', sofar
for key in self.next.keys():
self.next[key].dfs(sofar)
else:
key = word[0]
word = word[1:]
if self.next.has_key(key):
sofar = sofar + key
self.next[key].autocomplete(word, sofar)
courtesy:sarathlakshman.com
No comments:
Post a Comment