pythonprefix-tree

Implementing a Trie to support autocomplete in Python


I'm trying to implement a data structure that supports autocomplete on a website. I've managed to implement an iterative version of a Trie. It supports the two primary methods of adding and searching in a Trie. However now I need to add a method that returns all the words that begin with the following prefix. Can someone help me with this.

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        curr = self.root
        for letter in word:
            node = curr.children.get(letter)
            if not node:
                node = TrieNode()
                curr.children[letter] = node
            curr = node
        curr.end = True

    def search(self, word):
        curr = self.root
        for letter in word:
            node = curr.children.get(letter)
            if not node:
                return False
            curr = node
        return curr.end

    def all_words_beginning_with_prefix(self, prefix):
        #I'm not sure how to go about this one.

Solution

  • You could just implement a generator that iterates over the Trie according to prefix the same way as other methods do. Once you've found the node at the end of the prefix you can use recursive generator with yield from to iterate over the sub trie while keeping track of the prefix and yielding it when terminal node is found:

    class TrieNode:
        def __init__(self):
            self.end = False
            self.children = {}
    
        def all_words(self, prefix):
            if self.end:
                yield prefix
    
            for letter, child in self.children.items():
                yield from child.all_words(prefix + letter)
    
    class Trie:
        # existing methods here
        def all_words_beginning_with_prefix(self, prefix):
            cur = self.root
            for c in prefix:
                cur = cur.children.get(c)
                if cur is None:
                    return  # No words with given prefix
    
            yield from cur.all_words(prefix)
    
    trie = Trie()
    trie.insert('foobar')
    trie.insert('foo')
    trie.insert('bar')
    trie.insert('foob')
    trie.insert('foof')
    
    print(list(trie.all_words_beginning_with_prefix('foo')))
    

    Output:

    ['foo', 'foob', 'foobar', 'foof']