pythonpython-3.xalgorithmmemorymemory-leaks

Leetcode Python 208 -- memory not clearing from previous test case?


The problem for reference: https://leetcode.com/problems/implement-trie-prefix-tree

It seems straightforward enough, implement a trie. However, during one of the submission tests, where the input is

["Trie","startsWith"]
[[],["a"]]

The expected answer is [null,false], while my answer returned [null,true]. However, looking through my code, I was confident that my logic is correct and it should have returned False.

Upon further debugging by adding a print representation and printing out the input, I realised that despite the question input not having any insert, it in fact passed in a Trie with app and apple, presumably from the previous test case.

I read through the comments and found a couple of comments mentioning a similar issue, and a Github Issue at https://github.com/LeetCode-Feedback/LeetCode-Feedback/issues/6108, which was closed, even though the problem is still there.

Funnily enough, if I try to run the exact same code and test case using LeetCode's debugger mode, it returns the correct result.

While I am almost sure this was something wrong with Leetcode, this problem does not occur when I tried to use paste in other users' accepted solution. This means that there is likely both something wrong with leetcode, and something wrong with my code to reproduce this specific bug.

Code below for reference:

from collections import deque

class Trie:
    def __init__(self, children = {}, value = '', is_terminal = False):
        self.children = children
        self.value = value
        self.is_terminal = is_terminal

    def __str__(self):
        s = ''
        s += self.value
        queue = deque([])
        for child in self.children.values():
            queue.append(child)
        while queue:
            node = queue.popleft()
            s += node.value
            if node.is_terminal:
                s += 'T'
            for child in node.children.values():
                queue.append(child)
        return s

    def __repr__(self):
        return self.__str__()

    def get_child(self, val) -> "Trie":
        return self.children[val]

    def has_child(self, val) -> bool:
        return val in self.children

    def set_terminal(self) -> None:
        self.is_terminal = True
        
    def insert(self, word: str) -> None:
        node = self
        for char in word:
            if not node.has_child(char):
                new = Trie({}, char, False)
                node.children[char] = new
            node = node.get_child(char)

        node.set_terminal()

    def search(self, word: str) -> bool:
        node = self
        for char in word:
            if not node.has_child(char):
                return False
            node = node.get_child(char)
        return node.is_terminal

    def startsWith(self, prefix: str) -> bool:
        print(self). # this returns appTleT
        node = self
        for char in prefix:
            if not node.has_child(char):
                return False
            node = node.get_child(char)
        return True

Solution

  • Just to illustrate the comment I made earlier, a MRE of it

    class MC():
        def __init__(self, x={}):
             self.x=x
        def __repr__(self):
            return '{'+','.join(f'{k}:{self.x[k]}' for k in self.x)+'}'
        def set(self, k, v):
             self.x[k]=v
    

    Then you can easily experiment

    >>> mc1=MC()
    >>> mc1
    {}
    >>> mc1.set(1,2)
    >>> mc1
    {1:2}
    >>> mc2=MC()
    >>> mc2
    {1:2}
    

    Second mc2 uses the same default value for x. Which is a dict whose initial value was {}, but now is {1: 2}. It is not just the local x that set modifies. Well, it is. But that local x points to the default parameter {} (I mean "default parameter whose initial value is {}")

    So, rule of thumb: no mutable values as default parameter, unless you know what you are doing (for example, unless you are sure that, anyway, you never mute it. Or on the contrary, that is the exact intend behavior, to implement a kind of class attribute that way — still a bad idea in that case imho, but I wouldn't discuss style if it is intended)

    For my code, the way to fix this could be

    class MC():
        def __init__(self, x=None):
             self.x={} if x is None else x
        def __repr__(self):
            return '{'+','.join(f'{k}:{self.x[k]}' for k in self.x)+'}'
        def set(self, k, v):
             self.x[k]=v
    

    There, None in not mutable (and anyway, we never try to modify it). And the {} we assign to self.x is, sure, but it is a different {} for every call of __init__. So each instance has its own, as you certainly want.

    You can adapt that to your code. But basically (now that I read it), you do the exact same thing. Each instance share the same "intially {}, but not for ever" default value for children. So each instance that was created without a parameter for children shares a same children. And then, in insert when you

    node.children[char] = new
    

    you alter that default value (not only node.children, but also the .children of any other instance that was created without parameter.