I'm learning python and I'm curious about how people choose to store (binary) trees in python.
Is there something wrong in storing the nodes of the tree as a list in python? something like:
[0,1,2,3,4,5,6,7,8]
where the 0'th position is 0 by default, 1 is the root, and for each position (i), the 2i and 2i+1 positions are the children. When no child is present, we just have a 'None' in that position.
I've read a couple of books/notes where they represent a tree using a list of lists, or something more complicated than just a simple list like this, and I was wondering if there's something inherently wrong in how i'm looking at it?
You certainly COULD do this. I'd define it as a class deriving from list with a get_children
method. However this is fairly ugly since either A) you'd have to preprocess the whole list in O(n) time to pair up indices with values or B) you'd have to call list.index
in O(n log n) time to traverse the tree.
class WeirdBinaryTreeA(list):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
def get_children(value):
"""Calls list.index on value to derive the children"""
idx = self.index(value) # O(n) once, O(n log n) to traverse
return self[idx * 2], self[idx * 2 + 1]
class WeirdBinaryTreeB(list):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.__mapping = self.processtree()
def processtree(self):
for idx, val in enumerate(self):
self.__mapping[val] = idx
def get_children(value):
"""Queries the mapping on value to derive the children"""
idx = self.__mapping[value] # O(1) once, O(n) to traverse
return self[idx * 2], self[idx * 2 + 1]
However the bigger question is why would you do this? What makes it better than a list of lists or a dict of dicts? What happens when you have:
A
/ \
B
/ \
C
/ \
D
/ \
E
/ \
F
And your list looks like:
[0, 'A', None, 'B', None, None, None, 'C', None, None, None, None, None, None, None, 'D', ...]
Instead of:
{"A": {"B": {"C": {"D": {"E": {"F": None}}}}}}