my tree building tool only produces binary trees. to keep a tree strictly binary it often introduces very short branches.
this is super annoying when i try to compare trees, since those short branches introduce splits that should not be there.
is there an easy way using ete3 (or some other library) to remove internal branches whose branch length is below a specified limit?
As an example, let the branch length from root to AB be smaller than the limit:
      /-A
   /-|
  |   \-B
--|
  |     /-C
   \---|
        \-D
then the resulting tree should look like this:
   /-A
  |
  |--B
--|
  |     /-C
   \---|
        \-D
i tried it like this:
from ete3 import Tree
tree = "((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009);"
t1 = Tree(tree, quoted_node_names=True, format=1)
limit = 0.006
for node in t1.iter_descendants():
    if node.dist <= limit:
        nn = node._children
        nodelist = []
        for n in nn:
            nodelist.append(n.name)
        for n in nodelist:
            parent = node.up
            remove = t1.search_nodes(name=n)
            remove[0].delete()
            # parent._children.append(remove)
print(t1)
resulting in this tree:
        /-C
-- /---|
        \-D
so i managed to cut off the A and B leaves - but i failed to attach them to the parent node.
is this a valid strategy to achieve this?
if not, how should i tackle this problem?
thank you very much in advance,
best,
t.
oh it was way easier than expected:
for node in t1.get_descendants():
    # only internal nodes: deleting one attaches its children to its parent
    if not node.is_leaf() and node.dist <= limit:
        node.delete()
does it.
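
in case it helps to see what that loop does to the topology, here is a minimal library-free sketch (plain python, not ete3 code). the `Node` class and the `collapse_short_branches` and `to_newick` helpers are hypothetical names made up for this illustration. the assumption is that an internal branch with length <= limit is dropped and its children are attached to its parent, keeping their own branch lengths.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Minimal tree node (hypothetical stand-in, not the ete3 class)."""
    name: str = ""
    dist: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def collapse_short_branches(node: Node, limit: float) -> Node:
    """Return a copy of the tree in which every internal, non-root node
    with dist <= limit is removed and its children join its parent."""
    new_children: List[Node] = []
    for child in node.children:
        collapsed = collapse_short_branches(child, limit)
        if not collapsed.is_leaf() and collapsed.dist <= limit:
            # short internal branch: promote its children one level up
            new_children.extend(collapsed.children)
        else:
            new_children.append(collapsed)
    return Node(node.name, node.dist, new_children)


def to_newick(node: Node) -> str:
    """Serialize the topology only (no branch lengths, no trailing ';')."""
    if node.is_leaf():
        return node.name
    return "(" + ",".join(to_newick(c) for c in node.children) + ")" + node.name


# same tree as in the question: ((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009);
tree = Node(children=[
    Node(dist=0.005, children=[Node("A", 0.1), Node("B", 0.2)]),
    Node(dist=0.009, children=[Node("C", 0.3), Node("D", 0.4)]),
])

collapsed = collapse_short_branches(tree, limit=0.006)
print(to_newick(collapsed) + ";")  # prints (A,B,(C,D));
```

with limit = 0.006 only the AB branch is collapsed, so A and B end up directly under the root next to the (C,D) clade. the sketch builds a new tree instead of modifying the input, which sidesteps the question of changing a tree while iterating over it.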