python tree ete3

"Clean" an ete3 tree of super short branches


My tree-building tool likes binary trees. To get them, it often introduces very short branches just to keep the tree binary.

This is annoying when I try to compare trees, since those short branches introduce splits that should not be there.

Is there an easy way, using ete3 (or some other library), to remove branches from a tree if their branch length is less than a specified limit?

As an example, let the branch length from root to AB be smaller than the limit:

      /-A
   /-|
  |   \-B
--|
  |     /-C
   \---|
        \-D

then the resulting tree should look like this:

   /-A
  |
  |--B
--|
  |     /-C
   \---|
        \-D

I tried it like this:

from ete3 import Tree

tree = "((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009);"


t1 = Tree(tree, quoted_node_names=True, format=1)


limit = 0.006

for node in t1.iter_descendants():
    if node.dist <= limit:
        nn = node._children
        nodelist = []
        for n in nn:
            nodelist.append(n.name)
        for n in nodelist:
            parent = node.up
            remove = t1.search_nodes(name=n)
            remove[0].delete()
            # parent._children.append(remove)



print(t1)

resulting in this tree:

        /-C
-- /---|
        \-D

So I manage to cut off the A and B leaves, but I fail to attach them to the parent node.

Is this a valid strategy to achieve this?

If not, how should I tackle this problem?

thank you very much in advance,

best,

t.


Solution

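  • Oh, it was way easier than expected:

    for node in t1.get_descendants():
        if not node.is_leaf() and node.dist <= limit:
            node.delete()

    does it. Calling delete() on an internal node removes that node and attaches its children to its parent, which is the reattachment step the attempt in the question was missing. Leaves are skipped so that A, B, C and D are never removed.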
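To make the mechanics of that loop explicit, here is a minimal dependency-free sketch of the same idea on the example tree. The `Node` class and `collapse_short_branches` function are hypothetical helpers written for illustration; they are not part of ete3, and the child order they produce is a choice of this sketch, not a statement about ete3's output.

```python
class Node:
    """Hypothetical minimal tree node (illustration only, not ete3's TreeNode)."""

    def __init__(self, name="", dist=0.0, children=None):
        self.name = name
        self.dist = dist
        self.up = None
        self.children = []
        for child in children or []:
            self.add_child(child)

    def add_child(self, child):
        child.up = self
        self.children.append(child)

    def is_leaf(self):
        return not self.children

    def descendants(self):
        """Return all nodes below this one as a new list (preorder)."""
        found = []
        for child in self.children:
            found.append(child)
            found.extend(child.descendants())
        return found

    def newick(self):
        """Serialize this node and its subtree as a Newick fragment."""
        if self.is_leaf():
            return f"{self.name}:{self.dist}"
        inner = ",".join(child.newick() for child in self.children)
        return f"({inner}){self.name}:{self.dist}"


def collapse_short_branches(root, limit):
    """Remove internal nodes whose branch length is <= limit.

    The children of a removed node are attached to its parent and keep
    their own branch lengths; the short branch itself is dropped.
    """
    # descendants() returns a snapshot list, so editing the tree
    # while looping over it is safe.
    for node in root.descendants():
        if not node.is_leaf() and node.dist <= limit:
            parent = node.up
            position = parent.children.index(node)
            # Splice the children in where the removed node used to be.
            parent.children[position:position + 1] = node.children
            for child in node.children:
                child.up = parent
            node.up = None
    return root


# Same tree as in the question: ((A:0.1,B:0.2):0.005,(C:0.3,D:0.4):0.009);
root = Node(children=[
    Node(dist=0.005, children=[Node("A", 0.1), Node("B", 0.2)]),
    Node(dist=0.009, children=[Node("C", 0.3), Node("D", 0.4)]),
])
collapse_short_branches(root, limit=0.006)
print("(" + ",".join(child.newick() for child in root.children) + ");")
# prints (A:0.1,B:0.2,(C:0.3,D:0.4):0.009);
```

With limit = 0.006 only the AB node (0.005) is collapsed, so A and B end up directly under the root while the CD node (0.009) stays in place. Note that the short branch length is simply discarded here; if the root-to-leaf distances matter for your comparison, you may want to add it to the children's branch lengths instead.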