
ete2: how to work with a tree whose nodes are pairs of coordinates


I need to store and then operate on (add new nodes to, search through, etc.) a tree in which every node is a pair of x, y coordinates. I found the ete2 module for working with trees, but I can't figure out how to store a node as a tuple or list of coordinates. Is this possible with ete2?

Edit:

I followed the tutorial at http://pythonhosted.org/ete2/tutorial/tutorial_trees.html#trees to create a simple tree:

t1 = Tree("(A:1,(B:1,(E:1,D:1):0.5):0.5);" )

where A, B, E and D are node names and the numbers are distances.

or

t2 = Tree( "(A,B,(C,D));" )

I don't need names or distances, just a tree of tuples or lists, something like:

t3 = Tree("([12.01, 10.98], [15.65, 12.10],([21.32, 6.31], [14.53, 10.86]));")

But the last input produces a syntax error, and I couldn't find a similar example in the ete2 tutorials. As an alternative, I could store the coordinates as attributes, but attributes are stored as strings. I need to do calculations with the coordinates, and converting from string to float and vice versa every time is cumbersome.


Solution

  • You can annotate ete trees with any type of data. Give every node a name, build the tree structure from those names, and then annotate the nodes with their coordinates.

    from ete2 import Tree
    
    name2coord = {
        'a': [1, 1],
        'b': [1, 1],
        'c': [1, 0],
        'd': [0, 1],
    }
    
    # Use format 1 to read node names of all internal nodes from the newick string
    t = Tree('((a:1.1, b:1.2)c:0.9, d:0.8);', format=1)
    
    # get_descendants() returns every node except the root, which has no entry in name2coord
    for n in t.get_descendants():
        n.add_features(coord=name2coord[n.name])
    
    # Now you can operate with the tree and node coordinates in a very easy way: 
    for leaf in t.iter_leaves():
        print leaf.name, leaf.coord
    # a [1, 1]
    # b [1, 1]
    # d [0, 1]
    
    print t.search_nodes(coord=[1,0])
    # [Tree node 'c' (0x2ea635)]
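
The `search_nodes(coord=[1,0])` call above relies on an exact match, which can be fragile for real-valued coordinates such as 12.01. A minimal sketch of a tolerance-based lookup follows. `find_near` and `FakeNode` are hypothetical names introduced for illustration and are not part of ete2; the helper only assumes that each node may carry a `coord` attribute.

```python
def find_near(nodes, target, tol=1e-6):
    """Return the nodes whose .coord is within tol of target in both x and y."""
    tx, ty = target
    matches = []
    for node in nodes:
        coord = getattr(node, 'coord', None)
        if coord is None:
            continue  # nodes without coordinates are skipped
        if abs(coord[0] - tx) <= tol and abs(coord[1] - ty) <= tol:
            matches.append(node)
    return matches


class FakeNode(object):
    """Hypothetical stand-in for an annotated tree node, used only to try the helper."""
    def __init__(self, name, coord=None):
        self.name = name
        if coord is not None:
            self.coord = coord


nodes = [
    FakeNode('a', [12.01, 10.98]),
    FakeNode('b', [15.65, 12.10]),
    FakeNode('root'),  # no coordinates attached
]
hits = find_near(nodes, (12.0100001, 10.98), tol=1e-3)
```

With the annotated tree from the answer, the same helper could presumably be called as `find_near(t.traverse(), (1, 0))`, since `traverse()` iterates over all nodes of an ete2 tree.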
    

    You can copy, save and restore annotated trees using pickle:

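    # copy() returns a new, independent tree
    t_copy = t.copy('cpickle')
    # or
    import cPickle
    # Pickle files should be opened in binary mode
    with open('mytree.pkl', 'wb') as f:
        cPickle.dump(t, f)
    with open('mytree.pkl', 'rb') as f:
        tree = cPickle.load(f)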
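
Returning to the nested-list input from the question: one way to avoid naming nodes by hand is to generate the names and the newick string from the nested structure. The sketch below assumes that every leaf is a pair of plain numbers and that any other list is a group of children. `nested_to_newick` and the `n0, n1, ...` naming scheme are hypothetical, and only leaves receive coordinates.

```python
def nested_to_newick(nested):
    """Turn nested lists of [x, y] leaves into (newick_string, name2coord)."""
    name2coord = {}

    def is_point(item):
        # A leaf is a pair of plain numbers; anything else is a group of children.
        return (len(item) == 2 and
                all(isinstance(v, (int, float)) for v in item))

    def build(item):
        if is_point(item):
            name = 'n%d' % len(name2coord)  # generated leaf name: n0, n1, ...
            name2coord[name] = tuple(item)
            return name
        return '(' + ','.join(build(child) for child in item) + ')'

    return build(nested) + ';', name2coord


nested = [[12.01, 10.98], [15.65, 12.10], [[21.32, 6.31], [14.53, 10.86]]]
newick, name2coord = nested_to_newick(nested)
```

The resulting string can then be passed to `Tree(newick)`, and each leaf annotated with `leaf.add_features(coord=name2coord[leaf.name])` in the same way as in the answer's example.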