I ran codeml through the Python package ete3.
Here is the printed output of the model that codeml generated:
>>> print(model)
Evolutionary Model fb.cluster_03502:
log likelihood : -35570.938479
number of parameters : 23
sites inference : None
sites classes : None
branches :
mark: #0 , omega: None , node_ids: 8 , name: ROOT
mark: #1 , omega: 789.5325 , node_ids: 9 , name: EDGE
mark: #2 , omega: 0.005 , node_ids: 4 , name: Sp1
mark: #3 , omega: 0.0109 , node_ids: 6 , name: Seq1
mark: #4 , omega: 0.0064 , node_ids: 5 , name: Sp2
mark: #5 , omega: 865.5116 , node_ids: 10 , name: EDGE
mark: #6 , omega: 0.005 , node_ids: 7 , name: Seq2
mark: #7 , omega: 0.0038 , node_ids: 11 , name: EDGE
mark: #8 , omega: 0.067 , node_ids: 2 , name: Sp3
mark: #9 , omega: 999.0 , node_ids: 12 , name: EDGE
mark: #10 , omega: 0.1165 , node_ids: 3 , name: Sp4
mark: #11 , omega: 0.1178 , node_ids: 1 , name: Sp5
Since this is only printed output, I need to get this information into a table such as:
Omega node_ids name
None 8 ROOT
789.5325 9 EDGE
0.005 4 Sp1
0.0109 6 Seq1
0.0064 5 Sp2
865.5116 10 EDGE
0.005 7 Seq2
0.0038 11 EDGE
0.067 2 Sp3
999.0 12 EDGE
0.1165 3 Sp4
0.1178 1 Sp5
I need to parse this information afterwards.
Do you have any idea how to turn printed output like this into a table?
Thanks for your help.
I took a look at the underlying code in model.py.
It seems that you can use s = str(model) (which calls model.__str__())
to obtain the print-out as a string. From there you can parse the string using standard string operations. I don't know the exact form of your string, but your code could look something like this:
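import pandas as pd

lines = s.split('\n')
lst = []
first_idx = 6  # Skip the header lines that are not of interest.
# Column names come from the "key: value" fields of the first branch line.
names = [field[:field.index(':')].strip() for field in lines[first_idx].split(',')]
for line in lines[first_idx:]:
    if line.strip():  # Skip empty or whitespace-only lines.
        # Keep the value after each colon and drop the '#' in front of the mark.
        row = [field[field.index(':') + 1:].strip().strip('#') for field in line.split(',')]
        lst.append(row)
df = pd.DataFrame(lst, columns=names)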
There are prettier ways to do this, but it gets the job done.
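One possibly tidier variant is to match the branch lines with a regular expression instead of relying on a fixed line index. This is only a sketch: it assumes every branch line has the same "mark / omega / node_ids / name" layout as in the print-out from the question, the sample string is copied from that print-out (shortened), and parse_branches is a hypothetical helper name, not part of ete3.

```python
import re

# Sample text copied from the print-out in the question (shortened).
sample = """Evolutionary Model fb.cluster_03502:
log likelihood : -35570.938479
number of parameters : 23
sites inference : None
sites classes : None
branches :
mark: #0 , omega: None , node_ids: 8 , name: ROOT
mark: #1 , omega: 789.5325 , node_ids: 9 , name: EDGE
mark: #2 , omega: 0.005 , node_ids: 4 , name: Sp1
"""

# Assumed layout of one branch line:
#   mark: #N , omega: X , node_ids: N , name: S
BRANCH_RE = re.compile(
    r"mark:\s*#(?P<mark>\d+)\s*,\s*omega:\s*(?P<omega>[^\s,]+)\s*,"
    r"\s*node_ids:\s*(?P<node_ids>\d+)\s*,\s*name:\s*(?P<name>\S*)"
)


def parse_branches(text):
    """Return one dict per branch line in text (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = BRANCH_RE.search(line)
        if match is None:
            continue  # header lines and blank lines do not match
        omega = match.group("omega")
        rows.append({
            "mark": int(match.group("mark")),
            # The root has no omega, printed as the text "None".
            "omega": None if omega == "None" else float(omega),
            "node_ids": int(match.group("node_ids")),
            "name": match.group("name"),
        })
    return rows


rows = parse_branches(sample)
for row in rows:
    print(row["omega"], row["node_ids"], row["name"], sep="\t")
```

With the real model you would pass str(model) instead of sample. Since rows is a list of dicts, pd.DataFrame(rows) should give the same kind of table as above, with omega already converted to numbers.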