python python-2.7 bioservices

If statement and writing to file


I am using the KEGG API to download genomic data and write it to a file. There are 26 files in total, and some of them contain the dictionary 'COMPOUND'. I would like to assign these to compData and write them to the output file. I tried writing the check as an if ... == True statement, but this does not work.

# Read in hsa links

hsa = []
with open ('/users/skylake/desktop/pathway-HSAs.txt', 'r') as file:
    for line in file:
        line = line.strip()
        hsa.append(line)

# Import KEGG API Bioservices | Create KEGG Variable
from bioservices.kegg import KEGG
k = KEGG()

# Data Parsing | Writing to File
# for i in range(len(hsa)):
data = k.get(hsa[2])
dict_data = k.parse(data)
if dict_data['COMPOUND'] == True:
    compData = str(dict_data['COMPOUND'])
nameData = str(dict_data['NAME'])
geneData = str(dict_data['GENE'])
f = open('/Users/Skylake/Desktop/pathway-info/' + nameData + '.txt' , 'w')
f.write("Genes\n")
f.write(geneData)
f.write("\nCompounds\n")
f.write(compData)
f.close()

Solution

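  • I guess that with

    if dict_data['COMPOUND'] == True:
    

    you are trying to test whether the key 'COMPOUND' exists in dict_data. That is not what this line does: it looks up the value stored under the key and compares it with True. If the key is missing, the lookup raises a KeyError, and if the key is present, a dictionary or string value never equals True, so the branch is skipped. What you want instead is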

    if 'COMPOUND' in dict_data:
    

    Furthermore, note that compData is only assigned inside the if block, so when the key is absent the later f.write(compData) raises a NameError. You should therefore always define it, for example with

    compData = str(dict_data.get('COMPOUND', 'undefined'))
    

    The line above means that if the key exists, compData gets its value, and if it does not exist, compData gets 'undefined' instead. You can choose any fallback value you like, or omit it altogether, in which case get returns None (which str() then turns into the string 'None').
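Putting the pieces together, here is a minimal sketch of the writing step with the fallback applied to every field. It does not call KEGG: the two dictionaries are made-up stand-ins for the result of k.parse(data), and write_pathway_info, the 'unnamed' fallback, and the temporary output directory are all hypothetical names chosen for illustration.

```python
import os
import tempfile


def write_pathway_info(dict_data, out_dir):
    """Write the GENE and COMPOUND entries of a parsed record to a text file."""
    # .get() returns the fallback when the key is absent, so every
    # variable is defined before it is written.
    name_data = str(dict_data.get('NAME', 'unnamed'))
    gene_data = str(dict_data.get('GENE', 'undefined'))
    comp_data = str(dict_data.get('COMPOUND', 'undefined'))

    path = os.path.join(out_dir, name_data + '.txt')
    # The with block closes the file even if a write fails.
    with open(path, 'w') as f:
        f.write("Genes\n")
        f.write(gene_data)
        f.write("\nCompounds\n")
        f.write(comp_data)
    return path


# Hypothetical stand-ins for k.parse(data); real KEGG records will differ.
with_compound = {'NAME': 'pathway_a',
                 'GENE': {'10': 'NAT2'},
                 'COMPOUND': {'C00001': 'H2O'}}
without_compound = {'NAME': 'pathway_b',
                    'GENE': {'10': 'NAT2'}}

out_dir = tempfile.mkdtemp()
path_a = write_pathway_info(with_compound, out_dir)
path_b = write_pathway_info(without_compound, out_dir)
```

In the real script you would call such a function once per entry in hsa, passing k.parse(k.get(entry)) as dict_data. Records without a 'COMPOUND' key then simply get 'undefined' in the Compounds section instead of stopping the loop.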