python bioinformatics protein-database

How do I make a Python dictionary from a string?


I am collecting protein sequence IDs from this website: https://www.uniprot.org/

I've written this code:


import urllib.parse
import urllib.request

url = 'https://www.uniprot.org/uploadlists/'

params = {
    'from': 'ID',
    'to': 'UPARC',
    'format': 'tab',
    'query': 'P00766    P40925'
}

# Encode the form parameters and send them as a POST request
data = urllib.parse.urlencode(params)
data = data.encode('utf-8')
req = urllib.request.Request(url, data)
with urllib.request.urlopen(req) as f:
    response = f.read()
    string_it = response.decode('utf-8')
print(string_it)

When I print the resulting string, I get output that looks like this:

From    To
P00766  UPI000011047C
P40925  UPI0000167B3E

How do I convert this to a dictionary?


Solution

  • Split the string into lines, skip the header row, and split each remaining line on the tab character to get a key and a value. The code is as follows:

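    # Split the response into lines and drop empty ones (e.g. after the trailing newline)
    string_list = string_it.split("\n")
    string_list = [i for i in string_list if i != ""]
    dict_values = {}
    # Skip the header row, then map the first column to the second
    for i in string_list[1:]:
        dict_values[i.split("\t")[0]] = i.split("\t")[1]

    dict_values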
    

    The output is:

    {'P00766': 'UPI000011047C', 'P40925': 'UPI0000167B3E'}
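
As a more compact variant of the same idea, the mapping could also be built with a dict comprehension. This is only a sketch: it assumes the response is tab-separated with exactly one header row, as in the output shown in the question. The function name `parse_mapping` and the `sample` string are made up for illustration and are not part of the original code.

```python
def parse_mapping(text):
    """Build {source_id: target_id} from tab-separated text with one header row."""
    # splitlines() handles both "\n" and "\r\n"; blank lines are skipped
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    # rows[0] is the header ("From", "To"); rows with fewer than two columns are ignored
    return {row[0]: row[1] for row in rows[1:] if len(row) >= 2}


# Hypothetical sample, mirroring the output shown in the question
sample = "From\tTo\nP00766\tUPI000011047C\nP40925\tUPI0000167B3E\n"
mapping = parse_mapping(sample)
print(mapping)
```

With the real response, `parse_mapping(string_it)` would be called instead of using `sample`. Each line is split only once here, and malformed rows are skipped rather than raising an `IndexError`.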
    

    Code walk through: