python string list dictionary replace

Replace each character in a string with all possible values in a list


I want to build a list of strings by replacing each letter in Amino with every string in the corresponding list from the following dictionary:

Amino = "mvkhdlsr"

dct = {
'f' : ['UUU', 'UUC'],
'l' : ['UUA', 'UUG', 'CUU', 'CUC', 'CUA', 'CUG'],
'i' : ['AUU', 'AUC', 'AUA'],
'm' : ['AUG'],
'v' : ['GUU', 'GUC', 'GUA', 'GUG'],
's' : ['UCU', 'UCC', 'UCA', 'UCG', 'AGU', 'AGC'],
'p' : ['CCU', 'CCC', 'CCA', 'CCG'],
't' : ['ACU', 'ACC', 'ACA', 'ACG'],
'a' : ['GCU', 'GCC', 'GCA', 'GCG'],
'y' : ['UAU', 'UAC'],
'x' : ['UAA', 'UAG', 'UGA'],
'h' : ['CAU', 'CAC'],
'q' : ['CAA', 'CAG'],
'n' : ['AAU', 'AAC'],
'k' : ['AAA', 'AAG'],
'd' : ['GAU', 'GAC'],
'e' : ['GAA', 'GAG'],
'c' : ['UGU', 'UGC'],
'w' : ['UGG'],
'r' : ['CGU', 'CGC', 'CGA', 'CGG', 'AGA', 'AGG'],
'g' : ['GGU', 'GGC', 'GGA', 'GGG']
}

For example, if Amino is "mfy", the desired output is

AUGUUUUAU
AUGUUUUAC
AUGUUCUAU
AUGUUCUAC

since m has only one case (AUG), f has two cases (UUU, UUC), and y also has two cases (UAU, UAC).

I've tried something like

for word in Amino.split():
    if word in dict:
        for key, value in dict.items():
            for i in (0,len(value) - 1):
                for idx in value:

(unfinished code) but could not figure it out.


Solution

  • We can do it in 3 steps:

    1. Get the values from dct whose keys are the characters in Amino.
    2. Take the Cartesian product of the lists obtained in step 1.
    3. Join each tuple from step 2 into a single string.

    from itertools import product
    from operator import itemgetter
    
    # itemgetter gets the values where letters in Amino are keys
    # product creates Cartesian product from the lists
    # join each tuple with "".join
    result = list(map("".join, product(*itemgetter(*Amino)(dct))))
    
    # ['AUGGUUAAACAUGAUUUAUCUCGU',
    #  'AUGGUUAAACAUGAUUUAUCUCGC',
    #  'AUGGUUAAACAUGAUUUAUCUCGA',
    #  'AUGGUUAAACAUGAUUUAUCUCGG',
    #  'AUGGUUAAACAUGAUUUAUCUAGA',
    #  ...]
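
The number of results is the product of the list lengths, so it grows quickly with the length of Amino; for the eight-letter example the lengths multiply to 1·4·2·2·2·6·6·6 = 6912 strings. The following is a minimal sketch for checking that size up front and iterating lazily instead of building the whole list. It assumes Python 3.8+ (for `math.prod`), reproduces only a subset of dct, and uses the hypothetical helper names `count_combinations` and `iter_combinations`, which are not part of the original answer.

```python
from itertools import islice, product
from math import prod

# Subset of the codon table from the question, enough for the demo string.
dct = {
    'm': ['AUG'],
    'f': ['UUU', 'UUC'],
    'y': ['UAU', 'UAC'],
}

def count_combinations(amino, table):
    """Number of strings the Cartesian product would yield (hypothetical helper)."""
    return prod(len(table[ch]) for ch in amino)

def iter_combinations(amino, table):
    """Lazily yield each joined string instead of building a full list."""
    return map("".join, product(*(table[ch] for ch in amino)))

print(count_combinations("mfy", dct))                    # 4
print(list(islice(iter_combinations("mfy", dct), 2)))    # ['AUGUUUUAU', 'AUGUUUUAC']
```

Because `iter_combinations` returns an iterator, the strings can be consumed one at a time (or cut short with `islice`) without holding all of them in memory.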
    

    For Amino = "mfy", the intermediate results of each step are:

    itemgetter(*Amino)(dct)
    # (['AUG'], ['UUU', 'UUC'], ['UAU', 'UAC'])
    
    list(product(*itemgetter(*Amino)(dct)))
    # [('AUG', 'UUU', 'UAU'), ('AUG', 'UUU', 'UAC'), ('AUG', 'UUC', 'UAU'), ('AUG', 'UUC', 'UAC')]
    
    list(map("".join, product(*itemgetter(*Amino)(dct))))
    # ['AUGUUUUAU', 'AUGUUUUAC', 'AUGUUCUAU', 'AUGUUCUAC']
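
One caveat with the itemgetter form: when given a single key, itemgetter returns the value itself rather than a 1-tuple, so for a one-letter Amino the `*` unpacks the codon list and product ends up iterating over the characters of each codon. With no keys at all, `itemgetter()` raises a TypeError. A possible alternative, sketched here under the assumption that every letter of Amino is a key of dct, indexes the dictionary with a list comprehension instead. The function name `expand` is hypothetical and only a subset of dct is reproduced.

```python
from itertools import product

# Subset of the codon table from the question, enough for the demo strings.
dct = {
    'm': ['AUG'],
    'f': ['UUU', 'UUC'],
    'y': ['UAU', 'UAC'],
}

def expand(amino, table):
    """Return every codon string that spells `amino`, in product order."""
    # One list of codons per letter; a KeyError here means an unknown letter.
    choices = [table[ch] for ch in amino]
    return ["".join(combo) for combo in product(*choices)]

print(expand("mfy", dct))  # ['AUGUUUUAU', 'AUGUUUUAC', 'AUGUUCUAU', 'AUGUUCUAC']
print(expand("f", dct))    # ['UUU', 'UUC']
```

If any list in the dictionary contains a repeated entry, the output contains repeated strings as well; `list(dict.fromkeys(result))` removes them while keeping the original order.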