
Pythonic way to replace chars


I want to replace some characters in a string using a Pythonic approach:

A -> T
C -> G
G -> C
T -> A

Example:

AAATCGATTGAT

will transform into

TTTAGCTAACTA

What I did:

import re

def swap(string):
    # Park each letter in a temporary marker so the next step does not clobber it.
    string = re.sub('A', 'aux', string)
    string = re.sub('T', 'A', string)
    string = re.sub('aux', 'T', string)
    string = re.sub('C', 'aux', string)
    string = re.sub('G', 'C', string)
    string = re.sub('aux', 'G', string)

    return string

It works, but I'm looking for a more Pythonic way to achieve the same result.


Solution

  • Here's a refactoring of Chepner's since-deleted answer (the accepted one at the time) that calls maketrans only once:

    tt = str.maketrans({"A":"T", "C":"G", "G":"C", "T": "A"})
    for s1 in "AGACAT", "TAGGAC", "ACTAGAA":
        print(s1.translate(tt))
    

    You can also chain calls to replace, though this is still clumsy and inefficient:

    def acgtgca(s1):
        return s1.replace(
            "A", "\ue0fa").replace(
            "G", "\ue0fb").replace(
            "C", "G").replace(
            "T", "A").replace(
            "\ue0fb", "C").replace(
            "\ue0fa", "T")
    

    This replaces the "aux" marker with two arbitrary characters from the Unicode Private Use Area, which are unlikely to occur in real input.

    But again, the maketrans method is both neater and more efficient.
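
As a minimal sketch of the translate approach applied to the example from the question: the two-string form of str.maketrans maps each character of the first string to the character at the same position in the second. The names COMPLEMENT and naive_swap are made up for illustration, and naive_swap is there only to show why chained replace calls need placeholder characters.

```python
# Build the table once; "ACGT" -> "TGCA" maps A->T, C->G, G->C, T->A.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def swap(sequence: str) -> str:
    """Exchange A<->T and C<->G in a single pass over the string."""
    return sequence.translate(COMPLEMENT)


def naive_swap(sequence: str) -> str:
    """Chained replace WITHOUT placeholders: each step undoes the previous one."""
    return (
        sequence.replace("A", "T")
        .replace("T", "A")
        .replace("C", "G")
        .replace("G", "C")
    )


print(swap("AAATCGATTGAT"))        # TTTAGCTAACTA
print(naive_swap("AAATCGATTGAT"))  # AAAACCAAACAA (wrong: every T became A again)
```

Because translate looks up each character exactly once, there is no ordering problem to work around, and applying swap twice returns the original string.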