Tags: python, python-3.x, list, dictionary, more-itertools

Return large list of tuples with replaced dictionary value


In Python, I have a list of tuples (lot) with patient data, as shown below:

lot = [('490001', 'A-ARM1', '1', '2', "a", "b"),
       ('490001', 'A-ARM2', '3', '4', "c", "d"),
       ('490002', 'B-ARM3', '5', '6', "e", "f")]

In my real dataset, lot consists of 50-150 tuples (depending on the patient). I look at the second element of every tuple and want to replace each 'A-' or 'B-' prefix with the corresponding dictionary value, so the output becomes:

[('490001', 'ZZARM1', '1', '2', 'a', 'b'), ('490001', 'ZZARM2', '3', '4', 'c', 'd'), ('490002', 'XXARM3', '5', '6', 'e', 'f')]

To achieve this, I've written the code below, which builds the result as lot2. Is there a cleaner (shorter) way of writing this? The code should also work well for a list of the size stated above. I'm eager to learn from you!

from more_itertools import grouper
dict = {'A-': 'ZZ', 'B-': 'XX'}

for el1, el2, *rest in lot:
    for i, j in grouper(el2, 2):
        if i + j in dict:
            lot2 = [ ( tpl[0], (tpl[1].replace(tpl[1][:2], dict[tpl[1][:2]])), tpl[2], tpl[3], tpl[4], tpl[5] ) for tpl in lot]
print(lot2)

Solution

  • If you're looking for shorter code, here's a version that doesn't use more_itertools.grouper. Basically, iterate over lot and modify the second element of each tuple as you go (if it needs to be changed). Note that I renamed dict to dct here: dict is the builtin dict constructor, and naming your variables the same as Python builtins creates problems if you later want to use the builtin itself.

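    dct = {'A-': 'ZZ', 'B-': 'XX'}  # renamed from dict to avoid shadowing the builtin

    lot2 = []
    for el1, el2, *rest in lot:
        prefix = el2[:2]
        # Swap the prefix if it is a key in dct; otherwise keep it unchanged.
        el2 = dct.get(prefix, prefix) + el2[2:]
        lot2.append((el1, el2, *rest))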
    

    which can be written even more concisely:

    lot2 = [(el1, dct.get(el2[:2], el2[:2]) + el2[2:], *rest) for el1, el2, *rest in lot]
    

    Output:

    [('490001', 'ZZARM1', '1', '2', 'a', 'b'),
     ('490001', 'ZZARM2', '3', '4', 'c', 'd'),
     ('490002', 'XXARM3', '5', '6', 'e', 'f')]
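
If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name replace_prefixes, the width parameter and the extra 'C-ARM4' row are my own illustrations and are not from the question. It assumes every prefix has the same length (two characters by default) and that the value to change is always the second element of the tuple.

```python
def replace_prefixes(rows, mapping, width=2):
    """Return a new list of tuples with the prefix of each second element swapped."""
    result = []
    for first, second, *rest in rows:
        prefix = second[:width]
        # Prefixes missing from the mapping fall back to themselves,
        # so those values pass through unchanged.
        new_second = mapping.get(prefix, prefix) + second[width:]
        result.append((first, new_second, *rest))
    return result


lot = [('490001', 'A-ARM1', '1', '2', 'a', 'b'),
       ('490001', 'A-ARM2', '3', '4', 'c', 'd'),
       ('490002', 'B-ARM3', '5', '6', 'e', 'f'),
       ('490003', 'C-ARM4', '7', '8', 'g', 'h')]  # hypothetical row with an unmapped prefix
dct = {'A-': 'ZZ', 'B-': 'XX'}

lot2 = replace_prefixes(lot, dct)
print(lot2)
```

Unlike the dict[tpl[1][:2]] lookup in the question, which raises a KeyError for a prefix that is not a key, dct.get(prefix, prefix) leaves such values as they are. Slicing also touches only the first two characters, whereas str.replace would replace every occurrence of the prefix in the string. The original lot is not modified.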