python list

In Python, how do I apply the same operations to several lists and get an updated list for each?


I'm trying to run the same series of string-cleaning operations over several long lists. The operations in the loop do what I want and print the result I expect. But when I access the lists outside the loop, they're still the unmodified originals. How can I make the loop leave me with the updated lists?

I need to maintain them as separate lists for future operations. But I don't necessarily need to maintain the original list names (e.g., l1_clean would be fine).

import re

# Example lists
l1 = ['abc', 'def', 'ghix', 'ghi']
l2 = ['defx', 'def', 'ghi', 'jkl']

# Example cleaning operation loop 
for listx in [l1, l2]:
    listx = [re.sub('x', '', i) for i in listx] 
    listx = sorted(listx)
    listx = [set(listx)] 
    print(listx)
    
    listx = listx  # I know this isn't right, but I'm not sure what it should be instead

This prints the new lists that I expect:

>[{'abc', 'def', 'ghi'}]
>[{'def', 'jkl', 'ghi'}]

But then when I attempt to print the cleaned lists, I'm still getting the original:

print(l1, l2)
>['abc', 'def', 'ghix', 'ghi'] ['defx', 'def', 'ghi', 'jkl']

Solution

  • The issue is that inside

    for listx in [l1, l2]:
        listx = [re.sub('x', '', i) for i in listx]  # creates a new list
    

    listx is just a loop variable that is bound to each element of [l1, l2] in turn. Assigning to it (listx = ...) rebinds the name to a new list object. It does not change the list that l1 or l2 refers to, so the originals stay as they were.

    To modify the original lists, mutate them in place with slice assignment. lstx[:] = ... replaces the contents of the existing list object instead of rebinding the name:

    import re
    
    l1 = ['abc', 'def', 'ghix', 'ghi']
    l2 = ['defx', 'def', 'ghi', 'jkl']
    
    for lstx in (l1, l2):
        lstx[:] = [re.sub('x', '', s) for s in lstx]
        lstx[:] = sorted(set(lstx))
    
    print(l1)  # ['abc', 'def', 'ghi']
    print(l2)  # ['def', 'ghi', 'jkl']
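
Since the question says new names such as l1_clean would be fine, another option is to leave the originals alone and build new lists. The sketch below assumes the cleaning steps can be wrapped in a helper; the name clean is hypothetical and only used for illustration. It returns a sorted list of unique strings rather than the list-wrapped set from the question, on the assumption that a plain list is what later operations need.

```python
import re

def clean(items):
    """Return a new sorted list of unique strings with every 'x' removed."""
    stripped = (re.sub('x', '', s) for s in items)
    return sorted(set(stripped))

l1 = ['abc', 'def', 'ghix', 'ghi']
l2 = ['defx', 'def', 'ghi', 'jkl']

# Unpacking binds one new name per input list, in order.
l1_clean, l2_clean = (clean(lst) for lst in (l1, l2))

print(l1_clean)  # ['abc', 'def', 'ghi']
print(l2_clean)  # ['def', 'ghi', 'jkl']
print(l1)        # ['abc', 'def', 'ghix', 'ghi'] (original is untouched)
```

Returning a new list keeps the originals available for comparison. With many lists, a dict that maps a label to each cleaned list may be easier to manage than one name per list.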