python string-search

How to extract strings from a list based on the presence of a substring in Python?


I have a list of strings. I am trying to move every element that contains a specific substring out of the original list and into another variable.

Org_list = ["I am riding a bicycle to go to the movie", "He is riding a bike to go to the school", "She is riding a bicycle to go to the movie", "He is not riding a car to go to the school"]
substring1 = "riding a bike"
substring2 = "riding a car"

Now I want to search every element for the substrings, remove the elements that contain either substring from Org_list, and capture the removed elements in another variable.

Desired Output:

Org_list = ["I am riding a bicycle to go to the movie", "She is riding a bicycle to go to the movie"]
New_variable = ["He is riding a bike to go to the school", "He is not riding a car to go to the school"]

I have tried this way:

res = list(filter(lambda x: all(y not in substring1 for y in x), Org_list))
res = list(filter(lambda x: all(y not in substring2 for y in x), Org_list))

The result I got in both cases was [], which is obviously not what I expect. Can someone give me a clue about what is going wrong?


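Solution

The attempt returns [] because "for y in x" loops over the individual characters of each string, so "y not in substring1" asks whether a single character is absent from the substring. Every sentence contains at least one character (a space, for example) that also occurs in the substring, so all(...) is False for every element and filter drops them all. The membership test needs to go the other way: check whether the substring occurs in the string.

  • Using a list comprehension to collect the elements that match: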

    def filter_by_substrings(Org_list, substrings):
      return [
        string for string in Org_list
        if any(substring in string for substring in substrings)
      ]
    

    To break it down, the comprehension is equivalent to this loop:

    New_variable = []
    
    # For each string in the original list
    for string in Org_list:
      # We append it to the new list if at least one substring is found.
      if any(substring in string for substring in substrings):
        New_variable.append(string)
    

    Try it:

    Org_list = [
      "I am riding a bicycle",
      "He is riding a bike",
      "She is riding a bicycle",
      "He is not riding a car"
    ]
    
    substrings = [
      "riding a bike",
      "riding a car"
    ]
    
    New_variable = filter_by_substrings(Org_list, substrings)
    
    print(New_variable)  # ['He is riding a bike', 'He is not riding a car']
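
The function above only collects the matching elements; it does not remove them from Org_list, which the question also asks for. One possible way to get both lists in a single pass is sketched below. The helper name partition_by_substrings is hypothetical (it does not appear in the original answer), and the sketch assumes that rebinding Org_list to a new list is acceptable rather than mutating the existing list in place.

```python
def partition_by_substrings(strings, substrings):
    """Split strings into (kept, removed) lists.

    An element goes to `removed` if it contains at least one of the
    substrings; otherwise it goes to `kept`. Original order is preserved.
    """
    kept, removed = [], []
    for string in strings:
        if any(substring in string for substring in substrings):
            removed.append(string)
        else:
            kept.append(string)
    return kept, removed


Org_list = [
    "I am riding a bicycle to go to the movie",
    "He is riding a bike to go to the school",
    "She is riding a bicycle to go to the movie",
    "He is not riding a car to go to the school",
]
substrings = ["riding a bike", "riding a car"]

# Rebind Org_list to the elements that were kept.
Org_list, New_variable = partition_by_substrings(Org_list, substrings)

print(Org_list)
print(New_variable)
```

With the lists from the question, Org_list should end up holding the two "bicycle" sentences and New_variable the "bike" and "car" sentences, matching the desired output. If other references to the original list object must see the change, assigning with Org_list[:] = kept would update it in place instead.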