(Python novice here:)
I have been trying to filter a list of sub-lists (all of the same length) based on the presence of certain strings within the elements of the sub-lists. To create criteria for inclusion, I have done the following, which has worked fine:
lines = [['Bob','Risk Manager','Company1'],
         ['Bill','Senior Quality Control Manager','Company1'],
         ['Jill','Accreditation Specialist','Company2'],
         ['Jane','Administrator','Company3'],
         ['Joe','IT Specialist','Company4']]
filtered_lines = []
inclusion_criteria = [['Risk',1],['Quality',1],['Accred',1]]
for line in lines:
    for criterion in inclusion_criteria:
        if criterion[0] in line[criterion[1]]:
            filtered_lines.append(line)
The above code filled the filtered_lines list with sub-lists whose second element contained 'Risk', 'Quality' or 'Accred', i.e. 'Jane' and 'Joe' were filtered out - this worked as planned.
However, if I instead want to define criteria for exclusion from the filtered_lines list, then the following does not work:
exclusion_criteria = [['Company1',2],['Company2',2]]
for line in lines:
    for criterion in exclusion_criteria:
        if criterion[0] not in line[criterion[1]]:
            filtered_lines.append(line)
When I run the above code, I want every sub-list whose third element does not contain 'Company1' or 'Company2' to be added to filtered_lines, i.e. filtered_lines should contain only 'Jane' and 'Joe', but this does not happen. Instead, nothing is filtered out, and filtered_lines still contains every sub-list from the original lines list.
How would you go about excluding items from a list based on a set of exclusion criteria? Furthermore, is there a better way of approaching inclusion criteria?
P.S.: The lines list and criteria given here are just examples; the real lines is around 25,000 sub-lists long, and there are over a dozen inclusion and - if I can get it working - exclusion criteria. I'm not sure if/how the size of these objects affects any possible solutions.
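You're looping over every element in exclusion_criteria, and as soon as any one of them doesn't match, you append the line to filtered_lines. No line contains both 'Company1' and 'Company2', so every line fails to match at least one criterion, and filtered_lines ends up with all of the items (some of them more than once).
Use all() and/or any() to test all of the criteria together before appending: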
lines = [
    ["Bob", "Risk Manager", "Company1"],
    ["Bill", "Senior Quality Control Manager", "Company1"],
    ["Jill", "Accreditation Specialist", "Company2"],
    ["Jane", "Administrator", "Company3"],
    ["Joe", "IT Specialist", "Company4"],
]
exclusion_criteria = [["Company1", 2], ["Company2", 2]]
filtered_lines = []
for line in lines:
    # keep the line only if none of the exclusion criteria match it
    if all(
        criterion[0] not in line[criterion[1]]
        for criterion in exclusion_criteria
    ):
        filtered_lines.append(line)
print(filtered_lines)
Prints:
[
    ["Jane", "Administrator", "Company3"],
    ["Joe", "IT Specialist", "Company4"]
]
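As for the inclusion side of the question: the same idea works there with any(), and the two checks can be combined. Below is a minimal sketch, not the only way to do it. The helper names matches_any and filter_lines are made up for this example, and it assumes each criterion is a two-element [substring, index] pair as in the question.

```python
def matches_any(line, criteria):
    """Return True if at least one [substring, index] criterion matches the line."""
    return any(substring in line[index] for substring, index in criteria)


def filter_lines(lines, inclusion_criteria=(), exclusion_criteria=()):
    """Keep lines that match some inclusion criterion (when any are given)
    and match none of the exclusion criteria."""
    return [
        line
        for line in lines
        if (not inclusion_criteria or matches_any(line, inclusion_criteria))
        and not matches_any(line, exclusion_criteria)
    ]


lines = [
    ["Bob", "Risk Manager", "Company1"],
    ["Bill", "Senior Quality Control Manager", "Company1"],
    ["Jill", "Accreditation Specialist", "Company2"],
    ["Jane", "Administrator", "Company3"],
    ["Joe", "IT Specialist", "Company4"],
]
inclusion_criteria = [["Risk", 1], ["Quality", 1], ["Accred", 1]]
exclusion_criteria = [["Company1", 2], ["Company2", 2]]

# inclusion only: Bob, Bill, Jill
included = filter_lines(lines, inclusion_criteria=inclusion_criteria)
# exclusion only: Jane, Joe
excluded = filter_lines(lines, exclusion_criteria=exclusion_criteria)
# both at once: matches an inclusion criterion and is not at Company2
both = filter_lines(lines, inclusion_criteria, [["Company2", 2]])
```

Unlike the nested loops in the question, this adds each line at most once, even if it matches several inclusion criteria. any() also stops at the first match, so with roughly 25,000 lines and a dozen or so criteria this is at most a few hundred thousand substring checks, which should not be a problem in practice.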