I have a list of dictionaries, one per document. In every document I want to keep only the keys that are groups of 3 words (for example 'done auto manufacturing'). After filtering, I want to keep the top 2 grams based on their values, and if the values are the same, then the top two keys in the dictionary.
b=[{'america': 0.10640008943905088,
'delete option snapshot': 0.18889748775492732,
'done': 0.10918437741476256,
'done auto manufacturing': 0.18889748775492732,
'done auto delete': 0.18889748775492732,
'overwhelmed': 0.1714953267142263,
'overwhelmed sub': 0.18889748775492732,
'overwhelmed sub value': 0.18889748775492732},
{'delete': 0.17737631178689198,
'delete invalid': 0.2918855502796403,
'delete invalid data': 0.2918855502796403,
'invalid': 0.19409701271823834,
'invalid data': 0.2918855502796403,
'invalid data sir': 0.2918855502796403,
'nas': 0.14949544719217545,
'nas server': 0.1632884084021329,
'nas server replic': 0.2799865687396422}]
output:
b=[{'delete option snapshot': 0.18889748775492732,
'done auto manufacturing': 0.18889748775492732,
'done auto delete': 0.18889748775492732,
'overwhelmed sub value': 0.18889748775492732},
{'delete invalid data': 0.2918855502796403,
'invalid data sir': 0.2918855502796403}]
My solution: This doesn't seem right.
for i in range(1, len(b)+1):
for k,v in i.items():
if len(re.findall(r'\w+',k[i])<3:
del b[k]
It is usually better to use comprehensions here. You shouldn't delete elements from a list or dict while you iterate over it: with a dict this raises a RuntimeError, and with a list it silently skips elements. So it is better to create new dicts and a new list, and replace the old list with the new one. For filtering a single dict I would use:
{k: v for k, v in d.items() if len(k.split(" ")) > 2}
In this case d is the dict. Now you can simply update/recreate the list with a list comprehension:
result = [{k:v for k,v in d.items() if len(k.split(" "))>2} for d in b]
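The comprehension above only filters; it does not pick the top 2 entries. A minimal sketch of that second step follows, under some assumptions of mine: `top_trigrams` is a hypothetical helper name, "group of 3" is read as exactly three whitespace-separated words, and ties are broken by the order in which the keys appear in the dict (which relies on dicts preserving insertion order, Python 3.7+).

```python
def top_trigrams(docs, n_words=3, top=2):
    """Keep the `top` highest-valued keys made of exactly `n_words` words.

    Hypothetical helper: the name and the tie-breaking rule are assumptions,
    not something stated in the question.
    """
    result = []
    for d in docs:
        # Step 1: keep only keys with exactly n_words words.
        trigrams = {k: v for k, v in d.items() if len(k.split()) == n_words}
        # Step 2: sort by value, highest first. sorted() is stable, so keys
        # with equal values stay in their original dict order.
        ranked = sorted(trigrams.items(), key=lambda kv: kv[1], reverse=True)
        # Step 3: build a new dict from the first `top` pairs.
        result.append(dict(ranked[:top]))
    return result


if __name__ == "__main__":
    sample = [{"nas": 0.15, "nas server": 0.16, "nas server replic": 0.28,
               "delete invalid data": 0.29, "invalid data sir": 0.29}]
    print(top_trigrams(sample))
```

If ties should instead be broken alphabetically, the sort key could be changed to `lambda kv: (-kv[1], kv[0])` with `reverse=True` removed.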