I am trying to search through a list of files, find the word 'type' and the word that follows it, and then put those words into a list alongside the file name. For example, this is the output I am looking for:
File Name, Type
[1.txt, [a, b, c]]
[2.txt, [a,b]]
My current code returns a separate list for every type:
[1.txt, [a]]
[1.txt, [b]]
[1.txt, [c]]
[2.txt, [a]]
[2.txt, [b]]
Here is my code. I know my logic appends a single value to the list each time, but I'm not sure how to edit it so that each entry is just the file name with a list of types.
output = []
for file_name in find_files(d):
    with open(file_name, 'r') as f:
        for line in f:
            line = line.lower().strip()
            match = re.findall('type ([a-z]+)', line)
            if match:
                output.append([file_name, match])
Learn to categorize your actions at the proper loop level. In this case, you say that you want to accumulate all of the references for a file into a single list, but your code appends one output entry per matching line, rather than one per file. Change that focus:
import re

output = []
for file_name in find_files(d):
    with open(file_name, 'r') as f:
        ref_list = []
        for line in f:
            line = line.lower().strip()
            match = re.findall('type ([a-z]+)', line)
            # extend, not append: findall returns a list, and we want
            # the matched words themselves in one flat list
            ref_list.extend(match)
    # Once you've been through the entire file,
    # THEN you add a line for that file,
    # with the entire reference list
    output.append([file_name, ref_list])
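For reference, here is a minimal self-contained sketch of the same per-file accumulation. The names `collect_types` and `TYPE_PATTERN` are hypothetical, and the function takes the paths as an argument because the definition of `find_files` isn't shown in the question. It uses `extend()` so that each file ends up with one flat list such as [a, b, c] rather than a list of per-line lists.

```python
import re

# Hypothetical name: the same pattern as in the question, compiled once.
TYPE_PATTERN = re.compile(r'type ([a-z]+)')


def collect_types(paths):
    """Return one [file_name, types] entry per file in paths."""
    output = []
    for file_name in paths:
        types = []  # reset for every file
        with open(file_name, 'r') as f:
            for line in f:
                # findall returns a list of matched words for this line;
                # extend() adds them individually, keeping the list flat.
                types.extend(TYPE_PATTERN.findall(line.lower()))
        # Append once per file, after all of its lines have been read.
        output.append([file_name, types])
    return output
```

You would call it as `collect_types(find_files(d))`. Note that a file with no matches still gets an entry with an empty list; add an `if types:` check before the final append if you would rather skip those files.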