I am trying to extract information from this list (saved in a text file):
People
にんげん、人間 – human (ningen)
じんるい、人類 – humanity (jinrui)
Occupations
いしゃ、医者 – doctor (isha)
かんごし、看護師 – nurse (kangoshi)
My goal is to get the translations grouped in order, so that in the end I have a list like this:
[{"People":["にんげん、人間 – human (ningen)", "じんるい、人類 – humanity (jinrui)"]},{"Occupations":["いしゃ、医者 – doctor (isha)", "かんごし、看護師 – nurse (kangoshi)"]}]
I don't understand why I'm getting the out-of-range error. It occurs when I set "count = line+1". When I remove the "+ 1", the program runs. The error is raised when execution reaches the line "if "(" in edited_lines[count]:".
with open("data.txt", "r", encoding="UTF-8") as file:
    raw_data = file.readlines()

edited_lines = [line.replace("\n", "") for line in raw_data]
finished_list = []
value_list = []
not_end = True

for line in range(0,len(edited_lines)-1):
    value_list.clear()
    if "(" not in edited_lines[line]:
        count = line+1
        while not_end:
            if "(" in edited_lines[count]:
                value_list.append(edited_lines[count])
                count += 1
            else:
                not_end = False
        new_dict = {edited_lines[line]:value_list}
        finished_list.append(new_dict)
        not_end = True

print(finished_list)
Here is the error:
Traceback (most recent call last):
  File "C:/Users/Celvin/PycharmProjects/extractor/main.py", line 16, in <module>
    if "(" in edited_lines[count]:
IndexError: list index out of range
I thought I might be running past the last index when I hit the end of the list, so I edited my code to try to avoid the error:
with open("data.txt", "r", encoding="UTF-8") as file:
    raw_data = file.readlines()

edited_lines = [line.replace("\n", "") for line in raw_data]
finished_list = []
value_list = []
not_end = True

for line in range(0,len(edited_lines)-1):
    value_list.clear()
    if "(" not in edited_lines[line]:
        count = line+1
        if count >= len(edited_lines): #!!!! added this if statement !!!!
            not_end = False
        while not_end:
            if "(" in edited_lines[count]:
                value_list.append(edited_lines[count])
                count += 1
            else:
                not_end = False
        new_dict = {edited_lines[line]:value_list}
        finished_list.append(new_dict)
        not_end = True

print(finished_list)
I have introduced an error somewhere and don't understand it. I tried several other things too, but without success so far.
When I change "count = line+1" to "count = line", I already get some output in the shape I need: [{'People': []}, {'Occupations': []}]
The code could contain further errors. Please don't solve those for me. I'm really trying to improve and learn from my mistakes. I only post here when I don't understand something and have already tried several things. I'm still a beginner. :)
Try the following:
with open("data.txt", "r", encoding="UTF-8") as file:
    raw_data = file.readlines()

edited_lines = [line.replace("\n", "") for line in raw_data]
finished_list = []
not_end = True

for line in range(0,len(edited_lines)-1):
    value_list = []
    if "(" not in edited_lines[line]:
        count = line+1
        while not_end and count < len(edited_lines):
            if "(" in edited_lines[count]:
                value_list.append(edited_lines[count])
                count += 1
            else:
                not_end = False
        new_dict = {edited_lines[line]: value_list}
        finished_list.append(new_dict)
        not_end = True

print(finished_list)
There were two issues in your code:
1. count could be incremented to a value equal to or greater than the length of edited_lines, and the following edited_lines[count] then raises the index-out-of-range error. So I added the check count < len(edited_lines) to the while condition to stop this from happening.
2. value_list was declared as a global variable. In the line new_dict = {edited_lines[line]: value_list}, the values in value_list are not copied into the dictionary; only a reference to the list is stored. When you later clear value_list, the object inside new_dict still points to the same list, so the values in the dictionary are cleared as well. So I declare value_list inside the loop, so that a new list is created on each iteration.
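To see the second issue in isolation, here is a minimal sketch that does not touch your file at all. The names shared, fresh, result, fresh_result and the sample data are made up for illustration and are not part of your program; the point is only that a dictionary stores a reference to a list rather than a copy of it.

```python
# Hypothetical sample data, standing in for the headings and their entries.
groups = [("People", ["human"]), ("Occupations", ["doctor"])]

# Variant 1: one list object reused for every heading (like the global value_list).
shared = []
result = []
for key, items in groups:
    shared.clear()                 # also empties the list already stored in earlier dicts
    shared.extend(items)
    result.append({key: shared})   # stores a reference to shared, not a copy

print(result)        # [{'People': ['doctor']}, {'Occupations': ['doctor']}]

# Variant 2: a new list object per heading (like value_list = [] inside the loop).
fresh_result = []
for key, items in groups:
    fresh = []                     # brand-new list on each iteration
    fresh.extend(items)
    fresh_result.append({key: fresh})

print(fresh_result)  # [{'People': ['human']}, {'Occupations': ['doctor']}]
```

In the first variant every dictionary ends up showing whatever the single shared list holds last, which is why your output looked emptied or overwritten. Storing a copy, for example list(shared), would be another way to avoid this.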