python, indexoutofrangeexception

Python raises "list index out of range" at an if statement


I am trying to extract information from this list (saved in a text file):
People
にんげん、人間 – human (ningen)
じんるい、人類 – humanity (jinrui)
Occupations
いしゃ、医者 – doctor (isha)
かんごし、看護師 – nurse (kangoshi)

My goal is to get the translations grouped in order, so that in the end I have a list like this:

[{"People":["にんげん、人間 – human (ningen)", "じんるい、人類 – humanity (jinrui)"]},{"Occupations":["いしゃ、医者 – doctor (isha)", "かんごし、看護師 – nurse (kangoshi)"]}]

I don't understand why I'm getting the out-of-range error. I get the error when I set "count = line+1". When I remove the "+ 1", the program runs. The error occurs when the program reaches "if "(" in edited_lines[count]:".

with open("data.txt", "r", encoding="UTF-8") as file:
    raw_data = file.readlines()

edited_lines = [line.replace("\n", "") for line in raw_data]

finished_list = []
value_list = []
not_end = True


for line in range(0,len(edited_lines)-1):
    value_list.clear()
    if "(" not in edited_lines[line]:
        count = line+1
        while not_end:
            if "(" in edited_lines[count]:
                value_list.append(edited_lines[count])
                count += 1
            else:
                not_end = False
        new_dict = {edited_lines[line]:value_list}
        finished_list.append(new_dict)
        not_end = True
print(finished_list)

Here is the error:

Traceback (most recent call last):
  File "C:/Users/Celvin/PycharmProjects/extractor/main.py", line 16, in <module>
    if "(" in edited_lines[count]:
IndexError: list index out of range

I thought I might be running past the last index when I hit the end of the list, so I edited my code to try to avoid the error:

with open("data.txt", "r", encoding="UTF-8") as file:
    raw_data = file.readlines()

edited_lines = [line.replace("\n", "") for line in raw_data]

finished_list = []
value_list = []
not_end = True


for line in range(0,len(edited_lines)-1):
    value_list.clear()
    if "(" not in edited_lines[line]:
        count = line+1
        if count >= len(edited_lines): #!!!! added this if statement !!!!
            not_end = False
        while not_end:
            if "(" in edited_lines[count]:
                value_list.append(edited_lines[count])
                count += 1
            else:
                not_end = False
        new_dict = {edited_lines[line]:value_list}
        finished_list.append(new_dict)
        not_end = True
print(finished_list)

I still get an error and don't understand why. I tried a few different things too, but haven't had any success yet.

When I change "count = line+1" to "count = line", I already get some output close to what I need: [{'People': []}, {'Occupations': []}]

The code could contain further errors. Please don't solve those for me. I am really trying to improve and learn from my mistakes. I only post here when I don't understand something and have already tried several things. I'm still a beginner. :)


Solution

  • Try the following:

    with open("data.txt", "r", encoding="UTF-8") as file:
        raw_data = file.readlines()
    
    edited_lines = [line.replace("\n", "") for line in raw_data]
    
    finished_list = []
    
    not_end = True
    
    
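    for line in range(0,len(edited_lines)-1):
        value_list = []  # new list on every iteration, so no two dictionaries share one
        if "(" not in edited_lines[line]:  # a line without "(" is treated as a header
            count = line+1
            # check the bound before edited_lines[count] is indexed
            while not_end and count < len(edited_lines):
                if "(" in edited_lines[count]:
                    value_list.append(edited_lines[count])
                    count += 1
                else:
                    not_end = False
            new_dict = {edited_lines[line]: value_list}
            finished_list.append(new_dict)
            not_end = True  # reset the flag for the next header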
            
    print(finished_list)
    

    There were two issues in your code:

    1. While collecting the entries under the last header, count is incremented until it equals len(edited_lines). The next check, edited_lines[count], then raises IndexError: list index out of range. Your added if statement tests count only once, before the while loop starts, so it does not catch this. I put the check count < len(edited_lines) into the while condition, so it is tested before every index access.
    2. You created value_list once, outside the loop. In the line new_dict = {edited_lines[line]: value_list}, the values in value_list are not copied into the dictionary; the dictionary stores a reference to the same list object. When you later call value_list.clear(), the list inside new_dict is emptied as well, because it is that same object. So I create value_list inside the loop, which gives each iteration a new list.
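Both points can be seen in isolation with a small sketch. It is not part of the code above: the names shared, groups, fresh_groups, sample and collected are made up for illustration, and the data is typed in directly instead of being read from data.txt.

```python
# Illustrative data only (taken from the list in the question).
sample = [
    ("People", ["にんげん、人間 – human (ningen)"]),
    ("Occupations", ["いしゃ、医者 – doctor (isha)"]),
]

# Issue 2: one list object reused for every dictionary.
shared = []
groups = []
for header, items in sample:
    shared.clear()                    # also empties the list already stored in groups
    shared.extend(items)
    groups.append({header: shared})   # stores a reference to shared, not a copy

# Both dictionaries hold the very same list object.
same_object = groups[0]["People"] is groups[1]["Occupations"]

# Creating a new list per iteration keeps the groups independent.
fresh_groups = []
for header, items in sample:
    fresh = []                        # new list object on every pass
    fresh.extend(items)
    fresh_groups.append({header: fresh})

# Issue 1: test the bound before indexing, on every iteration.
lines = ["Occupations", "いしゃ、医者 – doctor (isha)", "かんごし、看護師 – nurse (kangoshi)"]
count = 1
collected = []
while count < len(lines) and "(" in lines[count]:
    collected.append(lines[count])
    count += 1                        # ends at len(lines) without indexing past the end

print(same_object)
print(groups[0]["People"] == groups[1]["Occupations"])
print(fresh_groups[0]["People"] != fresh_groups[1]["Occupations"])
print(count == len(lines))
```

Because `and` short-circuits, lines[count] is only evaluated when count < len(lines) is true, which is why the order of the two conditions in the while statement matters.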