Tags: python, python-3.x, list, slice, enumerate

Python: loop through a list with enumerate, starting and ending at specific indexes, without altering the index


I have a text file loaded as a list, where each line is one item. The list contains multiple start tags and end tags (but is otherwise unstructured), and I must iterate through it, processing the data between each start and end tag. Because the file may contain errors, I must ignore a section if any data between its start tag and end tag is missing.

To do this, I first gather a list of valid start indexes and a list of valid end indexes, ensuring there are the same number of each. Then I must iterate over those slices, check whether any data is missing between them, and discard that start and end index if so. The problem is that, because of later processing, I need to retain the actual index of each line, so I can't easily use slices, and so far I haven't found a good way to set a start and end location in an enumerated for loop.

So assume the indexes of the lines in my list are: start = [1,32,60,90] and end = [29,59,65,125]

So I now need to process filelist[1:29], filelist[32:59], etc., but this won't work: inside the for loop, enumerating a slice renumbers the data, so that line 32 becomes line 0. I can't have that, because I need to store additional indexes found while processing that data for another part of my program. Yes, I could account for the offset, but that's annoying and hurts readability, and there has to be a way to do this in Python. It would be super simple in C:

saved_index = []
for index in range(len(start)):
    for i, l in enumerate(filelist[start[index]:end[index]]):
        if "blah" in l:
            saved_index.append(i)  # this won't work: i is the index within the subset, not the original list

So how can I iterate over only lines 1 to 29, then 32 to 59, and so on, keeping the line index of filelist rather than the index within a subset?


Solution

  • Don't slice; just iterate over the indexes, as you would in C.

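    saved_index = []
    for index in range(len(start)):  # range() needs an int, so use len(start)
        # end[index] + 1 includes the end line itself; drop the + 1 to exclude it
        for i in range(start[index], end[index] + 1):
            if "blah" in filelist[i]:
                saved_index.append(i)  # i is already the index in filelist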
    

    Slices would also work, because you know the offset: add start[index] back to the position within the slice.

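    saved_index = []
    for index in range(len(start)):  # range() needs an int, so use len(start)
        # note: this slice stops before end[index]; use end[index] + 1 to include that line
        for i, l in enumerate(filelist[start[index]:end[index]]):
            if "blah" in l:
                saved_index.append(start[index] + i)  # offset + position in slice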
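
As a variation on the slice approach, enumerate accepts an optional start argument, so the counter can begin at the slice's offset and no manual addition is needed. The sketch below is only illustrative: the contents of filelist are made-up sample data, and the slice excludes the end line, as in the question's filelist[1:29].

```python
# Hypothetical sample data standing in for the real file contents.
filelist = ["line %d" % n for n in range(130)]
filelist[5] = "blah at 5"
filelist[40] = "blah at 40"
filelist[59] = "blah at 59"   # sits exactly on an end index
filelist[70] = "blah at 70"   # outside every start/end range

start = [1, 32, 60, 90]
end = [29, 59, 65, 125]

saved_index = []
# zip pairs each start index with its matching end index.
for s, e in zip(start, end):
    # enumerate(..., start=s) begins counting at s, so i is the index in filelist.
    # The slice filelist[s:e] stops before line e.
    for i, line in enumerate(filelist[s:e], start=s):
        if "blah" in line:
            saved_index.append(i)

print(saved_index)  # [5, 40]
```

Line 59 is skipped here because the slice stops before the end index; use filelist[s:e + 1] if the end line should be processed too. If copying each slice is a concern for a large file, itertools.islice(filelist, s, e) could probably be substituted for the slice.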