python · pandas · log-analysis

Looping through 100 text files in Python


My Python code is below:

#Loading libraries
import re
import pandas as pd
import numpy as np
import datetime

#Creating an empty dataframe
columns = ['A']
df_ = pd.DataFrame(columns=columns)
df_ = df_.fillna(0)

#Reading the data line by line
with open('serverLogs.log-2020-04-30-01') as f:
    lines = f.readlines()
    #print(lines)
    for line in lines:
        parts  = line.split('OD_MAKER_DATE=') 
        df_ = df_.append(parts)

I have many text files whose names differ only in the final number, which ranges from 01 to 100, i.e. 'serverLogs.log-2020-04-30-01', 'serverLogs.log-2020-04-30-02', ..., 'serverLogs.log-2020-04-30-100'.

How can I put a for loop around my existing code so that it loops through all 100 files and appends their lines to the dataframe df_, instead of loading one file at a time? I am not very familiar with Python.


Solution

  • Not sure if this is the most efficient way to read the files in a loop, but from what I understand, the first nine file names need a leading zero. This code generates the required names:

    file_count = 100  # can change it to any value
    base_name = 'serverLogs.log-2020-04-30-{:02d}'
    
    for i in range(1, file_count + 1):
        file_name = base_name.format(i)
    

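    As an aside, if all the files already exist on disk, you could let `glob` discover them instead of generating the names yourself. A sketch, assuming the files sit in the current working directory:

```python
import glob

# Discover the log rotations on disk instead of generating names.
# sorted() gives a stable order, but note it is lexicographic:
# '...-100' sorts before '...-11'.
file_names = sorted(glob.glob('serverLogs.log-2020-04-30-*'))
```

    This also copes gracefully with a missing rotation, since only files that actually exist are returned.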
    Then, read the data inside that same loop and append it the way you are doing now. The `with` block must be indented under the `for`, otherwise only the last file name is ever opened:

        # this block goes inside the for loop above
        #Reading the data line by line
        with open(file_name) as f:
            for line in f:
                parts = line.split('OD_MAKER_DATE=')
                df_ = df_.append(parts)