My Python code is below:
# Loading libraries
import re
import pandas as pd
import numpy as np
import datetime

# Creating an empty dataframe
columns = ['A']
df_ = pd.DataFrame(columns=columns)
df_ = df_.fillna(0)

# Reading the data line by line
with open('serverLogs.log-2020-04-30-01') as f:
    lines = f.readlines()
    # print(lines)
    for line in lines:
        parts = line.split('OD_MAKER_DATE=')
        df_ = df_.append(parts)
I have many text files whose names differ only in the numeric suffix, which ranges from 01 to 100, i.e. 'serverLogs.log-2020-04-30-01', 'serverLogs.log-2020-04-30-02' ... 'serverLogs.log-2020-04-30-100'.
How can I add a for loop at the beginning of my existing code that goes through all 100 files and appends their lines to the dataframe df_, instead of loading one file at a time? I am not very familiar with Python.
I'm not sure this is the most efficient way overall to read the files in a loop, but as I understand it, the names of the first 9 files need a leading 0. This code should generate the required names:
file_count = 100  # can change it to any value
base_name = 'serverLogs.log-2020-04-30-{}'
for i in range(file_count):
    # "%.2d" pads to at least two digits: 1 -> "01", 100 -> "100"
    file_name = base_name.format("%.2d" % (i+1))
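As a quick check of the padding, the same names can also be built with an f-string format spec: `02d` pads to at least two digits, so 1 becomes `01` while 100 stays `100`. This is only a minimal sketch, and the helper name `log_file_names` is made up for illustration:

```python
def log_file_names(file_count=100, base_name='serverLogs.log-2020-04-30-{}'):
    """Return the file names for suffixes 01..file_count (hypothetical helper)."""
    # f'{i:02d}' adds a leading zero to single-digit numbers and leaves
    # longer numbers such as 100 unchanged.
    return [base_name.format(f'{i:02d}') for i in range(1, file_count + 1)]


names = log_file_names()
print(names[0])    # serverLogs.log-2020-04-30-01
print(names[-1])   # serverLogs.log-2020-04-30-100
```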
Then, inside that loop, you can read each file and append its lines the same way you are doing right now:
    # (inside the for loop above)
    # Reading the data line by line
    with open(file_name) as f:
        lines = f.readlines()
        for line in lines:
            parts = line.split('OD_MAKER_DATE=')
            df_ = df_.append(parts)
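Putting both pieces together: note that `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so on a recent pandas one option is to collect the split lines in a plain list and build the dataframe once at the end, which also avoids copying the frame on every line. The sketch below is only one possible way to do it. The function name `load_logs`, its parameters, the choice to skip missing files, and the one-row-per-line layout are my assumptions, not something from the question.

```python
from pathlib import Path

import pandas as pd


def load_logs(directory='.', file_count=100,
              base_name='serverLogs.log-2020-04-30-{:02d}',
              marker='OD_MAKER_DATE='):
    """Read every log file and return one dataframe row per line.

    Hypothetical helper: the name, the parameters, and the choice to
    skip missing files are assumptions made for illustration.
    """
    rows = []
    for i in range(1, file_count + 1):
        path = Path(directory) / base_name.format(i)
        if not path.exists():
            continue  # assumption: silently skip files that are absent
        with open(path) as f:
            for line in f:
                # Each row holds the pieces of the line around the marker.
                rows.append(line.rstrip('\n').split(marker))
    # Build the dataframe once instead of appending line by line.
    return pd.DataFrame(rows)


df_ = load_logs()
print(df_.shape)
```

Lines that contain the marker once end up with two columns (the text before and after `OD_MAKER_DATE=`); lines with fewer pieces are padded with missing values.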