python python-regex

How to read only the last 24 hours from a multi-dated log file and grep for a specific pattern in Python


I am reading a file from the path "F:\\RBS\\python\\Nobackupsimage.20230123".

Nobackupsimage log files are written on different days, as listed below. I want to open the file that was created in the last 24 hours and look for a specific pattern, as in the code below:

Nobackupsimage.20230123
Nobackupsimage.20230122
Nobackupsimage.20230120
Nobackupsimage.20230121

My script output:

05/01/2023 05:38:46 Unix OS backup of chsvm121626 (bkp.ind.inf.os.unv.ch4..chsvm121626.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126403 (bkp.ind.inf.os.unv.ch4..delvm126403.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126404 (bkp.ind.inf.os.unv.ch4..delvm126404.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126410 (bkp.ind.inf.os.unv.ch4..delvm126410.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126417 (bkp.ind.inf.os.unv.ch4..delvm126417.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126422 (bkp.ind.inf.os.unv.ch4..delvm126422.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126498 (bkp.ind.inf.os.unv.ch4..delvm126498.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126501 (bkp.ind.inf.os.unv.ch4..delvm126501.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126502 (bkp.ind.inf.os.unv.ch4..delvm126502.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126507 (bkp.ind.inf.os.unv.ch4..delvm126507.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126508 (bkp.ind.inf.os.unv.ch4..delvm126508.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126510 (bkp.ind.inf.os.unv.ch4..delvm126510.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126592 (bkp.ind.inf.os.unv.ch4..delvm126592.xp2) succeeded

I have 2 questions:

  1. How do I read only a file that has changed in the last 24 hours (or the last 8 hours) and look for a specific pattern in it?

From the output below:

05/01/2023 05:38:46 Unix OS backup of delvm126417 (bkp.ind.inf.os.unv.ch4..delvm126417.xp2)
05/01/2023 05:38:46 Unix OS backup of delvm126410 (bkp.ind.inf.os.unv.ch4..delvm126410.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126417 (bkp.ind.inf.os.unv.ch4..delvm126417.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126422 (bkp.ind.inf.os.unv.ch4..delvm126422.xp2) succeeded
05/01/2023 05:38:46 Unix OS backup of delvm126498 (bkp.ind.inf.os.unv.ch4..delvm126498.xp2) succeeded 
  2. I want a regular expression that prints only the server name and the job name, like this:
delvm126417 bkp.ind.inf.os.unv.ch4..delvm126417.xp2
delvm126410 bkp.ind.inf.os.unv.ch4..delvm126410.xp2
delvm126417 bkp.ind.inf.os.unv.ch4..delvm126417.xp2
delvm126422 bkp.ind.inf.os.unv.ch4..delvm126422.xp2
delvm126498 bkp.ind.inf.os.unv.ch4..delvm126498.xp2

My code:

import re

errors = []
# Escape the dots so they match a literal "." rather than any character.
pattern = re.compile(r"bkp\.ind\.inf\.", re.IGNORECASE)

with open('F:\\RBS\\python\\Nobackupsimage.20230123.logtxt', 'r') as myfile:
    for line in myfile:
        if pattern.search(line) is not None:
            errors.append(line.rstrip('\n'))

print("Below are the total Number of clients having issues, please check and fix:", len(errors))

for err in errors:
    print(err)

Solution

  • The log file appears to store each record as a pair of lines (a dated line ending in the server name, then a line starting with the job name in parentheses) followed by a blank line. On that basis you could do this:

    from datetime import datetime, timedelta
    
    FILENAME = 'F:\\RBS\\python\\Nobackupsimage.20230123.logtxt'
    
    def pdate(s):
        # Parse the leading 19-character timestamp; return None if the line has none.
        try:
            return datetime.strptime(s[:19], '%d/%m/%Y %H:%M:%S')
        except ValueError:
            return None
    
    start = datetime.today() - timedelta(days=1) # 24hrs prior to current date/time
    
    with open(FILENAME) as log:
        while (line := next(log, None)) is not None:
            if (_date := pdate(line)) and _date >= start:
                server = line.split()[-1]
                if job := next(log, None):
                    try:
                        detail, *_ = job.split()
                        print(server, detail[1:-1])
                    except ValueError:
                        pass # unexpected format of line following a line with a relevant date
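
  • If each record instead sits on a single line, as in the output pasted in the question, a regular expression can pull out the server and job names directly. The following is only a sketch under that assumption: `LINE_RE`, `recent_files` and `extract` are hypothetical names, the day/month/year order is taken from the code above, and "changed in the last 24 hours" is interpreted as the file's modification time.

```python
import re
import time
from datetime import datetime
from pathlib import Path

# Assumed single-line format (taken from the question's sample output):
#   "<dd/mm/yyyy HH:MM:SS> ... backup of <server> (<job>) ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r".*?backup of (?P<server>\S+)\s+\((?P<job>[^)]+)\)"
)


def recent_files(directory, prefix="Nobackupsimage.", hours=24):
    """Yield files named prefix* that were modified within the last `hours`."""
    cutoff = time.time() - hours * 3600
    for path in sorted(Path(directory).glob(prefix + "*")):
        if path.is_file() and path.stat().st_mtime >= cutoff:
            yield path


def extract(lines, since=None):
    """Yield (server, job) pairs; skip entries stamped before `since` if given."""
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # blank line or a line in some other format
        stamp = datetime.strptime(m["stamp"], "%d/%m/%Y %H:%M:%S")
        if since is not None and stamp < since:
            continue
        yield m["server"], m["job"]
```

    To use it, loop over `recent_files(r"F:\RBS\python")`, open each path, and print the pairs produced by `extract(log, since=datetime.now() - timedelta(hours=24))` (with `timedelta` imported from `datetime`). Pass `hours=8` and a matching `timedelta` for the 8-hour case.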