python nlp log-analysis

Extracting the StatusDescription from a text file using Python


I have a sample text file. I want to extract the StatusDescription from each line, and if it is not available, I want it to return Null, i.e.:

Line1 StatusDescription=Null

Line2 StatusDescription=Success

The sample text file:

[23-Oct-2019] [12:14:49:150] [[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'] [22368936] [172.30.26.90][c84283f4-5a3d-4559-b8d1-6ae2bdfc6075][com.intellectdesign.iportal.as.integrator.host.GenericCommunicator][EXIT] {Leaving the sendToHostEx method...}

[23-Oct-2019] [12:14:49:150] [[ACTIVE] ExecuteThread: '7' for queue: 'weblogic.kernel.Default (self-tuning)'] [22368936] [172.30.26.90][c84283f4-5a3d-4559-b8d1-6ae2bdfc6075][com.intellectdesign.digitalface.formatter.CoopCardSummmaryFormatter][ERROR] {hdr_Tran_Id=COOP_CARD_DETAILS~*hdr_Ref_No=1~*res_Status=00000~*CorrelationID=AAAAAD7B5619~*MessageID=AAAAAD7B5619~*StatusCode=S_001~*StatusDescription=Success~*StatusDescriptionKey=en-US}


Solution

  • This should work in your case:

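    import re
    
    # Capture the value after "StatusDescription=" up to the next "~" or "}".
    # A greedy (.*)~ would run to the LAST "~" on the line, which over-captures
    # whenever more than one field follows StatusDescription.
    STATUS_RE = re.compile(r'StatusDescription=([^~}]*)')
    
    def find_substring(line):
        """Return the StatusDescription value, or "Null" if the line has none."""
        result = STATUS_RE.search(line)
        # re.search returns None when nothing matches, so test for it
        # explicitly instead of hiding every error behind a bare except.
        return result.group(1) if result else "Null"
    
    with open('text.txt') as f:
        # Iterate over the file directly; there is no need for readlines().
        for line in f:
            status_description = find_substring(line)
            print(status_description)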
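
If you also want the output in the `LineN StatusDescription=...` form shown in the question, something along these lines might do it. This is only a sketch: `label_lines` and `STATUS_RE` are names made up for this example, the sample strings are shortened versions of the two log lines from the question, and it assumes that blank lines should not be counted and that a value never contains `~` or `}`.

```python
import re

# Match the value after "StatusDescription=" up to the next "~" or "}".
# Assumption: the value itself never contains "~" or "}".
STATUS_RE = re.compile(r"StatusDescription=([^~}]*)")


def label_lines(lines):
    """Return one 'LineN StatusDescription=<value>' string per non-blank line."""
    labelled = []
    number = 0
    for line in lines:
        if not line.strip():
            continue  # assumption: blank separator lines are not counted
        number += 1
        match = STATUS_RE.search(line)
        value = match.group(1) if match else "Null"
        labelled.append(f"Line{number} StatusDescription={value}")
    return labelled


# Shortened versions of the two sample log lines from the question.
sample = [
    "[23-Oct-2019] [12:14:49:150] [EXIT] {Leaving the sendToHostEx method...}\n",
    "\n",
    "[23-Oct-2019] [12:14:49:150] [ERROR] "
    "{StatusCode=S_001~*StatusDescription=Success~*StatusDescriptionKey=en-US}\n",
]

for entry in label_lines(sample):
    print(entry)
```

To run it against the real file, pass the open file object instead of the list, e.g. `label_lines(open('text.txt'))`, since a file object yields its lines when iterated.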