I have a series of log files in text file format. The document format is this:
[2021-12-11T10:21:30.370Z] Branch indexing
[2021-12-11T10:21:30.374Z] Starting the program with default pipeID
[2021-12-11T10:21:30.374Z] Running with durable level: max_survivbility will make this program crash if left running for 20 minutes
[2021-12-11T10:21:30.374Z] Starting the program with default pipeID
Each line in the document starts with:[2021-12-11T10:21:30.370Z]
I want to remove the first set of characters that represent date and timestamp and have a result something like this:
Branch indexing
Starting the program with default pipeID
Running with durable level: max_survivbility will make this program crash if left running for 20 minutes
Starting the program with default pipeID
Can anyone please help me explain how I can do this?
I tried to use this method but it doesn't work since I have '[]' in the date stamp.
import re
text = "[2021-12-11T10:21:30.370Z] Branch indexing"
re.sub("[.*?]", "", text)
This doesn't work for me.
If I try the same method on a text like text = "<2021-12-11T10:21:30.370Z> Branch indexing".
import re
text = "<2021-12-11T10:21:30.370Z> Branch indexing"
re.sub("<.*?>", "", text)
It removes <2021-12-11T10:21:30.370Z>. Why does this not work with [2021-12-11T10:21:30.370Z]?
I need help removing every instance of this format "[2021-12-11T10:21:30.370Z]" in all the log files.
Thank you so much.
I'd rather go with a simple solution for this case, pal. Split the string where the ] ends, then trim the second element of the resulting list, to remove all those extra spaces and then print it, bud. Hope this helps, cheers!
import re
text = "[2021-12-11T10:21:30.370Z] Branch indexing"
print(re.split("]", text)[1].strip())