I have the following string
str = '2024-09-23 18:05:08,147 INFO [WatchDog_191084] (alloc:0MB, cpu:0%) 10 422'
and I am trying to extract the number between the square brackets, so I am trying:
identifier_test = re.search('(?<=\[)\d+(?=])',str)
print(identifier_test)
I get None, but if I try
identifier_test = re.search('(?<=\[).+(?=])',str)
print(identifier_test.group())
it works as expected and returns WatchDog_191084. How do I get the numbers only?
In your first pattern, nothing matches the WatchDog_ part of the input string. The lookbehind expects to find a [ character immediately before the digits, but the digits are actually preceded by _, so the match fails. If your inputs will always have WatchDog_ in them, you can make that part of the lookbehind:
re.search(r'(?<=\[WatchDog_)\d+(?=])',str)
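For reference, here is that pattern as a small self-contained script. It is only a sketch: the variable name line is my own choice (used instead of str so the built-in isn't shadowed), and it assumes every input carries the literal WatchDog_ prefix.

```python
import re

# Sample log line from the question; `line` is a hypothetical name chosen
# here to avoid shadowing the built-in `str`.
line = '2024-09-23 18:05:08,147 INFO [WatchDog_191084] (alloc:0MB, cpu:0%) 10 422'

# The lookbehind covers the literal "[WatchDog_" prefix, which has a fixed
# width, so Python's re module accepts it.
match = re.search(r'(?<=\[WatchDog_)\d+(?=\])', line)

# re.search returns None when nothing matches, so guard before calling .group().
identifier = match.group() if match else None
print(identifier)  # 191084
```

The None check matters here because, as in the question, a failed search returns None and calling .group() on it would raise an AttributeError.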
If you want to accept any text there, things get a little trickier. Python's re module only supports fixed-width lookbehinds, so something like (?<=\[[^\]\d]*) isn't allowed. In that situation, using a pattern like your second one and extracting the digits with some post-processing would make the most sense.
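A rough sketch of that two-step approach might look like the following. The variable names are hypothetical, and the lazy .+? is a small tweak to your second pattern so that a later ] on the same line isn't swallowed. The last part shows a capture group as one alternative that avoids lookbehind entirely; it assumes the digits sit directly before the closing bracket.

```python
import re

# Sample log line from the question; `line` is a hypothetical name chosen
# here to avoid shadowing the built-in `str`.
line = '2024-09-23 18:05:08,147 INFO [WatchDog_191084] (alloc:0MB, cpu:0%) 10 422'

# Step 1: grab everything between the first pair of square brackets.
bracketed = re.search(r'(?<=\[).+?(?=\])', line)

# Step 2: post-process the bracketed text and pull out the first run of digits.
digits = re.search(r'\d+', bracketed.group()) if bracketed else None
identifier = digits.group() if digits else None
print(identifier)  # 191084

# Alternative (an assumption about what fits your data): match the brackets
# directly and capture only the digits, so no lookbehind is needed.
grouped = re.search(r'\[[^\]\d]*(\d+)\]', line)
identifier_via_group = grouped.group(1) if grouped else None
print(identifier_via_group)  # 191084
```

The capture-group version sidesteps the fixed-width restriction because the variable-length prefix is part of the match itself rather than a lookbehind; group(1) then returns just the digits.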