I have the following regular expression and I am using https://www.regextester.com/ to test it.
^(?=^.{10})[a-zA-Z]+:[0-9]+\s*
The requirement is that the input consists of alphabetic characters and digits separated by a colon, optionally followed by trailing whitespace. The input must start with the alphabetic characters, but it may contain superfluous characters after the last digit or the trailing whitespace, and I don't want to match anything past the 10th character. The matched string must be exactly 10 characters long. In the following example strings, the comments note what I expected to match. I am not anchoring with a $ at the end because I know the input string will likely have more than 10 characters, so I am not trying to check that the entire string matches.
A:12345678 // matches which is fine
A:123456789 // Should only match up to the 8
FOO:567890s123 // should only match up to the 0
The actual result is that it is matching everything after the 10th character too so long as it is an alphanumeric or whitespace. I expect it to match up to the 10th character and nothing more. How do I fix this expression?
Update: I will eventually try to incorporate this regex into a C++ program, using a Boost regex to match.
If supported, you can use a lookbehind with a finite quantifier asserting 10 chars to the left at the end of the pattern:
^[A-Za-z]+:[0-9]+(?<=^.{10})
The pattern matches:
^
Start of string
[A-Za-z]+:[0-9]+
Match 1+ chars A-Za-z followed by : and 1+ digits
(?<=^.{10})
Positive lookbehind, assert that from the current position there are 10 characters to the left

If you want to match trailing whitespace chars:
^[A-Za-z]+:[0-9]+\s*(?<=^.{10})
Note that \s can also match a newline.
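If a lookbehind is not available (the default ECMAScript grammar of std::regex, for example, does not have one), a roughly equivalent check can be done without it: take the first 10 characters and require that they match the pattern in full. The following is only a minimal sketch of that alternative, not the lookbehind approach itself. The helper name match_prefix and its width parameter are made up for illustration, and it uses std::regex rather than Boost so that it compiles with the standard library alone.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of any library): returns the first `width`
// characters of `input` if they consist of letters, a colon, digits and
// optional trailing whitespace. Returns an empty string otherwise.
std::string match_prefix(const std::string& input, std::size_t width = 10) {
    // Fewer than `width` characters can never give a match of that length.
    if (input.size() < width) {
        return std::string();
    }

    // Same pattern as in the answer, minus the anchor and the lookbehind.
    // std::regex_match requires the whole prefix to match, which plays the
    // role of both ^ and the "exactly 10 characters" assertion.
    static const std::regex pattern(R"([A-Za-z]+:[0-9]+\s*)");

    const std::string prefix = input.substr(0, width);
    if (std::regex_match(prefix, pattern)) {
        return prefix;
    }
    return std::string();
}
```

With this sketch, match_prefix("A:123456789") returns "A:12345678" and match_prefix("FOO:567890s123") returns "FOO:567890", while an input shorter than 10 characters, or one whose first 10 characters do not fit the pattern, gives an empty string.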