regex nsregularexpression

Regular expression to match numbers, excluding those ending with a trailing comma or period


I am trying to write a regular expression to match valid numbers that may include signs (+/-), commas for thousand separators, and decimal points, but I want to exclude numbers with trailing commas or periods.

Examples of valid numbers:

12 +12 -12.0 -12,345.5466 +12,345,678,678 0.154

Examples of invalid numbers (should not match): "500," and "500."

I want 500 to match as a valid number, but not "500," or "500."

The regular expression I have written is:

[-+]?((0|([1-9](\d*|\d{0,2}(,\d{3})*)))(\.\d*[0-9])?)(?!\S)

This regex works for most of the valid examples, but it also incorrectly matches invalid cases like "500," and "500." because it does not exclude numbers with a trailing comma or period.

How can I adjust my regular expression so that it still matches the valid numbers but excludes numbers with trailing commas or periods?


Solution

  • Assuming you want to match 500 in "500." and "500,", bear in mind that (?!\S) requires whitespace or the end of the string immediately to the right of the match, so the pattern fails whenever a comma or period follows the number.

    You may fix the problem with

    [-+]?(?:0|[1-9](?:\d{0,2}(?:,\d{3})*|\d*))(?:\.\d+)?(?!\d)
    

    Note that this can be further enhanced depending on which contexts you need to exclude.

    I replaced (?!\S) with (?!\d) at the end so that the match fails only if a digit, rather than any non-whitespace character, appears immediately to the right.
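The effect of the two lookaheads can be checked with a minimal sketch. It uses Python's re module purely for illustration (the question is tagged nsregularexpression, and the assumption here is that both engines treat these constructs the same way). The names ORIGINAL, FIXED, and first_match are hypothetical helpers and not part of the original answer.

```python
import re

# Pattern from the question: the trailing (?!\S) demands whitespace or end of string.
ORIGINAL = re.compile(
    r"[-+]?((0|([1-9](\d*|\d{0,2}(,\d{3})*)))(\.\d*[0-9])?)(?!\S)"
)
# Pattern from the answer: the trailing (?!\d) only forbids a following digit.
FIXED = re.compile(
    r"[-+]?(?:0|[1-9](?:\d{0,2}(?:,\d{3})*|\d*))(?:\.\d+)?(?!\d)"
)


def first_match(pattern, text):
    """Return the first substring matched by pattern in text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["500", "500,", "500.", "-12,345.5466"]:
        print(repr(sample), first_match(ORIGINAL, sample), first_match(FIXED, sample))
```

Under Python's re, the original pattern finds nothing in "500," or "500.", while the revised pattern returns 500 for both and leaves the trailing comma or period out of the match.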

    Note also that I removed unnecessary groups and converted all capturing groups to non-capturing.

    Also, pay attention to the (?:\d{0,2}(?:,\d{3})*|\d*) group, where I swapped the alternatives since the first one is more specific and should go first.
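Why the order of the alternatives matters once the pattern ends in (?!\d) can be sketched as follows. This is again Python's re used only as an illustration. SPECIFIC_FIRST is the pattern from the answer, while GENERIC_FIRST is a hypothetical variant that keeps the question's original order. The names and the helper first_match are assumptions made for this example.

```python
import re

# Answer's order: the comma-grouped alternative is tried before plain \d*.
SPECIFIC_FIRST = re.compile(
    r"[-+]?(?:0|[1-9](?:\d{0,2}(?:,\d{3})*|\d*))(?:\.\d+)?(?!\d)"
)
# Hypothetical variant with the question's order: plain \d* is tried first.
GENERIC_FIRST = re.compile(
    r"[-+]?(?:0|[1-9](?:\d*|\d{0,2}(?:,\d{3})*))(?:\.\d+)?(?!\d)"
)


def first_match(pattern, text):
    """Return the first substring matched by pattern in text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["12,345.5466", "+12,345,678,678", "12345"]:
        print(repr(sample),
              first_match(SPECIFIC_FIRST, sample),
              first_match(GENERIC_FIRST, sample))
```

With \d* first, the engine stops at 12 in "12,345.5466" because the comma that follows already satisfies (?!\d), so nothing forces it to backtrack into the thousands-separator branch. With the specific alternative first, the whole number is consumed, and plain digit runs such as 12345 are still matched through backtracking into \d*.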

    Details