python regex currency

Can somebody explain a money regex that just checks if the value matches some pattern?


There are multiple posts on here that capture a value, but I just want to check whether the value matches a pattern. More generally, I'm looking to understand the difference between checking a value and "capturing" a value. Here is a post that explains some of a money regex, but I don't understand it at all.

In the current case, the acceptable money formats are:

.50
50
50.00
50.0
$5000.00
$.50

I don't want commas (people should know that's ridiculous).

The things I'm having trouble with are:

  1. Allowing for a $ at the start of the value (but still optional)
  2. Allowing for only 1 decimal point (but not allowing it at the end)
  3. Understanding how it works inside
  4. Also understanding how to get a normalized version (only digits and the optional decimal point) out of it that strips the dollar sign.

My current regex (which obviously doesn't work right) is:

# I'm checking the Boolean of the following:
re.compile(r'^[\$][\d\.]$').search(value)

(Note: I'm working in Python)


Solution

  • Assuming you want to allow $5. but not 5., the following will accept your language:

    import re

    money = re.compile('|'.join([
      r'^\$?(\d*\.\d{1,2})$',  # e.g., $.50, .50, $1.50, $.5, .5
      r'^\$?(\d+)$',           # e.g., $500, $5, 500, 5
      r'^\$(\d+\.?)$',         # e.g., $5.
    ]))
    

    Important pieces to understand:

    The parenthesized subpatterns are capture groups: all text in the input matched by the subexpression in a capture group will be available in matchobj.group(index). The dollar sign won't be captured because it's outside the parentheses.
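
To make the checking-versus-capturing distinction concrete, here is a minimal sketch. It uses a deliberately simplified pattern, and the names `simple_money`, `is_money`, and `amount_of` are made up for illustration; it is not the full pattern from this answer. Checking only asks whether a match object came back, while capturing reads the text matched inside the parentheses.

```python
import re

# Simplified, illustrative pattern: an optional "$", then digits with an
# optional one- or two-digit fraction. Not the full pattern from the answer.
simple_money = re.compile(r'^\$?(\d+(?:\.\d{1,2})?)$')

def is_money(value):
    """Check only: True if the whole string matches the pattern."""
    return simple_money.match(value) is not None

def amount_of(value):
    """Capture: return the text inside the parentheses, or None."""
    m = simple_money.match(value)
    return m.group(1) if m else None

print(is_money('$50.00'))   # True
print(amount_of('$50.00'))  # 50.00 (the "$" sits outside the group)
print(amount_of('50,00'))   # None
```

Note that the inner `(?:...)` group is non-capturing, so `group(1)` is the whole amount rather than just the fraction.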

    Because Python doesn't support multiple capture groups with the same name (!!!) we must search through matchobj.groups() for the one that isn't None. This also means you have to be careful when modifying the pattern to use (?:...) for every group except the amount.
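
As a possible alternative to scanning `matchobj.groups()`, the match object's `lastindex` attribute holds the index of the last capture group that took part in the match. Since only one alternative of the pattern can succeed, that index should point at the amount. This is a sketch under that assumption: the helper name `normalize` is hypothetical, and it relies on each alternative containing exactly one capture group.

```python
import re

# Same three alternatives as the pattern in this answer.
money = re.compile('|'.join([
    r'^\$?(\d*\.\d{1,2})$',
    r'^\$?(\d+)$',
    r'^\$(\d+\.?)$',
]))

def normalize(value):
    """Return the amount without the dollar sign, or None if invalid."""
    m = money.match(value)
    if m is None:
        return None
    # lastindex is the number of the one group that matched.
    return m.group(m.lastindex)

for s in ['$.50', '50', '$5000.00', '$5.', '5.', '$5,00']:
    print(repr(s), '->', normalize(s))
```

If a second capture group is ever added inside one alternative, `lastindex` may no longer point at the amount, which is the same reason to keep helper groups as `(?:...)`.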

    Tweaking Mark's nice test harness, we get

    # tests is the list of (input, expected) pairs from Mark's harness
    for test, expected in tests:
        result = money.match(test)
        is_match = result is not None
        if is_match == expected:
            status = 'OK'
            if result:
                # only one alternative matched, so only one group is not None
                amt = [x for x in result.groups() if x is not None].pop()
                status += ' (%s)' % amt
        else:
            status = 'Fail'
        print(test + '\t' + status)
    

    Output:

    .50     OK (.50)
    50      OK (50)
    50.00   OK (50.00)
    50.0    OK (50.0)
    $5000   OK (5000)
    $.50    OK (.50)
    $5.     OK (5.)
    5.      OK
    $5.000  OK
    5000$   OK
    $5.00$  OK
    $-5.00  OK
    $5,00   OK
            OK
    $       OK
    .       OK
    .5      OK (.5)