
Arrow successfully parses some input not in default patterns


Given invalid input, arrow raises ParserError:

>>> arrow.get('abc')
ParserError: Could not match input to any of [u'YYYY-MM-DD', u'YYYY/MM/DD', u'YYYY.MM.DD', u'YYYY-MM', u'YYYY/MM', u'YYYY.MM', u'YYYY', u'YYYY', u'YYYY'] on 'abc'
>>> arrow.get('09-10-201')
ParserError: Could not match input to any of [u'YYYY-MM-DD', u'YYYY/MM/DD', u'YYYY.MM.DD', u'YYYY-MM', u'YYYY/MM', u'YYYY.MM', u'YYYY', u'YYYY', u'YYYY'] on '09-10-201'

The error message lists every pattern that arrow tried before raising the exception. Occasionally, however, input that does not fully match any of these patterns is silently converted to an Arrow object:

>>> arrow.get('09-10-2017')
<Arrow [2017-01-01T00:00:00+00:00]>  # Succeeds with incorrect date

Is this explained by additional hidden parse patterns provided by my system locale? If so, why would it parse 2017 and leave out 09 and 10? If not, why did the parsing succeed?


Solution

  • Arrow uses regular expressions to match an input string against a date format.

    For example:

    arrow.get('aaa2012-01-21aa')
    

    is accepted as input

    <Arrow [2012-01-21T00:00:00+00:00]>
    

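    because it matches the format YYYY-MM-DD, which is converted internally to a regex of the form '(?P<YYYY>\d{4})-(?P<MM>\d{2})-(?P<DD>\d{2})'. The regex is searched for anywhere in the string rather than anchored to its start and end, so the surrounding characters are ignored.

    In your input, only the YYYY pattern finds a match (2017). The rest of the string is discarded, and the missing month and day default to 01.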

    To raise an error on such input and force a specific format, the advice given by @asongtoruin is very good.
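The matching behaviour described above can be reproduced with the standard library alone. The following is a minimal sketch, not arrow's actual code: the two compiled patterns are hand-written stand-ins for the YYYY-MM-DD and YYYY formats, and loose_match and strict_parse are hypothetical helper names. The strict variant assumes the input is meant as day-month-year, which the question does not state.

```python
import re
from datetime import datetime

# Hand-written regex stand-ins for two of the default formats (illustrative only).
YMD = re.compile(r"(?P<YYYY>\d{4})-(?P<MM>\d{2})-(?P<DD>\d{2})")
YEAR = re.compile(r"(?P<YYYY>\d{4})")


def loose_match(pattern, text):
    """Unanchored search: the match may begin anywhere in the text."""
    m = pattern.search(text)
    return m.groupdict() if m else None


def strict_parse(text, fmt="%d-%m-%Y"):
    """Hypothetical helper: strptime raises ValueError unless the whole string fits fmt.

    The day-month-year order is an assumption made for this example.
    """
    return datetime.strptime(text, fmt)


print(loose_match(YMD, "aaa2012-01-21aa"))  # full date found, surrounding text ignored
print(loose_match(YMD, "09-10-2017"))       # None: no four digits followed by '-'
print(loose_match(YEAR, "09-10-2017"))      # only the year is captured
print(loose_match(YEAR, "09-10-201"))       # None: no run of four digits
print(strict_parse("09-10-2017"))           # parsed as 9 October 2017
```

An unanchored search explains why '09-10-2017' yields only a year while '09-10-201' fails outright. A parser that must consume the entire string, such as datetime.strptime here, rejects both 'aaa09-10-2017' and '09-10-201' with a ValueError.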