
Parsing date strings with the arrow date-manipulation library in Python


I am looking to convert the string "september 20 2010" to a python datetime.date object using arrow.

I have written functions to replace portions of the text and ended up with 9/20/2016, but I want YYYY-MM-DD format, and I can't seem to get arrow to recognise my string and convert it to a Python datetime.date object (without any time).

What has worked and what hasn't.

arrow.get('september 20 2010', '%B %d %Y')

This doesn't work for me. I get an error: ParserError: Failed to match '%B %(?P<d>[1-7]) %Y' when parsing the string "september 20 2010"

However, when I manipulate the string and then use arrow.Arrow(y, m, d).date(), the result is a datetime.date(2016, 9, 20) object.

I just can't convert it to any other format using .format('dddd-DD-MMMM-YYYY'), which I expected to return something like Monday 20 September 2010.


Solution

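  • With arrow, the format string has to match your input exactly, and it must use arrow's own tokens (they are listed in the arrow documentation) rather than strftime-style directives such as %B. The error in the question shows the d in %d being read as arrow's day-of-week token (1-7), which is why nothing matched.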

    arrow.get('September 20 2010', 'MMMM D YYYY')

    Note: In this case a single D is used because it matches day numbers with one or two digits (1, 2, 3 ... 30, 31), while DD matches two-digit day numbers only (01, 02, 03 ... 30, 31).
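
The difference between D and DD can be checked directly. The snippet below is a minimal sketch: parses() is a hypothetical helper written for this illustration (it is not part of arrow), and it assumes an arrow version where a failed match raises arrow.parser.ParserError or a subclass of it.

```python
import arrow
from arrow.parser import ParserError


def parses(text, fmt):
    """Hypothetical helper: True if arrow can parse `text` with `fmt`."""
    try:
        arrow.get(text, fmt)
        return True
    except ParserError:
        return False


# D accepts a one-digit or a two-digit day.
print(parses('September 5 2010', 'MMMM D YYYY'))    # True
print(parses('September 20 2010', 'MMMM D YYYY'))   # True

# DD needs exactly two digits, so a bare "5" is rejected.
print(parses('September 05 2010', 'MMMM DD YYYY'))  # True
print(parses('September 5 2010', 'MMMM DD YYYY'))   # False
```

If the day numbers in your data are not zero-padded, D is the safer choice.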

    Once you have your Arrow object, you can display it however you like using format():

    ar = arrow.get('September 20 2010', 'MMMM D YYYY')
    print(ar.format('YYYY-MM-DD')) # 2010-09-20
    

    EDIT

    To answer your comment: ar is an Arrow object, and you can list every method it provides with dir(ar).

    Arrow has a date() method, which returns a datetime.date object.
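
As a short sketch of how the two pieces fit together: format() belongs to the Arrow object, not to the datetime.date that date() returns, which is probably why calling .format() on the date object did not work in the question. The variable names here are only illustrative.

```python
import datetime

import arrow

ar = arrow.get('September 20 2010', 'MMMM D YYYY')

# date() drops the time and timezone and returns a plain datetime.date.
d = ar.date()
print(type(d) is datetime.date)  # True
print(d.isoformat())             # 2010-09-20

# Arrow-style format tokens only work on the Arrow object itself.
print(ar.format('dddd DD MMMM YYYY'))  # Monday 20 September 2010
```

So keep the Arrow object around for as long as you need arrow-style formatting, and call date() last.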

    Now, if you want to use pandas, that's easy:

    import arrow
    import pandas as pd
    
    ar = arrow.get('September 20 2010', 'MMMM D YYYY')
    ts = pd.to_datetime(ar.date())  # a pandas Timestamp, not a DataFrame
    print(ts) # 2010-09-20 00:00:00
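
As a side note, the '%B %d %Y' pattern tried in the question is strptime syntax, so the standard library can parse the original string without arrow. This is only a sketch of that alternative, and it assumes an English (or C) locale, since %B matches month names from the current locale.

```python
from datetime import datetime

# strptime understands %B (full month name), %d (day) and %Y (4-digit year);
# month names are matched case-insensitively.
d = datetime.strptime('september 20 2010', '%B %d %Y').date()

print(d)              # 2010-09-20
print(d.isoformat())  # 2010-09-20
```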