python, parsing, python-dateutil

dateutil parser returns a bad date when it should raise a ValueError


from dateutil.parser import parse
def convert_to_ddmmyy(date_string):
    try:
        date = parse(date_string)
        formatted_date = date.strftime("%d%m%y")
        return formatted_date
    except ValueError:
        return "Invalid date format"
# Example usage
date_string = "13.75"
converted_date = convert_to_ddmmyy(date_string)
print(converted_date)

This prints 130623.

I want this to be reported as invalid, or something along those lines. What changes can I make? It has to use the parser because I am dealing with various date formats, but I need to exclude this format.


Solution

  • Use a regex to recognize the invalid format before calling the parser. The pattern \d\d\.\d\d, used with match, rejects any string that starts with 'xx.xx', where x is a decimal digit. This deals with your specific case. The second pattern, \d*\.\d* used with fullmatch, rejects a wider range of similar strings, as explained below.

    import re
    from dateutil.parser import parse
    
    def convert_to_ddmmyy(date_string):
        # if re.match(r"\d\d\.\d\d", date_string):  # rejects strings starting with xx.xx
        if re.fullmatch(r"\d*\.\d*", date_string):  # rejects a wider range of similar patterns
            return "Invalid date format"
        try:
            date = parse(date_string)
            formatted_date = date.strftime("%d%m%y")
            return formatted_date
        except ValueError:
            return "Invalid date format"
    # Example usage
    date_string = "13.75"
    converted_date = convert_to_ddmmyy(date_string)
    print(converted_date)
    

    gives

    Invalid date format
    

    Here's an explanation. re.match checks whether the pattern matches at the start of the string and returns a match object (which is truthy) or None. \d means any digit; the backslash turns d into a special sequence so that regex does not look for the letter d. \. matches a literal decimal point; the backslash is needed because an unescaped . matches any character. * means zero or more of the preceding item. So the second pattern matches zero or more digits, a decimal point (full stop), then zero or more digits.

    With that pattern, match would also reject 2023.12.12 (which might be a date) because the prefix 2023.12 already matches. fullmatch requires the whole string to match the pattern, so 2023.12.12 is not rejected and goes on to the parser. Hope this is useful. Regex has lots of uses in checking and changing strings.
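The difference between match and fullmatch can be checked directly with a small sketch. The helper names is_bare_decimal and starts_like_decimal are made up for illustration and are not part of re or dateutil; only the pattern comes from the answer above.

```python
import re

# Pattern from the answer: optional digits, a literal dot, optional digits.
PATTERN = r"\d*\.\d*"


def is_bare_decimal(text):
    """Hypothetical helper: True only when the whole string fits the pattern."""
    return re.fullmatch(PATTERN, text) is not None


def starts_like_decimal(text):
    """Same pattern with re.match, which only anchors at the start."""
    return re.match(PATTERN, text) is not None


for sample in ["13.75", ".5", "2023.12.12", "13 June 2023"]:
    print(f"{sample!r}: match={starts_like_decimal(sample)}, "
          f"fullmatch={is_bare_decimal(sample)}")
```

For "13.75" and ".5" both checks are True, so either would reject them. For "2023.12.12" only match is True, so only fullmatch lets it through to the parser. "13 June 2023" fails both checks and is passed on unchanged.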