I'm working on a block of Python code that is meant to test inputs to determine whether they're numeric, timestamps, free text, etc. To detect dates, it uses the dateutil parser, then checks if the parse succeeded or an exception was thrown.
However, the dateutil parser is too forgiving and will turn all manner of values into date objects. For example, a time range such as "12:00-16:00" is converted into a timestamp on the current day, e.g. "2023-08-22T12:00-16:00" (where "-16:00" isn't even a valid timezone offset).
We'd like to only treat inputs as dates if they actually have a day-month-year component, not if they're just hours and minutes - but we still want to accept various date formats, yyyy-MM-ddThh:mm:ss or dd/MM/yyyy or whatever the input uses. Is there another library better suited to this, or some way to make dateutil stricter?
How about Python's re module? You can first check the string against a regular expression to determine whether it looks like valid date/datetime data, and only then pass it to the dateutil module.
For example, the following snippet checks whether the input string contains a date pattern.
    import re

    def check_date(text):
        # Match a "yyyy-mm-dd" pattern anywhere in the text
        date_regex = re.compile(r"\d{4}-\d{2}-\d{2}")
        return bool(date_regex.search(text))
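Now, depending on the function's return value, you can use the dateutil or datetime module to parse the string into a date/datetime object. Note that the regex only checks the shape of the string, so a value like "2023-13-45" still passes the check and has to be rejected by the parser.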
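To cover more than one format, the same idea can be extended into something like the sketch below. It is only an illustration under some assumptions: the names DATE_PATTERNS, DATE_FORMATS, has_date_component and parse_if_date are made up for this example, the lists of patterns and formats are just a starting point to adapt to your data, and it uses the standard-library datetime.strptime for the final parse so that it has no third-party dependency.

```python
import re
from datetime import datetime

# Hypothetical patterns: each one requires an explicit year, month and day.
# The lookarounds stop the match from starting or ending inside a longer number.
DATE_PATTERNS = [
    re.compile(r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)"),      # yyyy-MM-dd, optionally followed by a time
    re.compile(r"(?<!\d)\d{1,2}/\d{1,2}/\d{4}(?!\d)"),  # dd/MM/yyyy
]

# Hypothetical list of accepted formats, tried in order.
DATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%d/%m/%Y",
]


def has_date_component(text):
    """Return True if the text contains something shaped like a full date."""
    return any(pattern.search(text) for pattern in DATE_PATTERNS)


def parse_if_date(text):
    """Return a datetime if the text holds a full, valid date, otherwise None."""
    if not has_date_component(text):
        # Time-only values such as "12:00-16:00" are rejected here.
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # wrong format or impossible date; try the next one
    return None
```

With this, parse_if_date("12:00-16:00") returns None because no pattern finds a day-month-year component, while "2023-08-22T12:00:00" and "22/08/2023" are parsed. If you would rather keep dateutil's flexibility for the parsing step, the strptime loop could be swapped for a call to dateutil.parser.parse inside a try/except, with the regex check still acting as the gate.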