I have a file from blob storage:
newfile = "Supervisor_08292024_095618.parquet"
I want to check if the date is in MMDDYYYY format.
I tried to create a pattern for the correct filename:
pattern1 = r'Supervisor_[0-9]{2}[0-9]{2}[0-9]{4}_[0-9]{5}.parquet'
if re.match(pattern1, newfile):
    file_valid = "True"
else:
    file_valid = "False"
print(file_valid)
The result is True because the file Supervisor_08292024_095618 matches pattern1. But when I swap the MM and DD (Supervisor_29082024_095618.parquet), the result is still True, which is wrong because the second file is in DDMMYYYY format.
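You can change the pattern as shown below to meet your requirement. It limits the month to 01-12 and the day to 01-31, so a name like Supervisor_29082024_095618.parquet no longer matches because 29 is not a valid month.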
import re
# Raw string so \d is passed to the regex engine; \. matches a literal dot.
pattern = r'^Supervisor_(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])\d{4}_[0-9]{6}\.parquet$'
filename = "Supervisor_08292024_095618.parquet"
if re.match(pattern, filename):
    file_valid = "True"
else:
    file_valid = "False"
print(file_valid)
Result when the filename is in the correct format: True
Result when the filename is not in the desired format (for example Supervisor_29082024_095618.parquet): False
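A range-based regex still accepts impossible dates such as 02312024 (February 31). If that matters, one option is to let the regex check only the shape of the name and hand the digits to `datetime.strptime`, which rejects dates that do not exist. The sketch below is an illustration, not part of the original answer: the helper name `is_valid_filename` is hypothetical, and it assumes the six digits after the date are a time in HHMMSS form.

```python
import re
from datetime import datetime

# Hypothetical helper for illustration. The regex checks only the shape:
# Supervisor_<8 digits>_<6 digits>.parquet
FILENAME_RE = re.compile(r"Supervisor_([0-9]{8})_([0-9]{6})\.parquet")


def is_valid_filename(name: str) -> bool:
    """Return True if name looks like Supervisor_MMDDYYYY_HHMMSS.parquet
    and the digits form a real calendar date and time."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        return False
    try:
        # Assumption: the second digit group is a time in HHMMSS form.
        # strptime raises ValueError for values such as month 29 or Feb 31.
        datetime.strptime(match.group(1) + match.group(2), "%m%d%Y%H%M%S")
    except ValueError:
        return False
    return True


print(is_valid_filename("Supervisor_08292024_095618.parquet"))  # True
print(is_valid_filename("Supervisor_29082024_095618.parquet"))  # False
print(is_valid_filename("Supervisor_02312024_095618.parquet"))  # False
```

Note that neither approach can tell MMDDYYYY from DDMMYYYY when both of the first two pairs are 12 or less: 05062024 is valid either way, so only dates with a day above 12 can be caught.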