regex email 2-digit-year

Regex: Email contains numbers, but they are not in a year pattern


I'm new to regex and, after some tutorials, I'm having difficulty implementing a match criterion like "email contains numbers, but they are not in a year pattern".

So, I have this simple regex for "contains numbers":

\d+(?=@)

Given that the e-mail address does contain numbers, I would like a match only when the address does NOT match any of the patterns below:

\w*(19|20)\d{2}\D*(?=@)
\w*[6-9][0-9]\D*(?=@)
\w*[0-1][0-9]\D*(?=@)

How can I express this in regex?

Example matching inputs:

foo123@gmail.com
a22oo@hotmail.com
hoo567@outlook.com

Example non-matching inputs:

foo@gmail.com
johndoe88@hotmail.com
john1976@outlook.com

Solution

  • Regex is difficult to invert, i.e. to express "does not match something".

    In your simple case I would just match a number of arbitrary length and then do the check in code, preferably after converting it to an integer.

    But to answer your question, the following alternatives invert your cases; just OR them together:

    (\d)|                            1 digit
    ([2345]\d)|                      2 digits not starting with 0,1,6,7,8,9
    (\d\d\d)|                        3 digits
    ((1[0-8]|2[1-9]|[03-9]\d)\d\d)|  4 digits not starting with 19 or 20
    (\d\d\d\d\d+)                    5+ digits
    

    Something like this. I'm sure someone can make it prettier.

    EDIT

    Here is the full regex, now properly tested against every case I can think of that fits your criteria, including boundary cases (see https://regex101.com/r/sM5aF7/1):

    (\b|[^\d\s])(\d|[2345]\d|\d{3}|(1[^9]|2[^0]|[03-9]\d)\d\d|\d{5,})(\D*?@|@)
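
For anyone who wants to try both suggestions side by side, here is a rough Python sketch. The regex is copied unchanged from the answer. The helper names (`is_year_like`, `has_non_year_number`) are made up for illustration and are not from the original post. The code-based check assumes that only the last run of digits before the `@` matters, which is how the `\D*` before `@` in the patterns reads, and it keeps the digits as a string rather than an integer so that the length and any leading zero are preserved.

```python
import re

# Regex copied verbatim from the answer above.
ANSWER_RE = re.compile(
    r"(\b|[^\d\s])(\d|[2345]\d|\d{3}|(1[^9]|2[^0]|[03-9]\d)\d\d|\d{5,})(\D*?@|@)"
)


def is_year_like(digits: str) -> bool:
    """Hypothetical helper: True if the digit run looks like a year."""
    if len(digits) == 2:
        # Two-digit years per the question: 00-19 and 60-99.
        return digits[0] in "016789"
    if len(digits) == 4:
        # Four-digit years per the question: 19xx and 20xx.
        return digits[:2] in ("19", "20")
    return False


def has_non_year_number(email: str) -> bool:
    """Hypothetical helper: the 'do the check in code' alternative.

    Assumption: only the last digit run in the part before '@' is checked.
    """
    local_part = email.split("@", 1)[0]
    runs = re.findall(r"[0-9]+", local_part)
    if not runs:
        return False  # no numbers at all
    return not is_year_like(runs[-1])


if __name__ == "__main__":
    samples = [
        "foo123@gmail.com",
        "a22oo@hotmail.com",
        "hoo567@outlook.com",
        "foo@gmail.com",
        "johndoe88@hotmail.com",
        "john1976@outlook.com",
    ]
    for email in samples:
        by_regex = ANSWER_RE.search(email) is not None
        by_code = has_non_year_number(email)
        print(f"{email}: regex={by_regex} code={by_code}")
```

The code-based version is longer, but each rule sits on its own line, so changing what counts as a "year" later does not mean rebuilding the alternation by hand.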