
Verify emails in PHP

I tried a PHP regex to verify emails, such as /^[_a-z0-9-]+(\.[_a-z0-9-])*@[a-z0-9-]+(\.[a-z0-9-])*(\.[a-z]{2,4})$/. I know that this is not a correct way to validate emails, because my regex also accepts addresses that are not valid.

Do I need to enumerate all the domain name suffixes?

I found out that PHP's filter_var function can verify emails; however, filter_var('', FILTER_VALIDATE_EMAIL) is also accepted.

How does FILTER_VALIDATE_EMAIL work in the PHP source code? Or can someone tell me a better way to verify emails?

Thanks very much!


  • The function php_filter_validate_email in logical_filters.c performs this check.

    It tests the email against the following regex /^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD and also rejects input longer than 320 characters.

    The comment in the source also explains where the regex comes from:

    * The regex below is based on a regex by Michael Rushton.
    * However, it is not identical. I changed it to only consider routeable
    * addresses as valid. Michael's regex considers a@b a valid address
    * which conflicts with section 2.3.5 of RFC 5321 which states that:
    * Only resolvable, fully-qualified domain names (FQDNs) are permitted
    * when domain names are used in SMTP. In other words, names that can
    * be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed
    * in Section 5) are permitted, as are CNAME RRs whose targets can be
    * resolved, in turn, to MX or address RRs. Local nicknames or
    * unqualified names MUST NOT be used.
    * This regex does not handle comments and folding whitespace. While
    * this is technically valid in an email address, these parts aren't
    * actually part of the address itself.
    * Michael's regex carries this copyright:
    * Copyright © Michael Rushton 2009-10
    * Feel free to use and redistribute this code. But please keep this copyright notice.

    This is good enough for most real-world email addresses. For more details, see this question: Using a regular expression to validate an email address
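
As a minimal sketch of how the filter is typically called: filter_var returns the filtered string on success and false on failure, so a strict comparison against false is the safe way to test the result. The helper name is_valid_email is hypothetical (it is not part of PHP), and the sample addresses are only illustrations.

```php
<?php
// Hypothetical helper, not a PHP built-in: wraps filter_var so that
// callers get a strict boolean instead of "string or false".
function is_valid_email(string $email): bool
{
    // FILTER_VALIDATE_EMAIL returns the address itself when it passes
    // and false when it does not, hence the strict !== false check.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('user@example.com')); // bool(true)
var_dump(is_valid_email('a@b'));              // bool(false): no dot in the domain part
var_dump(is_valid_email(''));                 // bool(false): empty input never passes
```

Note that this check is purely syntactic. It does not look up the domain, so a passing address is well-formed but not necessarily deliverable.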