regex email rfc2822

Regex to validate a Message-ID as per RFC 2822


I have not found a regexp to do this. I need to validate the "Message-ID:" value from an email. It is similar to an email address validation regexp but much simpler, without most of the edge cases that an email address allows. From RFC 2822:

msg-id          =       [CFWS] "<" id-left "@" id-right ">" [CFWS] 
id-left         =       dot-atom-text / no-fold-quote / obs-id-left
id-right        =       dot-atom-text / no-fold-literal / obs-id-right
no-fold-quote   =       DQUOTE *(qtext / quoted-pair) DQUOTE
no-fold-literal =       "[" *(dtext / quoted-pair) "]"

Let's say the outer <> are optional. The definitions of dot-atom-text and of the other rules not shown here can be found in RFC 2822.

I am not proficient in regex and would prefer to use an already tested one, if one exists.


Solution

  • As I could not find one, I ended up implementing it myself. It is not a proper validation as per RFC 2822, but it is a good enough approximation for now:

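    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    // Assumption: StringUtils is the Apache Commons Lang 3 class
    import org.apache.commons.lang3.StringUtils;

    // dot-atom-text "@" dot-atom-text; the quoted and literal forms are not covered
    static String VALIDMIDPATTERN = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*";
    // atext also allows upper-case letters, so match case-insensitively
    private static Pattern patvalidmid = Pattern.compile(VALIDMIDPATTERN, Pattern.CASE_INSENSITIVE);
    
    public static boolean isMessageIdValid(String midt) {
        String mid = midt;
        // a missing value is never valid (and would otherwise fail on trim())
        if (mid == null)
            return false;
        // allow at most one "<" and one ">"
        if (StringUtils.countMatches(mid, "<") > 1)
            return false;
        if (StringUtils.countMatches(mid, ">") > 1)
            return false;
        // extract from <>; anything outside the brackets is ignored
        if (StringUtils.containsAny(mid, "<>")) {
            mid = StringUtils.substringBetween(mid, "<", ">");
            if (StringUtils.isBlank(mid)) {
                return false;
            }
        }
        // consecutive dots are never allowed in dot-atom-text
        if (StringUtils.contains(mid, "..")) {
            return false;
        }
        mid = mid.trim();
        // now validate
        Matcher m = patvalidmid.matcher(mid);
        return m.matches();
    }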
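
The approximation above accepts only dot-atom-text on both sides of the "@". As a rough sketch of something closer to the grammar quoted in the question, the rules can be translated one by one into regex fragments using only the JDK. This is untested against real mail traffic and rests on several assumptions: the obs-id-left and obs-id-right alternatives are left out, CFWS is reduced to plain whitespace (no comments), and the angle brackets are optional as the question asks. The class name MsgIdGrammar and the method isValid are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper: the RFC 2822 msg-id grammar translated rule by rule.
final class MsgIdGrammar {
    // atext (RFC 2822, section 3.2.4): letters, digits and these specials
    private static final String ATEXT = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]";
    // dot-atom-text = 1*atext *("." 1*atext)
    private static final String DOT_ATOM_TEXT = ATEXT + "+(?:\\." + ATEXT + "+)*";
    // NO-WS-CTL = %d1-8 / %d11 / %d12 / %d14-31 / %d127 (body of a character class)
    private static final String NO_WS_CTL = "\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x7F";
    // qtext = NO-WS-CTL / %d33 / %d35-91 / %d93-126
    private static final String QTEXT = "[" + NO_WS_CTL + "\\x21\\x23-\\x5B\\x5D-\\x7E]";
    // dtext = NO-WS-CTL / %d33-90 / %d94-126
    private static final String DTEXT = "[" + NO_WS_CTL + "\\x21-\\x5A\\x5E-\\x7E]";
    // quoted-pair = "\" text; text is any US-ASCII character except NUL, CR and LF
    private static final String QUOTED_PAIR = "\\\\[\\x01-\\x09\\x0B\\x0C\\x0E-\\x7F]";
    // no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE
    private static final String NO_FOLD_QUOTE =
            "\"(?:" + QTEXT + "|" + QUOTED_PAIR + ")*\"";
    // no-fold-literal = "[" *(dtext / quoted-pair) "]"
    private static final String NO_FOLD_LITERAL =
            "\\[(?:" + DTEXT + "|" + QUOTED_PAIR + ")*\\]";
    // id-left "@" id-right, leaving out the obs-* alternatives
    private static final String ID =
            "(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_QUOTE + ")"
            + "@"
            + "(?:" + DOT_ATOM_TEXT + "|" + NO_FOLD_LITERAL + ")";
    // Assumption: CFWS is reduced to plain whitespace; angle brackets are optional
    private static final Pattern MSG_ID =
            Pattern.compile("\\s*(?:<" + ID + ">|" + ID + ")\\s*");

    // Returns true when the whole string matches the simplified msg-id pattern.
    static boolean isValid(String messageId) {
        return messageId != null && MSG_ID.matcher(messageId).matches();
    }
}
```

With this sketch, `MsgIdGrammar.isValid("<1234@local.machine.example>")` accepts the ID with or without the brackets, while a left side such as `a..b` or an unbalanced `<` is rejected. Unlike the method above, text outside the angle brackets is not ignored, so a value with anything other than whitespace around the brackets fails.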