emacs elisp

Emacs lisp: get sub-matches from a regexp match


I have a string variable somewhere in elisp code, and want to extract some parts of it into other variables using a regular expression with groups. That's something you can write in 1-2 lines in any language, for example in Perl:

my ($user, $domain) = $email =~ m/^(.+)@(.+)$/;

How do I write the same in elisp?


Solution

  • The GNU Emacs Lisp Reference Manual is your friend. See also http://emacswiki.org/emacs/ElispCookbook (though at the time, the latter did not yet contain an example of this particular technique).

    (save-match-data ; restores the caller's match data afterwards; usually a good idea
      (and (string-match "\\`\\([^@\n]+\\)@\\([^@\n]+\\)\\'" email)
           (setq user (match-string 1 email)
                 domain (match-string 2 email))))
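
    If you would rather not setq variables, one option is to wrap the match in a small function and bind the result locally, which is closer in spirit to the Perl one-liner. This is only a minimal sketch; the name my-split-email is made up for illustration and is not part of Emacs.

    ```elisp
    ;; Hypothetical helper: `my-split-email' is an illustrative name, not an Emacs function.
    (defun my-split-email (email)
      "Return (USER . DOMAIN) if EMAIL looks like user@domain, else nil."
      (save-match-data
        (when (string-match "\\`\\([^@\n]+\\)@\\([^@\n]+\\)\\'" email)
          ;; Pass EMAIL to `match-string' because the match was made
          ;; against a string, not against the current buffer.
          (cons (match-string 1 email)
                (match-string 2 email)))))

    ;; Usage: bind both parts locally instead of assigning to globals.
    (let* ((parts (my-split-email "alice@example.org"))
           (user (car parts))
           (domain (cdr parts)))
      (list user domain))
    ```

    Evaluating the let* form yields ("alice" "example.org"). When the string does not match, the helper returns nil, so check the result before taking it apart.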
    

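    Since several commenters asked about this, here is a breakdown of this particular regex:

    I use single backslashes here, but of course, inside an elisp string, these need to be doubled. The one exception is \n, which is an ordinary string escape for a newline character and keeps its single backslash.

    • \` matches the beginning of the string (^ would also match at the beginning of any line inside it)
    • \( and \) delimit a capturing group
    • [^@\n] matches any single character other than @ or newline
    • + repeats the preceding item one or more times
    • @ matches a literal @
    • \' matches the end of the string ($ would also match at the end of any line inside it)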
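
    The regex above is anchored with \` and \' rather than ^ and $. The difference only shows up when the string contains a newline, which the following throwaway expressions illustrate (the sample strings are made up):

    ```elisp
    ;; ^ and $ match at line boundaries inside the string;
    ;; \` and \' match only at the very start and end of the string.
    (string-match "^b" "a\nb")      ; => 2
    (string-match "\\`b" "a\nb")    ; => nil
    (string-match "a$" "a\nb")      ; => 0
    (string-match "a\\'" "a\nb")    ; => nil

    ;; With ^...$ an address on the second line of the string is accepted;
    ;; with \`...\' the whole string has to be the address.
    (string-match "^\\([^@\n]+\\)@\\([^@\n]+\\)$" "x\nuser@example.org")      ; => 2
    (string-match "\\`\\([^@\n]+\\)@\\([^@\n]+\\)\\'" "x\nuser@example.org")  ; => nil
    ```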


    Strictly speaking, there are many more characters which aren't permitted in email addresses. On the other hand, many beginners restrict way too far, and prohibit characters which are actually allowed in email addresses... and then publish their brain stains on the Internet and tell us "this will 100% work!!"; but I digress. You should take care to allow ., -, +, and * in particular. The full RFC 5321 spec cannot easily be captured by a regular expression alone, but in practice, many of the esoteric corner cases are disallowed by lots of broken software anyway; so depending on your exact use case, you will probably be fine with something like [^@<>'\"\s ].
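
    As a rough sketch of what such a somewhat stricter pattern could look like in practice, here is a variant of the class suggested above with space, tab, and newline spelled out explicitly. The exact set of excluded characters is my assumption, and the function name my-split-email-stricter is made up for illustration.

    ```elisp
    ;; Hypothetical helper: excludes @, <, >, quotes and whitespace, but
    ;; still allows ., -, + and * in both parts of the address.
    (defun my-split-email-stricter (email)
      "Return (USER . DOMAIN) for EMAIL, or nil if it does not match."
      (save-match-data
        (when (string-match
               "\\`\\([^@<>'\" \t\n]+\\)@\\([^@<>'\" \t\n]+\\)\\'"
               email)
          (cons (match-string 1 email)
                (match-string 2 email)))))

    (my-split-email-stricter "first.last+tag@example.org")
    ;; => ("first.last+tag" . "example.org")
    (my-split-email-stricter "a b@example.org")
    ;; => nil
    ```

    This is still nowhere near a full validator; it only rules out a few characters that are very unlikely to appear in a real address.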