I'm trying to learn how to write Emacs major modes. There are lots of great tutorials online (e.g. http://www.emacswiki.org/emacs/GenericMode), but I'm struggling to learn the regexp syntax. For example, from this answer I'm trying to understand why
'(("\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\""
from
(define-derived-mode rich-text-mode text-mode "Rich Text"
  "Text mode with string highlighting."
  ;; Register keywords.
  (setq rich-text-font-lock-keywords
        '(("\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\"" 0 font-lock-string-face)))
  ;; `font-lock-defaults' is a list whose first element names the keywords.
  (setq font-lock-defaults '(rich-text-font-lock-keywords))
  (font-lock-mode 1))
matches anything between double quotation marks. This material: http://www.gnu.org/software/emacs/manual/html_node/elisp/Regexp-Special.html#Regexp-Special doesn't seem to explain that.
Are there any better resources out there?
Here is what the regexp does.
The regexp in the example you cite is actually "\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\"". The leading '(( belongs to the font-lock keyword list, not to the regexp.
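It can help to see what the regexp engine actually receives, since the doubled backslashes are Lisp string syntax rather than regexp syntax. A minimal sketch, assuming you evaluate it in the *scratch* buffer or with M-: (the variable name my-string-regexp is made up for this illustration):

```elisp
;; Hypothetical variable name, used only for this illustration.
(defvar my-string-regexp "\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\"")

;; `princ' writes the string's actual characters, so this shows the text
;; the regexp engine receives once the Lisp reader has handled the escapes
;; (\" becomes ", \\ becomes \, and \n becomes a literal newline).
(princ my-string-regexp)
```

The printed text is "\(\(?:.\| followed by a line break and then \)*?[^\]\)", which is the form the Emacs manual uses when it describes regexp constructs such as \( \), \| and [^...].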
The parts to match are:
\" matches only a " char. It appears at the beginning and at the end of the regexp.
A group, \\( ... \\), which contains \\(?:.\\|\n\\)*? followed by [^\\]. The group is presumably there so that font-lock-keywords can be told to do something with that part of a match, i.e., the part between the matching " at the beginning and end. (The example as written highlights subexpression 0, which is the whole match, quotes included.)
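\\(?:.\\|\n\\)*?, the first part of the group, matches zero or more characters, any characters. The . matches any char except a newline char, and the \n matches a newline char. The \\| means either of those is OK, and \\(?: ... \\) is a non-capturing ("shy") group that only groups the two alternatives. The *? is the non-greedy form of *: it matches as few characters as possible, so the match ends at the nearest acceptable closing " instead of running on to the last " in the buffer. Replacing it with a plain * would change what gets highlighted.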
[^\\] matches any single character except a backslash (\).
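To see what the *? contributes, here is a small sketch comparing the regexp with a greedy variant on text that contains two quoted strings. The helper my-match and the variable my-sample are hypothetical names introduced only for this comparison:

```elisp
(defun my-match (regexp text)
  "Return the text of the first match of REGEXP in TEXT, or nil."
  (when (string-match regexp text)
    (match-string 0 text)))

;; Sample text: say "one" and "two" now
(defvar my-sample "say \"one\" and \"two\" now")

;; Non-greedy *? (the regexp from the question): stops at the nearest
;; closing quote.
(my-match "\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\"" my-sample)
;; => "\"one\""

;; Greedy * (same regexp without the ?): runs on to the last quote.
(my-match "\"\\(\\(?:.\\|\n\\)*[^\\]\\)\"" my-sample)
;; => "\"one\" and \"two\""
```

In a buffer with several strings, the greedy version would therefore color everything from the first quote to the last one, including the text between the strings.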
So putting it together, the group matches zero or more chars followed by a char that is not a backslash. Why not just use a regexp that matches zero or more chars between " chars? Presumably because the person wanted to make sure the ending " was not escaped (by a backslash). However, note that the regexp requires at least one char between the " chars, so it does not match the empty string, "".
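You can check these claims yourself with string-match. A minimal sketch, where my-first-quoted is a hypothetical helper written just for this test and not part of the mode:

```elisp
(defun my-first-quoted (text)
  "Return the first substring of TEXT matched by the regexp, or nil."
  (let ((re "\"\\(\\(?:.\\|\n\\)*?[^\\]\\)\""))
    (when (string-match re text)
      (match-string 0 text))))

;; An ordinary string is matched, quotes included.
(my-first-quoted "x = \"abc\"")     ; => "\"abc\""

;; The empty string "" on its own is not matched, because [^\\] needs
;; at least one character between the quotes.
(my-first-quoted "\"\"")            ; => nil

;; A backslash-escaped quote does not end the match: the text "a\"b"
;; is matched as a whole.
(my-first-quoted "\"a\\\"b\"")      ; => "\"a\\\"b\""

;; A string that spans two lines is matched too, thanks to the \n
;; alternative.
(my-first-quoted "\"a\nb\"")        ; => the whole two-line string
```

Evaluating forms like these one at a time with C-x C-e is a cheap way to experiment with a regexp before putting it into font-lock keywords.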
A good resource is: http://www.emacswiki.org/emacs/RegularExpression.