
Adjusting syntax table for a derived mode in .emacs


In order to view/edit Doxygen files, I am using

(define-derived-mode dox-mode
  html-mode "dox"
  "Major mode for Doxygen."
  )

(add-to-list 'auto-mode-alist '("\\.dox\\'" . dox-mode))

which works reasonably well for me, except that single quotes ('), which are common in running text, are treated as string delimiters, so string highlighting toggles on and off. I found that I can adjust the syntax table like so:

(defvar dox-mode-syntax-table
  (let ((table (make-syntax-table html-mode-syntax-table)))
    (modify-syntax-entry ?' "." table)
    table))

but this has no effect.

So how can I wire that syntax table to dox-mode?


Steps to reproduce

(require 'sgml-mode)

;; Create a syntax table for dox-mode.
;; This must be prior to defining dox-mode.
(defvar dox-mode-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?'  "w   " st)
    st)
  "Syntax table used while in `dox-mode'.")

;; Our own Doxygen mode.  Same as HTML except that
;; single quotes should not start string highlighting.
(define-derived-mode dox-mode
  html-mode "dox"
  "Major mode for Doxygen.")

(add-hook 'dox-mode-hook (lambda () (modify-syntax-entry ?' "w   ")))

;; Wire dox-files to dox-mode.
(add-to-list 'auto-mode-alist '("\\.dox\\'" . dox-mode))

Then open the file faq.dox and go to line 245, where the single quote in "However, there's currently no support for..." starts a string but shouldn't.

Running M-x describe-syntax in that buffer shows the following, which looks good to me:

'       w   which means: word

The parent syntax table is:
"       ""  which means: string, matches "
'       "'  which means: string, matches '
...
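
Note that describe-syntax only reports the syntax table itself. Emacs can also attach per-character `syntax-table` text properties, which take precedence over the table while `parse-sexp-lookup-properties` is non-nil. As a cross-check, a small diagnostic command along the following lines could report what the parser actually sees at point. This is a minimal sketch; the command name `my-dox-syntax-at-point` is made up for illustration and is not part of Emacs.

```elisp
;; Diagnostic sketch only; `my-dox-syntax-at-point' is a made-up name.
(defun my-dox-syntax-at-point ()
  "Show the effective syntax class at point and whether point is in a string."
  (interactive)
  ;; Make sure any `syntax-table' text properties cover the char at point.
  (syntax-propertize (min (1+ (point)) (point-max)))
  (let* ((ppss (syntax-ppss))
         ;; `syntax-after' honors `syntax-table' text properties when
         ;; `parse-sexp-lookup-properties' is non-nil; nil at end of buffer.
         (syn (syntax-after (point)))
         (class (and syn (syntax-class-to-char (syntax-class syn)))))
    (message "class: %s  in-string: %s  syntax-table property: %S"
             (if class (string class) "none")
             (if (nth 3 ppss) "yes" "no")
             (get-text-property (point) 'syntax-table))))
```

With point on the apostrophe in "there's", a reported class other than w, or "in-string: yes" just after it, would suggest that the string syntax comes from somewhere other than the table entry shown above.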

The following edited excerpt of faq.dox also demonstrates the issue in Emacs when using dox-mode:

\section faq_volatile Why doesn't my program recognize a variable updated in an interrupt routine?

Also notice that global register variables can't be volatile,
because only variables in memory can be volatile, and register variables
are not located in memory.

\code
_BV(3) => 1 << 3 => 0x08
\endcode

However, there's currently no support for \c libstdc++, the standard
support library needed for a complete C++ implementation.  This
imposes a number of restrictions on the C++ programs that can be
compiled.  Among them are:

The paragraph after the \code...\endcode section exhibits the issue with apostrophes, while the preceding paragraphs do not.

Emacs version

$ emacs --version
GNU Emacs 27.1

Solution

  • It looks like the font-lock rules for html-mode are confused by the following text:

    _BV(3) => 1 << 3 => 0x08
    

    Further testing shows that the problem is the unbalanced < and > characters (i.e. << 3 =>), which makes sense for HTML.

    A workaround may be to use HTML character encoding:

    _BV(3) =&gt; 1 &lt;&lt; 3 =&gt; 0x08
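
If many lines need this treatment, the substitution could be automated. The following is a minimal sketch rather than anything provided by Emacs or Doxygen: a hypothetical command, `my-dox-escape-angle-brackets`, that replaces every literal < and > in the active region with &lt; and &gt;.

```elisp
;; Hypothetical helper, not part of Emacs or Doxygen: replace literal
;; < and > in the region with the HTML entities &lt; and &gt;.
(defun my-dox-escape-angle-brackets (start end)
  "Replace < and > between START and END with &lt; and &gt;."
  (interactive "r")
  (save-excursion
    (save-restriction
      ;; Narrowing keeps the search bounded even as the text grows.
      (narrow-to-region start end)
      (goto-char (point-min))
      (while (re-search-forward "[<>]" nil t)
        ;; FIXEDCASE and LITERAL are both t, so the entity is inserted as-is.
        (replace-match (if (string= (match-string 0) "<") "&lt;" "&gt;")
                       t t)))))
```

To use it, select the offending line and run M-x my-dox-escape-angle-brackets. How Doxygen renders the entities inside a \code ... \endcode block has not been checked here.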