I am trying to use Emacs's sentence-end functionality to navigate around a file with citations (and double spaces between sentences), and I'm having some trouble getting Emacs to recognize sentences with brackets/parentheses after a period. For example:
Some text.  "Some quote."  (Something in parentheses.)  Something with a citation.[cite]  Some more text.
Using Emacs's built-in forward-sentence and backward-sentence works fine with the first, second, and third double-space sentence breaks in the example above, but the fourth sentence break is not recognized.
Interestingly, enforcing (setq sentence-end-without-period t) also seems to ignore double spaces when a set of brackets is in place. In the example below, Emacs can delineate between sentences one, two, and three, but it will combine sentences four and five:
Sentence one  Sentence two.  Sentence three  Sentence four.[cite]  Sentence five.
Is there a way to make Emacs demarcate sentences by ALL instances of double-spacing, at least after parentheses/brackets? Thanks!
The docstring of the user option sentence-end-base describes it as:
Regexp matching the basic end of a sentence, not including following space.
The default regexp is currently [.?!…‽][]\"'”’)}»›]* (the backslash only escapes the double quote inside the Lisp string). It matches one of .?!…‽ followed by any number, including zero, of the closing characters ]"'”’)}»›.
Hence, it should identify the sentences in the test snippets you have provided, on the condition that there is a period right before the closing paren/bracket.
[I agree we may expect sentences without a period to be identified with sentence-end-without-period enabled; this requires a closer look at how exactly the function sentence-end uses these two variables. It looks like it works only for cases without a paren/bracket involved. This might be a bug in Emacs's behavior or, more probably, documentation that does not explain it well enough, so perhaps M-x report-emacs-bug.]
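One way to take that closer look is to read the regexp Emacs actually searches for. The snippet below is a minimal sketch, assuming the variable sentence-end is nil (its default, as far as I know) so that the function sentence-end builds the regexp from the other sentence-end-* options. It returns the composed regexp once with sentence-end-without-period off and once with it on, so the two can be compared.

```elisp
;; Evaluate with M-: or in *scratch* and compare the two strings.
;; Binding `sentence-end' to nil is an assumption: it makes sure the
;; function composes the regexp instead of returning a customized value.
(list
 ;; Regexp used when a period (or similar) is required.
 (let ((sentence-end nil)
       (sentence-end-without-period nil))
   (sentence-end))
 ;; Regexp used when sentences may end without a period.
 (let ((sentence-end nil)
       (sentence-end-without-period t))
   (sentence-end)))
```

In the Emacs versions I have looked at, the second string gains an extra alternative that requires a word character right before the two spaces, which would explain why a closing bracket before the spaces is not treated as a sentence end. Treat that as something to confirm on your own version.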
If the default behavior doesn't suit your needs, adjusting the value of sentence-end-base could be a solution. For example, to treat any closing bracket (before spaces) as marking the end of a sentence, use
(setq sentence-end-base "[.?!…‽][]\"'”’)}»›]*\\|]")
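A setting like this can be tried in a temporary buffer before it goes into your configuration. The following is a sketch rather than a definitive fix. It assumes sentence-end-double-space is t and the variable sentence-end is nil. It also wraps the two alternatives in a shy group, \\(?: ... \\), which is my addition: I am assuming sentence-end-base is concatenated directly in front of the pattern for the following whitespace, in which case a top-level \\| could leave the first alternative without the double-space requirement.

```elisp
;; Sketch: try a candidate `sentence-end-base' without changing any
;; global setting.  The shy group is an assumption-driven variation on
;; the value shown above, not the default.
(with-temp-buffer
  (let ((sentence-end nil)              ; let Emacs compose the regexp
        (sentence-end-double-space t)   ; sentences end with two spaces
        (sentence-end-base
         "\\(?:[.?!…‽][]\"'”’)}»›]*\\|]\\)"))
    (insert "Something with a citation.[cite]  Some more text.")
    (goto-char (point-min))
    (forward-sentence)
    ;; Return the text Emacs treats as the first sentence.
    (buffer-substring-no-properties (point-min) (point))))
```

Evaluate it with C-x C-e after the last closing paren. If the returned string ends with [cite], the bracket is being accepted as a sentence end; if it stops earlier or runs to the end of the text, the regexp needs another look.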
[The illustration below uses a slightly different regexp syntax. All the illustrations match the behavior I see in Emacs.]