sentence-end
The symbol sentence-end is bound to the pattern that marks the end of a sentence. What should this regular expression be?
Clearly, a sentence may be ended by a period, a question mark, or an exclamation mark. Indeed, only clauses that end with one of those three characters should be considered the end of a sentence. This means that the pattern should include the character set:
[.?!]
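A quick way to see what this character set matches is to try it with `string-match', which returns the index of the first match, or nil if there is none. This is only a small sketch; the sample strings are made up for illustration.

```elisp
;; Inside a character set, `.' and `?' lose their special regexp meaning
;; and simply stand for themselves.
(string-match "[.?!]" "Is it so? Yes.")    ; => 8, the index of the `?'
(string-match "[.?!]" "no end mark here")  ; => nil
```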
However, we do not want forward-sentence merely to jump to a period, a question mark, or an exclamation mark, because such a character might be used in the middle of a sentence. A period, for example, is used after abbreviations. So other information is needed.
According to convention, you type two spaces after every sentence, but only one space after a period, a question mark, or an exclamation mark in the body of a sentence. So a period, a question mark, or an exclamation mark followed by two spaces is a good indicator of an end of sentence. However, in a file, the two spaces may instead be a tab or the end of a line. This means that the regular expression should include these three items as alternatives. This group of alternatives will look like this:
\\($\\| \\|  \\)
       ^   ^^
      TAB  SPC
Here, `$' indicates the end of the line, and I have pointed out where the tab and two spaces are inserted in the expression. Both are inserted by putting the actual characters into the expression.
Two backslashes, `\\', are required before the parentheses and vertical bars: the first backslash to quote the following backslash in Emacs; and the second to indicate that the following character, the parenthesis or the vertical bar, is special.
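The effect of the doubled backslashes can be checked directly. The following is a minimal sketch with made-up sample strings; it shows that the Lisp reader reduces each `\\' to one backslash, and that without the backslash the parentheses and vertical bar are ordinary characters in an Emacs regular expression.

```elisp
;; The reader turns "\\(" into two characters: a backslash and a paren.
(length "\\(")                       ; => 2

;; With backslashes: grouping and alternation, so the `b' matches.
(string-match "\\(a\\|b\\)" "xxb")   ; => 2

;; Without backslashes: `(', `|' and `)' only match themselves.
(string-match "(a|b)" "xxb")         ; => nil
(string-match "(a|b)" "x(a|b)")      ; => 1
```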
Also, a sentence may be followed by any number of carriage returns, like this:
[
]*
Like tabs and spaces, a carriage return is inserted into a regular expression by inserting it literally. The asterisk indicates that the RET is repeated zero or more times.
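As a small illustration, a literal line break typed inside a Lisp string is the same character as the escape `\n', so either spelling gives the same pattern. The sample text below is invented for the example.

```elisp
;; A literal newline inside the brackets and the "\n" escape read as
;; the same string.
(string= "[\n]*" "[
]*")                                  ; => t

;; The period plus both following newlines are consumed by the match.
(when (string-match "[.?!][\n]*" "End.\n\nNext")
  (match-end 0))                      ; => 6
```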
But a sentence end does not consist only of a period, a question mark or an exclamation mark followed by appropriate space: a closing quotation mark or a closing brace of some kind may precede the space. Indeed more than one such mark or brace may precede the space. These require an expression that looks like this:
[]\"')}]*
In this expression, the first `]' is the first character in the character set, which is how a literal `]' is included in a set; the second character is `"', which is preceded by a `\' to tell Emacs the `"' is not special. The last three characters are `'', `)', and `}'.
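To see the closing marks being picked up, the set can be appended to the end-mark set and tried on a sample string. This is only a sketch; the variable names `closers' and `text' and the sample sentence are made up for the example.

```elisp
(let ((closers "[]\"')}]*")                 ; zero or more closing marks
      (text "(He said \"Stop!\")  Then"))
  (when (string-match (concat "[.?!]" closers) text)
    ;; The match covers the `!' and the quote and paren after it.
    (match-string 0 text)))                 ; => "!\")"
```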
All this suggests what the regular expression pattern for matching the end of a sentence should be; and, indeed, if we evaluate sentence-end we find that it returns the following value:
sentence-end => "[.?!][]\"')}]*\\($\\|	\\|  \\)[
]*"
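The pieces can be put back together and tried on a string. The following is a minimal sketch, not the definition Emacs itself uses: the names `my-sentence-end' and `my-first-sentence' are hypothetical, chosen so that the real sentence-end is left untouched, and the tab and carriage return are written with the escapes `\t' and `\n' rather than as literal characters.

```elisp
;; Hypothetical variable: the four pieces discussed in the text,
;; joined in order.
(defvar my-sentence-end
  (concat "[.?!]"              ; the end mark
          "[]\"')}]*"          ; optional closing quotes or braces
          "\\($\\|\t\\|  \\)"  ; end of line, a tab, or two spaces
          "[\n]*")             ; any following carriage returns
  "Sketch of the sentence-end pattern built up in the text.")

;; Hypothetical helper for trying the pattern on a string.
(defun my-first-sentence (text)
  "Return TEXT up to the end of its first sentence, or nil if none is found."
  (when (string-match my-sentence-end text)
    ;; Group 1 is the space alternative, so it begins right after
    ;; the end mark and any closing marks.
    (substring text 0 (match-beginning 1))))

(my-first-sentence "See p. 3 for details.  Then stop.")
;; => "See p. 3 for details."
```

Note that the period after `p' is skipped, because it is followed by only one space.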