When we first start thinking about how to count the words in a
function definition, the first question is (or ought to be) what are
we going to count? When we speak of `words' with respect to a Lisp
function definition, we are actually speaking, in large part, of
`symbols'. For example, the following multiply-by-seven function
contains the five symbols `defun', `multiply-by-seven', `number',
`*', and `7'. In
addition, in the documentation string, it contains the four words
`Multiply', `NUMBER', `by', and `seven'. The
symbol `number' is repeated, so the definition contains a total
of ten words and symbols.
(defun multiply-by-seven (number)
  "Multiply NUMBER by seven."
  (* 7 number))
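The hand count of ten can be checked with a small sketch. The helper below, `my-count-atoms-and-doc-words', is a hypothetical name introduced only for this illustration; it is not part of Emacs. It assumes that every non-string atom (including the number 7, which the text counts among the symbols) counts as one, and that the words of a string are separated by whitespace or periods.

```elisp
(defun my-count-atoms-and-doc-words (form)
  "Return the number of atoms in FORM plus the words in its strings.
Hypothetical helper for illustration only."
  (cond
   ;; The empty list that ends every list counts for nothing.
   ((null form) 0)
   ;; A string contributes one for each word it contains.
   ((stringp form)
    (length (split-string form "[ \t\n.]+" t)))
   ;; A cons cell: count its car and its cdr.
   ((consp form)
    (+ (my-count-atoms-and-doc-words (car form))
       (my-count-atoms-and-doc-words (cdr form))))
   ;; Any other atom (a symbol or a number) counts as one.
   (t 1)))

;; Read the definition as data and count it: the result is 10.
(my-count-atoms-and-doc-words
 (read "(defun multiply-by-seven (number)
  \"Multiply NUMBER by seven.\"
  (* 7 number))"))
```

Here the six non-string atoms (with `number' appearing twice) and the four words of the documentation string add up to ten.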
However, if we mark the multiply-by-seven definition with C-M-h
(mark-defun), and then call count-words-region on it, we will find
that count-words-region claims the definition has eleven words, not
ten! Something is wrong!
The problem is twofold: count-words-region does not count the `*' as
a word, and it counts the single symbol, multiply-by-seven, as
containing three words. The hyphens are treated as if they were
interword spaces rather than intraword connectors:
`multiply-by-seven' is counted as if it were written
`multiply by seven'.
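Both effects can be observed with word motion alone. The following is a minimal sketch, not the count-words-region definition itself: it uses the built-in `forward-word' command in a temporary buffer and assumes the Emacs Lisp mode syntax table. The name `my-count-forward-words' is hypothetical.

```elisp
(defun my-count-forward-words (text)
  "Return the number of words `forward-word' finds in TEXT.
TEXT is scanned in a temporary buffer that uses the Emacs Lisp
syntax table.  Hypothetical helper for illustration only."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    (let ((count 0))
      ;; `forward-word' returns nil once no further word can be found.
      (while (forward-word 1)
        (setq count (1+ count)))
      count)))

(my-count-forward-words "multiply-by-seven")  ; 3: one symbol, three words
(my-count-forward-words "(* 7 number)")       ; 2: the `*' is passed over
```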
The cause of this confusion is the regular expression search within
the count-words-region definition that moves point forward word by
word. In the canonical version of count-words-region, the regexp is:
"\\w+\\W*"
This regular expression is a pattern defining one or more word constituent characters, optionally followed by any number of characters that are not word constituents. What is meant by `word constituent characters' brings us to the issue of syntax, which is worth a section of its own.
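To see what this regexp actually matches, the sketch below collects each successive match in a piece of text. It assumes the Emacs Lisp mode syntax table, and `my-regexp-matches' is a hypothetical helper written for this illustration rather than anything taken from count-words-region.

```elisp
(defun my-regexp-matches (regexp text)
  "Return a list of the successive matches of REGEXP in TEXT.
TEXT is searched in a temporary buffer that uses the Emacs Lisp
syntax table.  Hypothetical helper for illustration only."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    (let (matches)
      ;; Each successful search leaves point after the match,
      ;; so the next search continues from there.
      (while (re-search-forward regexp nil t)
        (push (match-string 0) matches))
      (nreverse matches))))

(my-regexp-matches "\\w+\\W*" "multiply-by-seven")
;; => ("multiply-" "by-" "seven")

(my-regexp-matches "\\w+\\W*" "(* 7 number)")
;; => ("7 " "number)")
```

Each hyphen is swallowed by the `\\W*' part of the pattern, so the single symbol yields three matches, and the search skips over `*' because the asterisk cannot begin a match. Applied to the whole multiply-by-seven definition, the same helper returns a list of eleven matches.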