

Enabling Multibyte Characters

You can enable or disable multibyte character support, either for Emacs as a whole, or for a single buffer. When multibyte characters are disabled in a buffer, each byte in that buffer represents a character, even codes 0200 through 0377 (octal). The old features for supporting the European character sets, ISO Latin-1 and ISO Latin-2, work as they did in Emacs 19.

However, there is no need to turn off multibyte character support to use ISO Latin-1 or ISO Latin-2; the Emacs multibyte character set includes all the characters in these character sets, and Emacs can translate automatically to and from either of these ISO codes.

To edit a particular file in unibyte representation, visit it using find-file-literally. See section Visiting Files. To convert a buffer in multibyte representation into a single-byte representation of the same characters, the easiest way is to save the contents in a file, kill the buffer, and find the file again with find-file-literally.

To turn off multibyte character support by default, start Emacs with the `--unibyte' option (see section Initial Options), or set the environment variable `EMACS_UNIBYTE'.

The mode line indicates whether multibyte character support is enabled in the current buffer. If it is, there are two or more characters (most often two dashes) before the colon near the beginning of the mode line. When multibyte characters are not enabled, just one dash precedes the colon.

When multibyte characters are enabled, character codes 0240 (octal) through 0377 (octal) are not valid in the buffer. The valid non-ASCII printing characters have codes starting at 0400 (octal).
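The octal bounds above can be easier to follow in decimal and hexadecimal. The following Python fragment is only an illustrative sketch, not part of Emacs; the names LOW, HIGH and in_latin_range are hypothetical and were chosen for this example.

```python
# Octal bounds quoted in the text, written as Python octal literals.
LOW, HIGH = 0o240, 0o377


def in_latin_range(code: int) -> bool:
    """Return True if CODE lies in the range 0240..0377 (octal)."""
    return LOW <= code <= HIGH


print(LOW, HIGH)            # 160 255
print(hex(LOW), hex(HIGH))  # 0xa0 0xff
print(0o400)                # 256, the first code past the single-byte range
```

The range therefore covers the upper part of an eight-bit byte, and 0400 is the first value that no longer fits in a single byte.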

If you type a self-inserting character in the invalid range 0240 through 0377, Emacs assumes you intended to use one of the ISO Latin-n character sets, and converts it to the Emacs code representing that Latin-n character. You select which ISO Latin character set to use through your choice of language environment (see below). If you do not specify a choice, the default is Latin-1.

The same thing happens when you use C-q to enter an octal code in this range.
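The choice of Latin-n character set matters because one code in this range can stand for different characters in different sets. The Python sketch below illustrates this with the standard ISO 8859 codecs; it does not model Emacs's internal character codes, and the helper name interpret is hypothetical.

```python
def interpret(code: int, charset: str) -> str:
    """Decode a single byte value under the given ISO 8859 character set."""
    return bytes([code]).decode(charset)


code = 0o261  # one code in the range 0240..0377 (octal)
latin1 = interpret(code, "iso8859_1")  # PLUS-MINUS SIGN in Latin-1
latin2 = interpret(code, "iso8859_2")  # small a with ogonek in Latin-2
print(latin1, latin2)

# Some codes name the same character in both sets, for example 0351.
same = interpret(0o351, "iso8859_1") == interpret(0o351, "iso8859_2")
print(same)
```

Under these assumptions, code 0261 comes out as a different character depending on the set, while code 0351 is the same letter in both.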

