This section describes how to scan a string containing multibyte
characters, one character at a time. The difficulty in doing this
is to know how many bytes each character contains. Your program
can use mblen to find this out.
The mblen function, called with a non-null string argument, returns
the number of bytes that make up the multibyte character beginning at
string, never examining more than size bytes. (The idea is
to supply for size the number of bytes of data you have in hand.)
The return value of mblen distinguishes three possibilities: the
first size bytes at string start with a valid multibyte
character, they start with an invalid byte sequence or just part of a
character, or string points to an empty string (a null character).
For a valid multibyte character, mblen returns the number of
bytes in that character (always at least 1, and never more than
size). For an invalid byte sequence, mblen returns -1. For an
empty string, it returns 0.
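The three return cases can be put together in a scanning loop. The following is a minimal sketch, not a definitive implementation: the function name count_mbchars is hypothetical, and the sketch assumes the caller has already selected the desired locale with setlocale. It uses only the standard mblen call, declared in stdlib.h as int mblen (const char *string, size_t size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: count the multibyte characters found in the
   first SIZE bytes of STRING, stopping early at a null character.
   Returns -1 if an invalid or incomplete byte sequence is met.  */
static long
count_mbchars (const char *string, size_t size)
{
  long count = 0;

  assert (string != NULL);

  /* Start from the standard initial shift state.  */
  mblen (NULL, 0);

  while (size > 0)
    {
      int len = mblen (string, size);

      if (len < 0)
        return -1;              /* Invalid sequence or partial character.  */
      if (len == 0)
        break;                  /* Null character: end of the string.  */

      string += len;            /* Step over one whole character.  */
      size -= (size_t) len;     /* len never exceeds size.  */
      count++;
    }

  return count;
}
```

A caller would typically pass strlen (string) for size, or the number of bytes actually read into a buffer. Because mblen reports both invalid and incomplete sequences as -1, this sketch cannot tell a truncated buffer from corrupt data.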
If the multibyte character code uses shift characters, then mblen
maintains and updates a shift state as it scans. If you call
mblen with a null pointer for string, that initializes the
shift state to its standard initial value. The same call also returns
a nonzero value if the multibyte character code in use actually has a
shift state, and zero if it does not.
See section Multibyte Codes Using Shift Sequences.
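The null-pointer call can be wrapped so that a program resets the shift state before each new scan and learns whether the encoding is state-dependent. This is only a sketch, and the name reset_mblen_state is hypothetical:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: put mblen's internal shift state back to its
   standard initial value.  Returns 1 if the multibyte character code
   of the current locale has a shift state, 0 if it does not.  */
static int
reset_mblen_state (void)
{
  /* With a null string argument the size argument is not used.  */
  return mblen (NULL, 0) != 0;
}
```

Calling such a helper before scanning each independent string keeps shift state left over from an earlier, possibly abandoned, scan from affecting the new one.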