NAME

mblen — get number of bytes in a character

SYNOPSIS

#include <stdlib.h>

int mblen(const char *s, size_t n);

DESCRIPTION

[CX] [Option Start] Except for requirements relating to data races, the functionality described on this reference page is aligned with the ISO C standard. Any other conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2024 defers to the ISO C standard for all mblen() functionality except in relation to data races. [Option End]

If s is not a null pointer, mblen() shall determine the number of bytes constituting the character pointed to by s. Except that the shift state of mbtowc() is not affected, it shall be equivalent to:

mbtowc((wchar_t *)0, (const char *)0, 0);
mbtowc((wchar_t *)0, s, n);
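As an informal illustration of the equivalence above, the following sketch walks a multi-byte string one character at a time, using mblen() only to measure each character without storing a wide-character code. The helper name count_mb_chars() is hypothetical and not part of any standard interface; the sketch assumes s is a null-terminated string and that no other thread calls mblen() concurrently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the characters in the multi-byte string s
 * under the current LC_CTYPE locale. Returns -1 if the bytes do not
 * form a valid character sequence. */
long count_mb_chars(const char *s)
{
    size_t left = strlen(s);        /* bytes not yet examined */
    long count = 0;

    (void)mblen(NULL, 0);           /* return mblen() to its initial state */
    while (left > 0) {
        int len = mblen(s, left);   /* result is never greater than left */
        if (len < 0)
            return -1;              /* invalid or incomplete character */
        if (len == 0)
            break;                  /* null byte; not expected while left > 0 */
        s += len;
        left -= (size_t)len;
        count++;
    }
    return count;
}
```

Passing the number of remaining bytes as n, rather than a fixed constant, keeps mblen() from examining bytes past the terminating null byte.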

The implementation shall behave as if no function defined in this volume of POSIX.1-2024 calls mblen().

The behavior of this function is affected by the LC_CTYPE category of the current locale. For a state-dependent encoding, this function shall be placed into its initial state at program startup and can be returned to that state by a call for which its character pointer argument, s, is a null pointer. Subsequent calls with s as other than a null pointer shall cause the internal state of the function to be altered as necessary. A call with s as a null pointer shall cause this function to return a non-zero value if encodings have state dependency, and 0 otherwise. If the implementation employs special bytes to change the shift state, these bytes shall not produce separate wide-character codes, but shall be grouped with an adjacent character. Changing the LC_CTYPE category causes the shift state of this function to be unspecified.
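Because changing LC_CTYPE leaves the shift state unspecified, an application might reset mblen() immediately after selecting a new locale. The sketch below shows one way to do so; the helper name switch_ctype_locale() is hypothetical, and the single call with a null pointer is used both to return the function to its initial state and to report whether the encoding is state-dependent.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: select a new LC_CTYPE locale, then reset mblen().
 * Returns 1 if the new encoding is state-dependent, 0 if it is not,
 * and -1 if the locale could not be selected. */
int switch_ctype_locale(const char *name)
{
    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;                  /* locale unchanged; nothing to reset */

    /* A null s resets the shift state and reports state dependency. */
    return mblen(NULL, 0) != 0;
}
```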

The mblen() function [CX] [Option Start] need not be thread-safe; however, it [Option End] shall avoid data races with all other functions.

RETURN VALUE

If s is a null pointer, mblen() shall return a non-zero value if character encodings are state-dependent, and 0 if they are not. If s is not a null pointer, mblen() shall either return 0 (if s points to the null byte), or return the number of bytes that constitute the character (if the next n or fewer bytes form a valid character), or return -1 (if they do not form a valid character) [CX] [Option Start] and may set errno to indicate the error. [Option End] In no case shall the value returned be greater than n or the value of the {MB_CUR_MAX} macro.
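The three outcomes for a non-null s can be told apart as in the following sketch. The enumeration first_char and the helper classify_first_char() are hypothetical names introduced only for illustration. Since setting errno on failure is optional, the sketch clears errno before the call and treats it as a hint rather than a guarantee.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical classification of the result of mblen() for a non-null s. */
enum first_char { FIRST_IS_NUL, FIRST_IS_CHAR, FIRST_IS_INVALID };

/* Examine at most n bytes starting at s (which must not be a null pointer).
 * On FIRST_IS_CHAR, *len_out receives the byte length of the character. */
enum first_char classify_first_char(const char *s, size_t n, size_t *len_out)
{
    int len;

    errno = 0;                      /* lets the caller inspect errno afterwards */
    len = mblen(s, n);
    if (len < 0)
        return FIRST_IS_INVALID;    /* errno may be EILSEQ, but need not be */
    if (len == 0)
        return FIRST_IS_NUL;        /* s points to the null byte */

    *len_out = (size_t)len;         /* never greater than n or MB_CUR_MAX */
    return FIRST_IS_CHAR;
}
```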

ERRORS

The mblen() function may fail if:

[EILSEQ]
[CX] [Option Start] An invalid character sequence is detected. In the POSIX locale an [EILSEQ] error cannot occur since all byte values are valid characters. [Option End]

The following sections are informative.

EXAMPLES

None.

APPLICATION USAGE

None.

RATIONALE

When the ISO C standard introduced threads in C11, it required mblen() to avoid data races (with itself as well as with other functions), whereas POSIX.1-2008 did not require it to be thread-safe, and in many implementations it did not avoid data races with itself and still does not. The ISO C committee intend to change the requirements in a future version of the ISO C standard, but since POSIX.1 currently refers to C17 it is necessary for it not to defer to the ISO C standard regarding data races in order to continue to allow this function not to avoid data races with itself.

FUTURE DIRECTIONS

It is expected that a change in a future version of the ISO C standard will allow a future version of this standard to remove the data race exception from the statement that it defers to the ISO C standard.

SEE ALSO

mbtowc, mbstowcs, wctomb, wcstombs

XBD <stdlib.h>

CHANGE HISTORY

First released in Issue 4. Aligned with the ISO C standard.

Issue 7

POSIX.1-2008, Technical Corrigendum 1, XSH/TC1-2008/0367 [109] is applied.

POSIX.1-2008, Technical Corrigendum 2, XSH/TC2-2008/0204 [663,674] is applied.

Issue 8

Austin Group Defects 708 and 1302 are applied, aligning this function with the ISO/IEC 9899:2018 standard, except in relation to data races.

End of informative text.