mbsrtowcs, mbsrtowcs_s

From cppreference.com
Defined in header <wchar.h>
(1)
size_t mbsrtowcs( wchar_t* dst, const char** src, size_t len, mbstate_t* ps );
(since C95)
size_t mbsrtowcs( wchar_t *restrict dst, const char **restrict src, size_t len,
                  mbstate_t *restrict ps);
(since C99)
errno_t mbsrtowcs_s( size_t *restrict retval,
                     wchar_t *restrict dst, rsize_t dstsz,
                     const char **restrict src, rsize_t len,
                     mbstate_t *restrict ps );
(2) (since C11)
1) Converts a null-terminated multibyte character sequence, which begins in the conversion state described by *ps, from the array whose first element is pointed to by *src to its wide character representation. If dst is not null, converted characters are stored in the successive elements of the wchar_t array pointed to by dst. No more than len wide characters are written to the destination array. Each multibyte character is converted as if by a call to mbrtowc. The conversion stops if:
* The multibyte null character was converted and stored. *src is set to NULL and *ps represents the initial shift state.
* An invalid multibyte character (according to the current C locale) was encountered. *src is set to point at the beginning of the first unconverted multibyte character.
* Storing the next wide character would exceed len. *src is set to point at the beginning of the first unconverted multibyte character. This condition is not checked if dst==NULL.
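The stop conditions above make it possible to convert a long string through a small fixed buffer. The following is a minimal sketch, not a definitive implementation: count_in_chunks and the 4-element buffer are hypothetical names and sizes chosen for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts mbstr in chunks of at most 4 wide characters.
   Returns the total number of wide characters converted (excluding the
   terminator), or (size_t)-1 if an invalid multibyte character is found. */
size_t count_in_chunks(const char *mbstr)
{
    wchar_t chunk[4];
    mbstate_t state;
    memset(&state, 0, sizeof state); /* initial shift state */

    const char *src = mbstr;
    size_t total = 0;
    while (src != NULL) {
        size_t n = mbsrtowcs(chunk, &src, sizeof chunk / sizeof chunk[0], &state);
        if (n == (size_t)-1)
            return (size_t)-1; /* src points at the offending character */
        total += n;
        /* src == NULL: the terminating null was converted and stored.
           src != NULL: the chunk filled up; src marks where to resume. */
    }
    return total;
}
```

Because *src and *ps are updated by each call, the loop can resume exactly where the previous call stopped, and it ends once *src has been set to NULL.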
2) Same as (1), except that
* the function returns its result as an out-parameter retval
* if no null character was written to dst after len wide characters were written, then L'\0' is stored in dst[len], which means len+1 total wide characters are written
* the function may clobber the destination array from the terminating null up to dstsz
* If src and dst overlap, the behavior is unspecified.
* the following errors are detected at runtime and call the currently installed constraint handler function:
  • retval, ps, src, or *src is a null pointer
  • dstsz or len is greater than RSIZE_MAX/sizeof(wchar_t) (unless dst is null)
  • dstsz is zero (unless dst is null), or dstsz is not zero when dst is null
  • There is no null character in the first dstsz multibyte characters in the *src array and len is not less than dstsz (unless dst is null)
As with all bounds-checked functions, mbsrtowcs_s is only guaranteed to be available if __STDC_LIB_EXT1__ is defined by the implementation and if the user defines __STDC_WANT_LIB_EXT1__ to the integer constant 1 before including wchar.h.
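Since many implementations do not provide the bounds-checked functions, code that wants to use mbsrtowcs_s typically needs a fallback. The sketch below shows one way to arrange that; to_wide_checked is a hypothetical helper, and the fallback branch only approximates the truncation behavior described in (2).

```c
#define __STDC_WANT_LIB_EXT1__ 1 /* must precede the first include of wchar.h */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts mbstr into dst, which has dstsz elements
   (dstsz must be greater than zero). Returns 0 on success and stores the
   number of converted wide characters in *out_len; returns nonzero on error. */
int to_wide_checked(wchar_t *dst, size_t dstsz, const char *mbstr, size_t *out_len)
{
    mbstate_t state;
    memset(&state, 0, sizeof state); /* initial shift state */
    const char *src = mbstr;
#ifdef __STDC_LIB_EXT1__
    /* len = dstsz - 1 leaves room for the terminator the function appends */
    return mbsrtowcs_s(out_len, dst, dstsz, &src, dstsz - 1, &state) != 0;
#else
    /* Fallback without Annex K: plain mbsrtowcs plus manual termination */
    size_t n = mbsrtowcs(dst, &src, dstsz - 1, &state);
    if (n == (size_t)-1) {
        dst[0] = L'\0';
        return 1;
    }
    dst[n] = L'\0';
    *out_len = n;
    return 0;
#endif
}
```

Passing dstsz - 1 as len keeps the call clear of the constraint on strings longer than the destination: the result is truncated and null-terminated instead.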

Parameters

dst - pointer to wide character array where the results will be stored
src - pointer to pointer to the first element of a null-terminated multibyte string
len - number of wide characters available in the array pointed to by dst
ps - pointer to the conversion state object
dstsz - max number of wide characters that will be written (size of the dst array)
retval - pointer to a size_t object where the result will be stored

Return value

1) On success, returns the number of wide characters, excluding the terminating L'\0', written to the wide character array. If dst==NULL, returns the number of wide characters that would have been written given unlimited length. On conversion error (if an invalid multibyte character was encountered), returns (size_t)-1, stores EILSEQ in errno, and leaves *ps in an unspecified state.
2) Zero on success (in which case the number of wide characters, excluding the terminating null, that were or would be written to dst is stored in *retval), nonzero on error. In case of a runtime constraint violation, stores (size_t)-1 in *retval (unless retval is null) and sets dst[0] to L'\0' (unless dst is null or dstsz is zero or greater than RSIZE_MAX).
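The return values described in (1) support a count-then-convert pattern with explicit error handling. The following is a sketch under stated assumptions: to_wide_alloc is a hypothetical helper, and resetting the state before the second pass is a conservative choice rather than a requirement.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns a malloc'ed wide copy of mbstr, or NULL on
   invalid input (errno is then EILSEQ) or allocation failure.
   The caller frees the result. */
wchar_t *to_wide_alloc(const char *mbstr)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);
    const char *src = mbstr;

    /* Pass 1: dst == NULL, so only the count is returned and src is untouched */
    size_t n = mbsrtowcs(NULL, &src, 0, &state);
    if (n == (size_t)-1)
        return NULL; /* the conversion state is unspecified here */

    wchar_t *wstr = malloc((n + 1) * sizeof *wstr);
    if (wstr == NULL)
        return NULL;

    /* Pass 2: convert for real, starting again from the initial state */
    memset(&state, 0, sizeof state);
    if (mbsrtowcs(wstr, &src, n + 1, &state) == (size_t)-1) {
        free(wstr);
        return NULL;
    }
    return wstr;
}
```

Checking for (size_t)-1 before adding one for the terminator avoids wrapping the length around to zero on invalid input.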

Example

#include <stdio.h>
#include <locale.h>
#include <wchar.h>
#include <string.h>
 
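void print_as_wide(const char* mbstr)
{
    mbstate_t state;
    memset(&state, 0, sizeof state); // zero-initialized state is the initial shift state
    // dst == NULL: only count the wide characters; mbstr is not modified
    size_t n = mbsrtowcs(NULL, &mbstr, 0, &state);
    if (n == (size_t)-1) // invalid multibyte sequence; errno is EILSEQ
    {
        wprintf(L"Conversion error\n");
        return;
    }
    size_t len = 1 + n; // room for the terminating L'\0'
    wchar_t wstr[len];
    mbsrtowcs(&wstr[0], &mbstr, len, &state);
    wprintf(L"Wide string: %ls\n", wstr);
    wprintf(L"The length, including L'\\0': %zu\n", len);
}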
 
int main(void)
{
    setlocale(LC_ALL, "en_US.utf8");
    print_as_wide(u8"z\u00df\u6c34\U0001f34c"); // u8"zß水🍌"
}

Output:

Wide string: zß水🍌
The length, including L'\0': 5

References

  • C11 standard (ISO/IEC 9899:2011):
      • 7.29.6.4.1 The mbsrtowcs function (p: 445)
      • K.3.9.3.2.1 The mbsrtowcs_s function (p: 648-649)
  • C99 standard (ISO/IEC 9899:1999):
      • 7.24.6.4.1 The mbsrtowcs function (p: 391)

See also

mbstowcs, mbstowcs_s - converts a narrow multibyte character string to wide string (function)
mbrtowc - converts the next multibyte character to wide character, given state (function)
wcsrtombs, wcsrtombs_s - converts a wide string to narrow multibyte character string, given state (function)