char8_t in C++20 fixes some problems of char, so I was considering using char8_t instead of char for UTF-8 text (e.g. text from the command line). But then I noticed that the standard does not specify strlen for use with char8_t; in fact, none of the functions in the cstring library are specified for it. Can I expect this to happen in a future standard update? Or was char8_t never intended to replace char in the way I had in mind?
I'm the author of the P0482 and P1423 char8_t proposals.
The intent of those proposals was to introduce the char8_t type with the same level of support present for char16_t and char32_t and then to follow up with additional functionality later. These proposals were adopted late in the C++20 development cycle (at the San Diego and Cologne meetings respectively), so there was no opportunity to deliver additional features for C++20.
One of the directives for SG16, as described in P1238, is to standardize new encoding-aware text container and view types. Work is progressing in this area and we hope to deliver it for C++23. It is hoped that these new containers and views will supplant much raw string handling in C++.
With regard to strlen specifically, strlen is a C API. N2231 is a proposal to add char8_t support to C (again, at the same level as the existing support for char16_t and char32_t). That proposal has not yet been accepted by WG14. Assuming it is eventually accepted, it would then make sense to follow up with additional char8_t-based C string management functions (perhaps enhancing support for char16_t and char32_t as well).
At present, I'm working on completing an implementation of N2231 in gcc and glibc. Once that is complete, I intend to submit a revision of N2231 to WG14.
You can help! SG16 is an open group. Please feel free to subscribe to our mailing list, join us on Slack, share your ideas, needs, and wants, and write proposals for new functionality (we can help with how to do that).