Copy a std::u8string into a c-style string of utf8 characters


Copying a std::string with no particular encoding into a C string is quite easy:

auto to_c_str(std::string const& str) -> char* {
    auto dest = new char[str.size() + 1];
    return strcpy(dest, str.c_str());
}

But how can I do the same with a std::u8string? Is there an STL algorithm that can help with that?

I tried this:

auto to_c_str(std::u8string const& str) -> char8_t* {
    auto dest = new char8_t[str.size() + 1];
    return std::strcpy(dest, str.c_str());
}

But of course, std::strcpy only accepts char pointers and is not overloaded for char8_t, so this does not compile.


Solution

  • In addition to using std::memcpy, you may use std::u8string::copy or std::copy. Both copy only the characters, so the null terminator has to be written explicitly.

    // Option 1: std::u8string::copy (needs <string>)
    auto to_c_str(std::u8string const& str) -> char8_t* {
        auto dest = new char8_t[str.size() + 1];
        str.copy(dest, str.size(), 0);
        dest[str.size()] = u8'\0';
        return dest;
    }
    
    // Option 2: std::copy (needs <algorithm> and <string>)
    auto to_c_str(std::u8string const& str) -> char8_t* {
        auto dest = new char8_t[str.size() + 1];
        std::copy(str.begin(), str.end(), dest);
        dest[str.size()] = u8'\0';
        return dest;
    }