
How to write char16_t code units into a stringstream as 2 bytes each


I wrote a UTF-8 to UTF-16 conversion that gives me the UTF-16 code units as char16_t values.

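#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

int main()
{
    // Builds as C++11 to C++17; in C++20 a u8 literal has type char8_t[]
    std::string u8 = u8"ʑʒʓʔ";

    // UTF-8 to UTF-16/char16_t (std::wstring_convert is deprecated since C++17)
    std::u16string u16_conv = std::wstring_convert<
                                  std::codecvt_utf8_utf16<char16_t>, char16_t>{}
                                  .from_bytes(u8);
    std::cout << "UTF-8 to UTF-16 conversion produced "
              << u16_conv.size() << " code units:\n";
    // Cast to an integer type so each code unit is printed as a number
    for (char16_t c : u16_conv)
        std::cout << std::hex << std::showbase << static_cast<unsigned>(c) << ' ';
}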

Output:

UTF-8 to UTF-16 conversion produced 4 code units:
0x291 0x292 0x293 0x294

I now need to pass the code units to a stringstream, if possible, but I don't know how to split each code unit into 2 bytes, like so:

0x02,0x91,0x02,0x92,0x02,0x93,0x02,0x94

Any suggestions? Should I convert it to a uint8_t vector first?


Solution

  • A straightforward way is to add each byte to a std::stringstream in a loop, high byte first (big-endian), which matches the byte order shown in the question.

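    std::stringstream ss;
    for (char16_t c : u16_conv)
    {
        ss << static_cast<char>(c >> 8);   // high byte, e.g. 0x02
        ss << static_cast<char>(c & 0xFF); // low byte, e.g. 0x91
    }

    // Print every byte held by the stream as an unsigned hex value
    std::string str = ss.str();
    for (char c : str) {
        std::cout << std::hex << std::showbase
                  << static_cast<int>(static_cast<unsigned char>(c)) << ' ';
    }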
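
The question also asks about going through a uint8_t vector first. Below is a minimal sketch of that route, assuming big-endian order as in the loop above. The helper names `to_bytes_be` and `to_stream_string` are hypothetical and are not part of the original answer; the second helper uses `std::stringstream::write` to copy the raw bytes in a single call instead of streaming them one at a time.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split every UTF-16 code unit into two bytes,
// high byte first (big-endian), e.g. 0x0291 -> 0x02, 0x91.
std::vector<std::uint8_t> to_bytes_be(const std::u16string& s)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t c : s) {
        bytes.push_back(static_cast<std::uint8_t>(c >> 8));   // high byte
        bytes.push_back(static_cast<std::uint8_t>(c & 0xFF)); // low byte
    }
    return bytes;
}

// Hypothetical helper: copy the raw bytes into a stringstream with a
// single unformatted write and return the stream's contents.
std::string to_stream_string(const std::vector<std::uint8_t>& bytes)
{
    std::stringstream ss;
    ss.write(reinterpret_cast<const char*>(bytes.data()),
             static_cast<std::streamsize>(bytes.size()));
    return ss.str();
}
```

Calling `to_bytes_be(u16_conv)` on the string from the question should give the eight bytes 0x02, 0x91, 0x02, 0x92, 0x02, 0x93, 0x02, 0x94. If the consumer expects little-endian data, swap the two `push_back` calls.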