
URL Encoding an Extended ASCII std::string in C++


I have a std::string filled with extended ASCII values (e.g. čáě). I need to URL encode this string so that JavaScript can decode it with decodeURIComponent.

I have tried converting it to UTF-16 and then to UTF-8 via the windows-1252 code page, but wasn't able to do so because there are not enough examples for the MultiByteToWideChar and WideCharToMultiByte functions.

I am compiling with MSVC-14.0 on Windows 10 64-bit.

How can I at least iterate over the individual bytes of the final UTF-8 string so that I can URL encode them?

Thanks


Solution

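  • You can use MultiByteToWideChar to convert the string to UTF-16, convert that to UTF-8 with WideCharToMultiByte, and then percent-encode the UTF-8 bytes one by one. Encoding the UTF-16 values directly does not work: decodeURIComponent expects every %XX escape to be one byte of a UTF-8 sequence.

    Example code:

    #include <windows.h>
    #include <iomanip>
    #include <sstream>
    #include <string>

    std::string readData = "Extended ASCII characters (ěščřžýáíé)";
    // 1252 is the Windows-1252 code page. The value must match the encoding the
    // bytes are really in: characters such as č, ě and ř are not part of
    // Windows-1252 but of Windows-1250 (Central European).
    const UINT codePage = 1252;
    const int srcLen = static_cast<int>(readData.size()); // explicit length, so the terminating null is not converted

    // Step 1: code page -> UTF-16 (the first call only queries the required size)
    int wideSize = MultiByteToWideChar(codePage, 0, readData.c_str(), srcLen, NULL, 0);
    std::wstring wide(wideSize, L'\0');
    MultiByteToWideChar(codePage, 0, readData.c_str(), srcLen, &wide[0], wideSize);

    // Step 2: UTF-16 -> UTF-8 (the last two arguments must be NULL for CP_UTF8)
    int utf8Size = WideCharToMultiByte(CP_UTF8, 0, wide.c_str(), wideSize, NULL, 0, NULL, NULL);
    std::string utf8(utf8Size, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.c_str(), wideSize, &utf8[0], utf8Size, NULL, NULL);

    // Step 3: percent-encode every UTF-8 byte as %xx
    std::stringstream encodeStream;
    encodeStream << std::hex << std::setfill('0');
    for (unsigned char byte : utf8) {
        // cast to int so the byte is printed as a number, not as a character
        encodeStream << '%' << std::setw(2) << static_cast<int>(byte);
    }

    std::string encodedString = encodeStream.str(); // the URL encoded string


    While this also encodes the basic ASCII characters ( < 128 ), which is not strictly necessary, the result is completely decodable by JavaScript's decodeURIComponent, which was the end goal.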
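
If escaping every ASCII character makes the URL too long or hard to read, the encoding step can leave the RFC 3986 "unreserved" characters (letters, digits, `-`, `_`, `.`, `~`) untouched. Below is a minimal, portable sketch of that idea. The helper name `percentEncodeUtf8` is hypothetical, and it assumes its input is already UTF-8; it does not perform any code page conversion itself.

```cpp
#include <string>

// Hypothetical helper (not part of the original answer): percent-encodes a
// string that is assumed to already be UTF-8. Unreserved ASCII characters are
// copied through; every other byte becomes %XX.
std::string percentEncodeUtf8(const std::string& utf8) {
    static const char hexDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size() * 3); // worst case: every byte is escaped

    // Iterate as unsigned char so bytes >= 0x80 are not treated as negative.
    for (unsigned char byte : utf8) {
        const bool unreserved =
            (byte >= 'A' && byte <= 'Z') || (byte >= 'a' && byte <= 'z') ||
            (byte >= '0' && byte <= '9') ||
            byte == '-' || byte == '_' || byte == '.' || byte == '~';
        if (unreserved) {
            out += static_cast<char>(byte);
        } else {
            out += '%';
            out += hexDigits[byte >> 4];   // high nibble
            out += hexDigits[byte & 0x0F]; // low nibble
        }
    }
    return out;
}
```

For example, the UTF-8 bytes of č (`"\xC4\x8D"`) come out as `%C4%8D`, which decodeURIComponent turns back into č, while a string such as `abc-_.~` passes through unchanged.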