c++ printf

swprintf truncation causes unexpected output


I'm fixing legacy code that runs on Linux and Windows. In some cases, the buffers that are supposed to hold formatted content are smaller than that content.

The code uses swprintf. According to the documentation for its size parameter:

size - up to size - 1 characters may be written, plus the null terminator

so it should truncate the string. But when I tried it on Coliru, I got unexpected results:

#include <iostream> 
#include <string> 
#include <cwchar> 

int main()
{

    wchar_t wide[5];

    std::swprintf(wide, sizeof wide/sizeof *wide, L"%ls", L"111111111");

    std::wcout << wide;
}

results in 1111??, but

#include <iostream> 
#include <string> 
#include <cwchar> 

int main()
{

    wchar_t wide[20];

    std::swprintf(wide, sizeof wide/sizeof *wide, L"%ls", L"111111111");

    std::wcout << wide;
}

works just fine.

What's wrong?

P.S. I wish I could change everything to C++ streams/strings, but I can't: wchar_t arrays are used everywhere.


Solution

  • tl;dr: For one reason or another, those null-termination semantics are dependent on the function call succeeding, and for swprintf it only succeeds if the buffer is big enough. Hence, the array in your first attempt is not null-terminated.


    This is subtle, but swprintf isn't like snprintf. It doesn't write "at most N-1 characters" and consider that successful in all cases.

    Here's what the same documentation says about the return value from swprintf:

    Return value: Number of wide characters written (not counting the terminating null wide character) if successful or negative value if an encoding error occurred or if the number of characters to be generated was equal or greater than size (including when size is zero)

    And, indeed, your attempt returns -1.

    From this (and from the note underneath that quote) we can ascertain that swprintf considers the operation a failure if there wasn't enough room (counted in wide characters, not bytes) in the provided output buffer. It won't overrun that buffer, but it also may not complete its work, and its work includes writing a null terminator. Without that null terminator, std::wcout will read past the end of the wchar_t* you're [effectively] passing to it, and your program has undefined behaviour.
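    As a minimal sketch of acting on that return value: the helper below (format_or_empty is a hypothetical name, not something from your code base or the standard library) checks the result of swprintf and, on failure, turns the buffer into an empty string so that it is at least safe to print. It assumes you would rather get an empty string than rely on whatever the implementation left in the buffer.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical helper: copies `text` into `buf` via swprintf and guarantees
// that `buf` holds a null-terminated string afterwards, even on failure.
// Returns true only if the whole string (plus terminator) fit.
bool format_or_empty(wchar_t* buf, std::size_t size, const wchar_t* text)
{
    if (size == 0) {
        return false;               // nowhere to put even a terminator
    }

    const int written = std::swprintf(buf, size, L"%ls", text);
    if (written < 0) {
        // Failure: the output did not fit, or an encoding error occurred.
        // Don't trust the buffer contents; make it an empty string instead.
        buf[0] = L'\0';
        return false;
    }
    return true;
}
```

    With wchar_t wide[5], format_or_empty(wide, 5, L"111111111") returns false and leaves wide empty, so std::wcout << wide is well-defined.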


    I concede that this, on a casual reading, would seem to contradict the semantics surrounding the size parameter, for which C11 states:

    No more than n wide characters are written, including a terminating null wide character, which is always added (unless n is zero).

    …without stating any condition on whether the function call was otherwise successful.

    There may be scope for calling this an editorial defect in the standard, or an implementation bug. But even if neither were true, your function call was deemed unsuccessful, so I don't think you should rely on the contents of the buffer afterwards.

    We can at least see that the libc intent matches the above run-down, from this manual page on Formatted Output Functions:

    The return value is the number of characters generated for the given input, excluding the trailing null. If not all output fits into the provided buffer a negative value is returned. You should try again with a bigger output string. Note: this is different from how snprintf handles this situation.


    You're going to have to heed the aforementioned note:

    While narrow strings provide std::snprintf, which makes it possible to determine the required output buffer size, there is no equivalent for wide strings, and in order to determine the buffer size, the program may need to call std::swprintf, check the result value, and reallocate a larger buffer, trying again until successful.

    …or switch to some other functionality altogether.
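
    The "try again with a bigger buffer" approach from that note could be sketched roughly like this. format_wide is a hypothetical helper, and the starting size and upper limit are arbitrary assumptions. Because a negative return value can also mean an encoding error, the loop needs some cap or it would never terminate in that case.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: formats into a temporary buffer, doubling its size
// until vswprintf succeeds. Returns an empty string if it gives up.
std::wstring format_wide(const wchar_t* fmt, ...)
{
    std::vector<wchar_t> buf(16);                 // arbitrary starting size
    const std::size_t max_size = std::size_t(1) << 20;  // arbitrary cap

    for (;;) {
        va_list args;
        va_start(args, fmt);                      // restart the argument list for each attempt
        const int written = std::vswprintf(buf.data(), buf.size(), fmt, args);
        va_end(args);

        if (written >= 0) {
            // Success: `written` excludes the terminating null.
            return std::wstring(buf.data(), static_cast<std::size_t>(written));
        }
        if (buf.size() >= max_size) {
            // Still failing at the cap: either the output is huge or this is
            // an encoding error, which a bigger buffer will never fix.
            return std::wstring();
        }
        buf.resize(buf.size() * 2);
    }
}
```

    The result can then be copied into one of the existing wchar_t arrays, truncating explicitly if that is what the legacy code expects.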