I've been implementing a codecvt facet to handle indentation of output streams. It can be used like this and works fine:
std::cout << indenter::push << "I'm indented" << indenter::pop << "\n I'm not...";
However, while I can imbue any std::ostream with a std::codecvt, I was very confused when I found out that my code worked with std::cout as well as std::ofstream, but not, for example, with std::ostringstream, even though all of them inherit from the base class std::ostream.
The facet is constructed normally, the code compiles, and it doesn't throw any exceptions. It's just that none of the member functions of the std::codecvt are called.
I found that very confusing, and it took me a lot of time to figure out that std::codecvt won't do anything on non-file I/O streams.
Is there any reason std::codecvt is not used by all classes that inherit from std::ostream?
Furthermore, does anyone have an idea which structs I could fall back on to implement the indenter?
Edit: this is the part of the language I'm referring to:
All file I/O operations performed through std::basic_fstream use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.
Source: https://en.cppreference.com/w/cpp/locale/codecvt
I've made a small example illustrating my problem:
#include <cwchar>
#include <fstream>
#include <iostream>
#include <locale>
#include <sstream>

// Counts how often the facet's do_out() is invoked.
static auto invocation_counter = 0u;

struct custom_facet : std::codecvt<char, char, std::mbstate_t>
{
    using parent_t = std::codecvt<char, char, std::mbstate_t>;
    custom_facet() : parent_t(std::size_t { 0u }) {}
    using parent_t::intern_type;
    using parent_t::extern_type;
    using parent_t::state_type;

    // Copies the input to the output unchanged and records the call.
    std::codecvt_base::result do_out(state_type& state, const intern_type* from, const intern_type* from_end, const intern_type*& from_next,
                                     extern_type* to, extern_type* to_end, extern_type*& to_next) const override
    {
        while (from < from_end && to < to_end)
        {
            *to = *from;
            to++;
            from++;
        }
        invocation_counter++;
        from_next = from;
        to_next = to;
        return std::codecvt_base::noconv;
    }

    // Report that a conversion is needed so the stream buffer consults the facet.
    bool do_always_noconv() const noexcept override
    {
        return false;
    }
};

// Stream manipulator that imbues the stream with a locale containing custom_facet.
std::ostream& imbueFacet (std::ostream& ostream)
{
    ostream.imbue(std::locale { ostream.getloc(), new custom_facet{} });
    return ostream;
}

int main()
{
    std::ios::sync_with_stdio(false);
    std::cout << "invocation_counter = " << invocation_counter << "\n";
    {
        // File stream: the facet is used.
        auto ofstream = std::ofstream { "testFile.txt" };
        ofstream << imbueFacet << "test\n";
    }
    std::cout << "invocation_counter = " << invocation_counter << "\n";
    {
        // String stream: the facet is never called.
        auto osstream = std::ostringstream {};
        osstream << imbueFacet << "test\n";
    }
    std::cout << "invocation_counter = " << invocation_counter << "\n";
}
I would expect invocation_counter to increase after streaming into the std::ostringstream, but it doesn't.
After more research I found out that I could use std::wbuffer_convert. To quote https://en.cppreference.com/w/cpp/locale/wbuffer_convert:
std::wbuffer_convert is a wrapper over stream buffer of type std::basic_streambuf<char> which gives it the appearance of std::basic_streambuf<Elem>. All I/O performed through std::wbuffer_convert undergoes character conversion as defined by the facet Codecvt. [...] This class template makes the implicit character conversion functionality of std::basic_filebuf available for any std::basic_streambuf.
This way I can apply a facet to the contents of a std::ostringstream:
auto osstream = std::ostringstream {};
osstream << "test\n";
auto facet = custom_facet{};
std::wstring_convert<custom_facet, char> conv;
auto str = conv.to_bytes(osstream.str());
However, I lose the ability to chain facets using the streaming operator <<.
This confuses me even more as to why std::codecvt is not implicitly used by ALL output streams. All output streams write through a std::basic_streambuf, whose interface is well suited to std::codecvt, which just works on an input and an output character sequence, both of which std::basic_streambuf already implements.
So why is the handling of std::codecvt implemented in std::basic_filebuf instead of std::basic_streambuf? std::basic_filebuf inherits from std::basic_streambuf, after all...
Either I have some fundamental misunderstanding of how streams work in C++, or std::codecvt is poorly integrated in the standard. Maybe this is why it is marked as deprecated?
The std::codecvt facet was originally intended to handle I/O conversions between the on-disk and in-memory character representations. Quoted from section 39.4.6 of Bjarne Stroustrup's The C++ Programming Language, fourth edition:
Sometimes, the representation of characters stored in a file differs from the desired representation of those same characters in main memory. ... the codecvt facet provides a mechanism for converting characters from one representation to another as they are read or written.
The intended purpose was thus to use std::codecvt only for adapting characters between file (disk) and memory, which partly answers your question:
Why is std::codecvt only used by file I/O streams?
From the docs we see that:
All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.
This explains why std::ofstream (which uses a file-based stream buffer) and std::cout (linked to the standard output FILE stream, and backed by a file-style buffer in your test) invoke std::codecvt.
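To see which facet a file buffer would consult, you can query the locale imbued in a stream. The following is only a minimal sketch; the helper names stream_facet_is_noconv and default_facet_is_noconv are made up for illustration and are not part of your code. It shows that the standard char-to-char facet reports always_noconv() as true, which is presumably why your custom_facet overrides do_always_noconv() to return false.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <sstream>

// Hypothetical helper: asks the char-to-char codecvt facet of the locale
// imbued in a stream whether it declares itself an identity conversion.
bool stream_facet_is_noconv(const std::ios_base& stream)
{
    using facet_t = std::codecvt<char, char, std::mbstate_t>;
    const std::locale loc = stream.getloc();
    assert(std::has_facet<facet_t>(loc));   // a standard facet, always present
    return std::use_facet<facet_t>(loc).always_noconv();
}

// Hypothetical check on a stream that has no custom facet imbued.
bool default_facet_is_noconv()
{
    std::ostringstream stream;
    return stream_facet_is_noconv(stream);
}
```

After imbuing your custom_facet, the same query would return false, because use_facet then finds your facet in place of the standard one.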
Now, to use the high-level std::ostream interface you need to provide an underlying streambuf. std::ofstream provides a filebuf, and std::ostringstream provides a stringbuf (which does not use std::codecvt). See this post on streams, which also highlights the following:
...in the case of ofstream, there are also a few extra functions which forward to additional functions in the filebuf interface
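The difference in the underlying buffers can be checked directly. The following is a small sketch; the helper names uses_filebuf and uses_stringbuf are hypothetical. It inspects the dynamic type of the buffer returned by rdbuf(). What std::cout uses is up to the implementation, so it is left out here.

```cpp
#include <cassert>
#include <fstream>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical helper: true when the stream writes through a std::filebuf.
bool uses_filebuf(std::ostream& os)
{
    assert(os.rdbuf() != nullptr);
    return dynamic_cast<std::filebuf*>(os.rdbuf()) != nullptr;
}

// Hypothetical helper: true when the stream writes through a std::stringbuf.
bool uses_stringbuf(std::ostream& os)
{
    assert(os.rdbuf() != nullptr);
    return dynamic_cast<std::stringbuf*>(os.rdbuf()) != nullptr;
}
```

Passing a std::ofstream to uses_filebuf yields true, while a std::ostringstream only satisfies uses_stringbuf, so the facet-aware code in filebuf is never reached for string streams.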
However, to invoke the character conversion functionality of a std::codecvt on a std::ostringstream, which is a std::ostream with an underlying std::basic_streambuf, you can use std::wbuffer_convert (deprecated since C++17), as indicated in your post.
In your second update you used only std::wstring_convert, not std::wbuffer_convert.
With std::wbuffer_convert you can wrap the buffer of the original std::ostringstream and expose it through a new std::ostream as follows:
// Create a std::ostringstream
auto osstream = std::ostringstream{};
// Create the wrapper for the ostringstream
std::wbuffer_convert<custom_facet, char> wrapper(osstream.rdbuf());
// Now create a std::ostream which uses the wrapper to send data to
// the original std::ostringstream
std::ostream normal_ostream(&wrapper);
normal_ostream << "test\n";
// Flush the stream to invoke the conversion
normal_ostream << std::flush;
// Check the invocation_counter
std::cout << "invocation_counter after wrapping std::ostringstream with "
"std::wbuffer_convert = "
<< invocation_counter << "\n";
Together with the complete example here, the output would be:
invocation_counter start of test1 = 0
invocation_counter after std::ofstream = 1
> test printed to std::cout
invocation_counter after std::cout = 2
invocation_counter after std::ostringstream (should not have changed)= 2
ic after test1 = 2
invocation_counter after std::ostringstream with std::wstring_convert = 3
ic after test2 = 3
invocation_counter after wrapping std::ostringstream with std::wbuffer_convert = 4
ic after test3 = 4
std::codecvt was intended for converting between the on-disk and in-memory representations. That is why the std::codecvt implementation is only called by streams with an underlying file buffer, such as std::ofstream and, in your test, std::cout.
However, the stringbuf of a std::ostringstream can be wrapped in a std::wbuffer_convert and attached to a new std::ostream instance, which then invokes the underlying std::codecvt.
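As for what to fall back on for the indenter itself: one common alternative to a codecvt facet is a filtering stream buffer, which works with any std::ostream regardless of its underlying buffer. The following is only a sketch under assumptions. The names indent_buf and indent_demo are hypothetical, the indentation width is fixed, and your indenter::push and indenter::pop manipulators are not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter: forwards characters to another stream buffer and
// writes an indentation prefix at the start of every non-empty line.
class indent_buf : public std::streambuf
{
public:
    explicit indent_buf(std::streambuf* dest, std::size_t width = 4)
        : dest_(dest), indent_(width, ' ')
    {
        assert(dest_ != nullptr);
    }

protected:
    // No put area is set up, so every character written arrives here.
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);   // nothing to write
        const char c = traits_type::to_char_type(ch);
        if (at_line_start_ && c != '\n') {
            const auto n = static_cast<std::streamsize>(indent_.size());
            if (dest_->sputn(indent_.data(), n) != n)
                return traits_type::eof();
        }
        at_line_start_ = (c == '\n');
        return dest_->sputc(c);
    }

    // Flushing the filter flushes the wrapped buffer.
    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;
    std::string indent_;
    bool at_line_start_ = true;
};

// Hypothetical usage: indent text written into a std::ostringstream.
std::string indent_demo()
{
    std::ostringstream target;
    indent_buf buf(target.rdbuf());
    std::ostream out(&buf);
    out << "first\nsecond\n";
    out.flush();
    return target.str();   // "    first\n    second\n"
}
```

Because the filter only depends on std::streambuf, the same indent_buf could wrap std::cout.rdbuf() or the buffer of a std::ofstream.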