How to get part of a std::string into a streambuf without copying?


I'm using boost asio a lot lately and I find that I'm working with std::strings and asio::streambufs quite a bit. I find that I'm trying to get data back and forth between streambufs and strings a lot as part of parsing network data. In general, I don't want to mess around with 'formatted io', so iostreams aren't very useful. I've found that while ostream::operator<<(), in spite of the official documentation, seems to relay my strings into streambufs unmolested, istream::operator>>() mangles the contents of my streambufs (as you would expect given that it's 'formatted').

It really seems to me like the standard library is missing a whole lot of iterators and stream objects for dealing with streambufs and strings and unformatted io. For example, if I want to get a substring of a string into a streambuf, how do I do that without creating a copy of the string? A basic all-in-all-out transfer can be accomplished like:

// Get a whole string into a streambuf, and then get the whole streambuf back
//  into another string
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    os << message;
    std::istreambuf_iterator<char> sbit(&sbuf);
    std::istreambuf_iterator<char> end;
    std::string sbuf_it_wholestr(sbit, end);
    cout << "sbuf_it_wholestr=" << sbuf_it_wholestr << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
sbuf_it_wholestr=abcdefghijk lmnopqrs tuvwxyz

If I want to get just part of a streambuf into a string, that seems really hard, because istreambuf_iterator isn't a random access iterator and doesn't support arithmetic:

// Get a whole string into a streambuf, and then get part of the streambuf back
//  into another string. We can't do this because istreambuf_iterator isn't a
//  random access iterator!
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    os << message;
    std::istreambuf_iterator<char> sbit(&sbuf);
    // This doesn't work
    //std::istreambuf_iterator<char> end = sbit + 7; // Not random access!
    //std::string sbuf_it_partstr(sbit, end);
    //cout << "sbuf_it_partstr=" << sbuf_it_partstr << endl;    
}    

And there doesn't seem to be any way of directly using string::iterators to dump part of a string into a streambuf:

// istreambuf_iterator doesn't work in std::copy either
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    std::istreambuf_iterator<char> sbit(&sbuf);
    //std::copy(message.begin(), message.begin()+7, sbit); // Doesn't work here
}    

I can always pull partial strings out of a streambuf if I don't mind formatted io, but I do - formatted io is almost never what I want:

// Get a whole string into a streambuf, and then pull words back out through
// the iostream using formatted input
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    string part1, part2;
    os << message;
    os >> part1;
    os >> part2;
    cout << "part1=" << part1 << endl;    
    cout << "part2=" << part2 << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
part1=abcdefghijk
part2=lmnopqrs

If I'm ok with an ugly copy, I can generate a substring, of course - std::string::iterator is random access...

// Get a partial string into a streambuf, and then pull it out using an
//  istreambuf_iterator
{
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz");
    cout << "message=" << message << endl;
    string part_message(message.begin(), message.begin()+7);
    os << part_message;
    cout << "part_message=" << part_message << endl;
    std::istreambuf_iterator<char> sbit(&sbuf);
    std::istreambuf_iterator<char> end;
    std::string sbuf_it_wholestr(sbit, end);
    cout << "sbuf_it_wholestr=" << sbuf_it_wholestr << endl;    
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
part_message=abcdefg
sbuf_it_wholestr=abcdefg

The stdlib also has the curiously stand-alone std::getline(), which lets you pull individual lines out of an istream:

// If getting lines at a time was what I wanted, that can be accomplished too...          
{    
    boost::asio::streambuf sbuf;
    iostream os(&sbuf);
    string message("abcdefghijk lmnopqrs tuvwxyz\n1234 5678\n");
    cout << "message=" << message << endl;
    os << message;
    string line1, line2;
    std::getline(os, line1);
    std::getline(os, line2);
    cout << "line1=" << line1 << endl;
    cout << "line2=" << line2 << endl;
}

prints:

message=abcdefghijk lmnopqrs tuvwxyz
1234 5678

line1=abcdefghijk lmnopqrs tuvwxyz
line2=1234 5678

I feel like there's some Rosetta Stone that I've missed and that dealing with std::string and asio::streambuf would be so much easier if I discovered it. Should I just abandon the std::streambuf interface and make use of asio::mutable_buffer, which I can get out of asio::streambuf::prepare()?


Solution

    1. istream::operator>>() mangles the contents of my streambufs (as you would expect given that it's 'formatted').

      Open your input stream with the std::ios::binary flag and turn off whitespace skipping with is >> std::noskipws

    2. For example, if I want to get a substring of a string into a streambuf, how do I do that without creating a copy of the string? A basic all-in-all-out transfer can be accomplished like

      Try write(), which takes a pointer and a length rather than an iterator:

       outstream.write(s.data() + start, length);
      

      Or use boost::string_ref:

       outstream << boost::string_ref(s).substr(start, length);
      

    3. And there doesn't seem to be any way of directly using string::iterators to dump part of a string into a streambuf:

       std::copy(it1, it2, std::ostreambuf_iterator<char>(os));
      
    4. Re. parsing the message lines:

      You can split into iterator ranges with iter_split.

      You can parse an embedded grammar on the fly with boost::spirit::istream_iterator