c++ libstdc++ string-view istream-iterator

Why doesn't std::istream_iterator< std::string_view > compile?


Why can't GCC and Clang compile the code snippet below? I want to return a vector of std::string_views, but there seems to be no way to extract string_views from the stringstream.

#include <iostream>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>
#include <iterator>
#include <algorithm>
#include <ranges>


[[ nodiscard ]] std::vector< std::string_view >
tokenize( const std::string_view inputStr, const size_t expectedTokenCount )
{
    std::vector< std::string_view > foundTokens { };

    if ( inputStr.empty( ) ) [[ unlikely ]]
    {
        return foundTokens;
    }

    std::stringstream ss;
    ss << inputStr;

    foundTokens.reserve( expectedTokenCount );

    std::copy( std::istream_iterator< std::string_view >{ ss }, // does not compile
               std::istream_iterator< std::string_view >{ },
               std::back_inserter( foundTokens ) );

    return foundTokens;
}

int main( )
{
    using std::string_view_literals::operator""sv;
    constexpr auto text { "Today is a nice day."sv };

    const auto tokens { tokenize( text, 4 ) };

    std::cout << tokens.size( ) << '\n';
    std::ranges::copy( tokens, std::ostream_iterator< std::string_view >{ std::cout, "\n" } );
}

Note that replacing select instances of string_view with string lets the code compile.


Solution

  • Because there is no operator>> overload that extracts from an input stream into a std::string_view, and std::istream_iterator requires that operator for its value type.
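
    The missing overload can be checked at compile time. The sketch below uses the detection idiom with a hypothetical trait name, is_stream_extractable, which is not part of the standard library; it is written so that it should also compile as C++17.

```cpp
#include <istream>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical trait (illustration only, not a standard facility):
// true when `is >> value` is well-formed for an lvalue of type T.
template < typename T, typename = void >
struct is_stream_extractable : std::false_type { };

template < typename T >
struct is_stream_extractable<
    T,
    std::void_t< decltype( std::declval< std::istream& >( ) >> std::declval< T& >( ) ) > >
    : std::true_type { };

// std::string and int have an operator>> overload; std::string_view does not.
static_assert( is_stream_extractable< std::string >::value );
static_assert( is_stream_extractable< int >::value );
static_assert( !is_stream_extractable< std::string_view >::value );
```

    Since std::istream_iterator< T > reads each element with operator>>, a type that fails this check cannot be used as its value type.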

    As @tkausl points out in the comments, it's not possible for >> to work on std::string_view because it's not clear who would own the memory pointed to by the std::string_view.

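    In your program, ss << inputStr copies the characters from inputStr into ss, and that copy is freed when ss goes out of scope at the end of tokenize. Any string_view pointing into the stream's buffer would therefore dangle as soon as the function returns.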
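
    If copying the tokens is acceptable, one way around the ownership problem is to keep the stream but extract owning std::string objects. This is only a sketch; the name tokenize_owning is made up for illustration.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical owning variant of tokenize(): every token is a std::string
// that holds its own copy of the characters, so nothing can dangle.
[[ nodiscard ]] std::vector< std::string >
tokenize_owning( const std::string_view inputStr )
{
    // The stream works on its own copy of the input.
    std::istringstream ss { std::string( inputStr ) };

    // operator>> exists for std::string and skips leading whitespace,
    // so each extraction yields one whitespace-separated token.
    std::vector< std::string > foundTokens( std::istream_iterator< std::string >{ ss },
                                            std::istream_iterator< std::string >{ } );
    return foundTokens;
}
```

    The cost is one allocation-prone copy per token, which is what the string_view version was presumably trying to avoid.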


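    Here is a possible implementation that uses C++20's std::views::split instead of std::stringstream. The returned views point into the buffer behind inputStr, which the caller owns, so nothing is copied. It only supports a single space as the delimiter, so consecutive spaces produce empty tokens.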

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <ranges>
    #include <string_view>
    #include <vector>
    
    
    [[ nodiscard ]] std::vector< std::string_view >
    tokenize( const std::string_view inputStr, const size_t expectedTokenCount )
    {
        constexpr std::string_view delim { " " };
    
        std::vector< std::string_view > foundTokens { };
    
        if ( inputStr.empty( ) ) [[ unlikely ]]
        {
            return foundTokens;
        }
    
        foundTokens.reserve( expectedTokenCount );
        // Each token is a subrange of inputStr; build a view over the same characters.
        for ( const auto token : std::views::split( inputStr, delim ) )
        {
            foundTokens.emplace_back( token.begin( ), token.end( ) );
        }
    
        return foundTokens;
    }
    
    int main( )
    {
        using std::string_view_literals::operator""sv;
        constexpr auto text { "Today is a nice day."sv };
    
        const auto tokens { tokenize( text, 4 ) };
    
        std::cout << tokens.size( ) << '\n';
        std::ranges::copy( tokens, std::ostream_iterator< std::string_view >{ std::cout, "\n" } );
    }
    

    This works with GCC 12.1 (compile with -std=c++20), but it doesn't work with Clang 14.0.0 because Clang hasn't implemented P2210 ("Superior String Splitting") yet.
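
    For compilers without P2210, or when runs of whitespace should not produce empty tokens, the same idea can be sketched with std::string_view's own search functions. The name tokenize_ws and the delimiter set are assumptions made for this example; only C++17 features are used.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical non-ranges variant: splits on any run of the assumed
// delimiter characters and returns views into the caller's buffer.
[[ nodiscard ]] std::vector< std::string_view >
tokenize_ws( const std::string_view inputStr, const std::size_t expectedTokenCount = 0 )
{
    constexpr std::string_view delims { " \t\n" }; // assumed delimiter set

    std::vector< std::string_view > foundTokens;
    foundTokens.reserve( expectedTokenCount );

    // pos is the start of the next token, or npos when none is left.
    std::size_t pos { inputStr.find_first_not_of( delims ) };
    while ( pos != std::string_view::npos )
    {
        const std::size_t end { inputStr.find_first_of( delims, pos ) };
        // substr clamps the count, so end == npos yields the rest of the input.
        foundTokens.push_back( inputStr.substr( pos, end - pos ) );
        pos = inputStr.find_first_not_of( delims, end );
    }

    return foundTokens;
}
```

    As with the split-based version, the returned views are only valid while the buffer behind inputStr is alive, so the caller has to keep the original string around.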