If you have a std::list
, and want to rebuild it into a new std::list
in a different order, without copying, from iterators into the original std::list
, it's fairly straightforward, for example, if you're shuffling the iterators and rebuilding from the shuffled order:
int main(int argc, char **argv)
{
std::list<std::string> args(&argv[1], &argv[argc]);
std::vector<std::list<std::string>::const_iterator> vec;
for (auto it = std::cbegin(args), last = std::cend(args); it != last; ++it) {
vec.push_back(it);
}
std::shuffle(std::begin(vec), std::end(vec), std::mt19937(42)); // Shuffle with reproducible PRNG
std::cout << "Shuffled vec:\n";
for (auto it : vec) {
std::cout << *it << std::endl;
}
std::list<std::string> shuffled_args;
for (auto it : vec) {
shuffled_args.splice(std::end(shuffled_args), args, it);
}
std::cout << "Shuffled list:\n";
for (const auto& s : shuffled_args) {
std::cout << s << std::endl;
}
return 0;
}
This works great; on my system, when compiled with g++ -std=c++17 -O3 -flto -Wall shuffle_list.cpp
and run with ./a.out a b c d e
, the results from printing both the vector
and the shuffled list
agree (e a c d b
in that order): Try it online!
But when I try to write a variant version using std::forward_list
, things get dicier. This is the only version that doesn't segfault (with comments indicating changes):
int main(int argc, char **argv)
{
std::forward_list<std::string> args(&argv[1], &argv[argc]);
std::vector<std::forward_list<std::string>::const_iterator> vec;
for (auto it = args.cbefore_begin(), last = std::cend(args); std::next(it) != last; ++it) { // Changed to store iterators before each element since splice_after needs them
vec.push_back(it);
}
std::shuffle(std::begin(vec), std::end(vec), std::mt19937(42));
std::cout << "Shuffled vec:\n";
for (auto it : vec) {
std::cout << *std::next(it) << std::endl; // Must advance each iterator to see value stored
}
std::forward_list<std::string> shuffled_args;
auto splice_loc = shuffled_args.cbefore_begin(); // Must begin inserting at beginning of new forward_list
for (auto it : vec) {
shuffled_args.splice_after(splice_loc, args, it); // splice_loc is before when node should go, it points to node before node to splice, great
splice_loc = it; // it should now be last element, it will be next splice location
}
std::cout << "Shuffled list:\n";
for (const auto& s : shuffled_args) {
std::cout << s << std::endl;
}
return 0;
}
The output here is the same from the vector
, but for the resulting forward_list
it just outputs e
: Try it online! If you try other fun things like replacing splice_loc = it;
with ++splice_loc;
(which is logically equivalent; you spliced in a node after the current location, and advancing the iterator should move you to that new node), it segfaults.
I think I know why this is broken (it's two different ways for the full code and the replacement with ++splice_loc
), and I don't think the approach I'm taking is salvageable. In the segfaulting code, as I splice nodes over, the iterators remain valid, but some of the iterators in the original structure are moved before I get to them (e.g. I use position 1's iterator to move the item at 2, and when I try to move 3 via 2, it moves some other random thing that follows 2 in the new list), and now I'm trying to splice in what follows them in the new forward_list
(and violating the API, as I claim they came from args
, when they've in fact been moved to shuffled_args
at that point). There's no good fix for this in the design I've chosen AFAICT. Update: In the non-segfaulting code, I should be saving off std::next(it)
before the splice and assigning that to splice_loc
(by assigning it
, which is probably still in the original forward_list
, the splice_loc
now points to the original list and I'm probably engaging in undefined behavior that ultimately modifies the original list more than the new one).
My question is:
Is there a good way to do what I want, taking iterators from a forward_list
, jumbling them up (with shuffle
or sorting or whatever), then building a new forward_list
in an alternate order, by direct node transfer (no copies or moves of any of the contained elements), and doing so efficiently? The best I can come up with is making a new dedicated forward_list
for each node in the vector
, so I can splice that single node over to the final forward_list
. This feels ugly, and possibly more expensive than the list
equivalent behavior. Is there a better way? Is the many one-element forward_list
solution actually great?
For the curious: This is a stripped down reproducer of a problem I'm having writing a keyed sorting algorithm (as in Schwartzian transform aka decorate-sort-undecorate), entirely as a personal challenge. I'm trying to make a specialized version for std::list
and std::forward_list
that avoids any copies or moves of the values being sorted by decorating the iterators rather than the values themselves, so once I've sorted the collection by the computed keys, I can then construct a new list
or forward_list
by using splice
/splice_after
to rebuild the container in the sorted order, without copying or moving a single one of the contained values.
This is a working version based on my original suggestion to use a bunch of single node std::forward_list
s. It feels inefficient to make all this std::forward_list
s, but from what I can tell, std::forward_list
is spec-ed so narrowly that it's almost guaranteed to be implemented as a wrapper around a single pointer, which should be pretty low overhead. And it does work, with no copies or moves of the contained elements, nor (thanks to the use of deque
) any copies or moves of the forward_list
s themselves (aside from when we empty them to splice them onto the output forward_list
), and it only traverses the input forward_list
once (destroying it by extracting the first node over and over until it's emptied).
It's not the prettiest, but it's not nearly as ugly or inefficient as I was expecting.
int main(int argc, char **argv)
{
std::forward_list<std::string> args(&argv[1], &argv[argc]);
std::deque<std::forward_list<std::string>> deq; // Use deque so we never have to move existing forward_lists, and don't need to pre-walk to find size to reserve for vector
while (!args.empty()) {
auto& flist = deq.emplace_back();
flist.splice_after(flist.cbefore_begin(), args, args.cbefore_begin()); // Extract a single node from input to populate new forward_list
}
std::shuffle(std::begin(deq), std::end(deq), std::mt19937(42)); // Shuffle with reproducible PRNG
std::cout << "Shuffled deq:\n";
for (auto& flist : deq) {
std::cout << flist.front() << std::endl;
}
std::forward_list<std::string> shuffled_args;
auto splice_loc = shuffled_args.cbefore_begin();
for (auto&& flist : deq) {
shuffled_args.splice_after(splice_loc, std::move(flist)); // splice single element forward_list contents onto end
++splice_loc; // We added one element, move the iterator forward by one
}
std::cout << "Shuffled list:\n";
for (const auto& s : shuffled_args) {
std::cout << s << std::endl;
}
return 0;
}
To be clear, if anyone has any better solutions, I'm happy to hear about them and would love to give the checkmark to something cleaner.