c++ iterator std-filesystem

Is there a shared state in std::filesystem::directory_iterator copies?


According to cppreference, std::next returns an iterator to the next element without modifying its argument (see the example on this page). However, when used on a std::filesystem::directory_iterator, it appears to modify the argument, so that after the call it points to the next directory entry, as shown by this simple program:

#include <filesystem>
#include <iostream>
#include <iterator>

int main(){
    std::filesystem::directory_iterator i("./");
    std::cout << i->path() <<std::endl;
    std::next(i);
    std::cout << i->path() <<std::endl;
}

which, when run in my home directory, prints:

"./.X11-unix"
"./.ICE-unix"

The same result can be obtained by a slight modification where the iterator is explicitly copied and just the copy is advanced:

#include <filesystem>
#include <iostream>

int main(){
    std::filesystem::directory_iterator i("./");
    std::cout << i->path() <<std::endl;
    auto i2 = i;
    ++i2;
    std::cout << i->path() <<std::endl;
}

This produces the same result as above, which really puzzles me. It behaves as if there were a common state shared between all copies of a directory_iterator, but I didn't find anything about this on the cppreference page linked above (maybe I overlooked something?).

I did my tests using GCC 14.2.1 on Arch Linux.


Solution

  • Pretty much every OS provides a file-listing API of the form:

    handle = findFirst(path);
    do
    {
      // Get file properties from handle
    } while (findNext(handle));
    

    As there's only a single handle for the file listing, std::filesystem::directory_iterator pretty much has to be implemented with a shared state.

    See readdir for POSIX and FindFirstFile for Windows.
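
    As a rough illustration of the single-handle point, here is a minimal POSIX-only sketch using opendir/readdir directly. The helper read_names is a hypothetical name made up for this example, and this is not how any particular standard library implements directory_iterator. It only shows that copying the handle does not copy the read position:

    ```cpp
    #include <dirent.h>   // POSIX only: opendir, readdir, closedir
    #include <iostream>
    #include <string>
    #include <vector>

    // Hypothetical helper: read up to `count` entry names from an open directory stream.
    std::vector<std::string> read_names(DIR* handle, int count) {
        std::vector<std::string> names;
        while (count-- > 0) {
            const dirent* entry = readdir(handle);
            if (entry == nullptr) break;  // end of listing (or an error)
            names.emplace_back(entry->d_name);
        }
        return names;
    }

    int main() {
        DIR* handle = opendir(".");
        if (handle == nullptr) {
            std::cerr << "could not open the current directory\n";
            return 1;
        }
        DIR* alias = handle;  // copies the pointer, not the position in the listing

        for (const auto& name : read_names(handle, 1)) std::cout << "via handle: " << name << '\n';
        // Reading through the alias continues where the first read stopped.
        for (const auto& name : read_names(alias, 1)) std::cout << "via alias:  " << name << '\n';

        closedir(handle);
    }
    ```

    The second read yields the next entry rather than repeating the first one, which is the same effect the question observes with copies of a directory_iterator.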

    Note that this is implied in LegacyInputIterator:

    A LegacyInputIterator is a LegacyIterator that can read from the pointed-to element. LegacyInputIterators only guarantee validity for single pass algorithms: once a LegacyInputIterator i has been incremented, all copies of its previous value may be invalidated
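
    One way to check which guarantees an iterator type advertises is to inspect its iterator category at compile time. The following is a small sketch (the alias category_of is a name invented here), assuming a C++17 compiler:

    ```cpp
    #include <filesystem>
    #include <iterator>
    #include <type_traits>
    #include <vector>

    namespace fs = std::filesystem;

    // Category tag that std::iterator_traits reports for an iterator type.
    template <class It>
    using category_of = typename std::iterator_traits<It>::iterator_category;

    // directory_iterator is only an input iterator: single pass, copies may be invalidated.
    static_assert(std::is_same_v<category_of<fs::directory_iterator>, std::input_iterator_tag>,
                  "directory_iterator should report input_iterator_tag");

    // A vector iterator is at least a forward iterator, so copies stay independent.
    static_assert(std::is_base_of_v<std::forward_iterator_tag,
                                    category_of<std::vector<int>::iterator>>,
                  "vector iterators should be multi-pass");

    int main() {}
    ```

    std::next accepts input iterators, but for them it cannot promise that the original iterator keeps its old position.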

    If you want to iterate arbitrarily over the listing results, you'll need to store the results in a container and then iterate over that.
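
    A minimal sketch of that approach, assuming it is acceptable to hold every entry in memory at once (snapshot is a hypothetical helper name, not a library function):

    ```cpp
    #include <filesystem>
    #include <iostream>
    #include <iterator>
    #include <vector>

    namespace fs = std::filesystem;

    // Hypothetical helper: copy every entry of `dir` into a vector.
    std::vector<fs::directory_entry> snapshot(const fs::path& dir) {
        std::vector<fs::directory_entry> entries;
        for (const auto& entry : fs::directory_iterator(dir)) {
            entries.push_back(entry);
        }
        return entries;
    }

    int main() {
        const auto entries = snapshot(".");
        if (entries.size() < 2) {
            std::cout << "need at least two entries for the demonstration\n";
            return 0;
        }

        auto i = entries.begin();
        auto i2 = std::next(i);            // vector iterators are independent copies
        std::cout << i->path() << '\n';    // first entry
        std::cout << i2->path() << '\n';   // second entry
        std::cout << i->path() << '\n';    // still the first entry
    }
    ```

    The order of the entries is still whatever the OS returned, so sort the vector if a stable order matters.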