filesystemsc++17gcc9

Filename c_str() corruption when using directory_iterator


When going over all files in a directory with directory_iterator storing the c_str() name of a file before using it leads invalid reads (and garbage output).

This seems quite odd to me.

Code examples:

Working:

#include <iostream>
#include <filesystem>

namespace fs = std::filesystem;

int main() {
 for (auto const &entry : fs::directory_iterator("./")) {
   std::cout << entry.path().filename().c_str() << '\n';
 }
}

valgrind reports no errors.

Corrupt output:

#include <iostream>
#include <filesystem>

namespace fs = std::filesystem;

int main() {
 for (auto const &entry : fs::directory_iterator("./")) {
   auto filename = entry.path().filename().c_str();
   std::cout << filename << '\n';
 }
}

valgrind reports 159 invalid reads (of size 1) -- the exact number depends on how many files are in the directory.

Both these snippets have been compiled with gcc 9.1 using the command: g++-9.1 test.cpp -std=c++17


Solution

  • The lifetime of a temporary object is scoped to the statement it was created in. Informally speaking, a statement is a line of code that ends at a semicolon. All temporaries stay live until the end of the entire statement.

    From the C++ spec:

    When an implementation introduces a temporary object of a class that has a non-trivial constructor ([class.default.ctor], [class.copy.ctor]), it shall ensure that a constructor is called for the temporary object. Similarly, the destructor shall be called for a temporary with a non-trivial destructor ([class.dtor]). Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. The value computations and side effects of destroying a temporary object are associated only with the full-expression, not with any specific subexpression.

    Dissecting the working example, we see that operator<< executes before destruction of temporaries.

    Dissecting the broken example, we see a dangling pointer:

    You can pull out a local variable without corruption by removing the .c_str(), which makes variable filename an object of type std::filesystem::path. std::filesystem::path owns its memory (similar to std::string).

    for (auto const &entry : fs::directory_iterator("./")) {
        auto filename = entry.path().filename();
        std::cout << filename << '\n';
    }
    

    path also supports ostream output directly, without the need for .c_str().