When going over all files in a directory with directory_iterator
storing the c_str()
name of a file before using it leads invalid reads (and garbage output).
This seems quite odd to me.
Code examples:
Working:
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main() {
for (auto const &entry : fs::directory_iterator("./")) {
std::cout << entry.path().filename().c_str() << '\n';
}
}
valgrind reports no errors.
Corrupt output:
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main() {
for (auto const &entry : fs::directory_iterator("./")) {
auto filename = entry.path().filename().c_str();
std::cout << filename << '\n';
}
}
valgrind reports 159 invalid reads (of size 1) -- the exact number depends on how many files are in the directory.
Both these snippets have been compiled with gcc 9.1 using the command:
g++-9.1 test.cpp -std=c++17
The lifetime of a temporary object is scoped to the statement it was created in. Informally speaking, a statement is a line of code that ends at a semicolon. All temporaries stay live until the end of the entire statement.
When an implementation introduces a temporary object of a class that has a non-trivial constructor ([class.default.ctor], [class.copy.ctor]), it shall ensure that a constructor is called for the temporary object. Similarly, the destructor shall be called for a temporary with a non-trivial destructor ([class.dtor]). Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. The value computations and side effects of destroying a temporary object are associated only with the full-expression, not with any specific subexpression.
Dissecting the working example, we see that operator<<
executes before destruction of temporaries.
entry.path()
= temporary #1.filename()
= temporary #2.c_str()
gets the character pointer out of temporary #2.c_str()
passed into operator<<
of std::cout
while all of the above are still liveoperator<<
taking the .c_str()
pointer is executed and returns.operator<<
taking '\n'
is executed and returns.Dissecting the broken example, we see a dangling pointer:
entry.path()
= temporary #1.filename()
= temporary #2.c_str()
gets the character pointer out of temporary #2 and stored in variable filename
filename
points at deleted memory -- it is a dangling pointer.operator<<
is passed a dangling pointer, which it dereferences as if it were a valid string = undefined behavior.You can pull out a local variable without corruption by removing the .c_str()
, which makes variable filename
an object of type std::filesystem::path
. std::filesystem::path
owns its memory (similar to std::string
).
for (auto const &entry : fs::directory_iterator("./")) {
auto filename = entry.path().filename();
std::cout << filename << '\n';
}
path
also supports ostream
output directly, without the need for .c_str()
.