The POSIX function S_ISDIR is occasionally lying to me.
It's telling me that a directory exists, when it clearly doesn't.
Here is a small program that illustrates the problem:
#include <sys/stat.h>
#include <dirent.h>
#include <string>
#include <iostream>
#include <iomanip>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
std::cout << lstat(path_to_file, &fileInfo) << " ";
return S_ISDIR(fileInfo.st_mode);
}
int main(){
std::cout << std::boolalpha;
std::cout << Is_Directory("folder") << '\n';
std::cout << Is_Directory("folder") << '\n';
std::cout << Is_Directory("folder") << '\n';
std::cout << Is_Directory("folder") << '\n';
std::cout << Is_Directory("folder") << '\n';
std::cout << Is_Directory("folder") << '\n';
}
If I run this program (a lot of times), very quickly, I will see the following output:
$./main
-1 false
-1 false
-1 false
-1 false
-1 false
-1 false
$./main
-1 false
-1 false
-1 true
-1 true
-1 true
-1 true
$./main
-1 false
-1 false
-1 false
-1 false
-1 false
-1 false
See how the function suddenly returns true, even though the directory doesn't exist.
What's strange, though, is that if I put the program in an infinite loop of checking, it continues to say that the directory does not exist. Only by running the program again and again in rapid succession do I spot the issue.
Here is what I've tried so far:
check the code: The code doesn't seem wrong.
Macro: int S_ISDIR (mode_t m)
This macro returns non-zero if the file is a directory.
The error code of lstat is always -1, so I don't think there is an occasional error populating stat.
read documentation:
I saw the following documentation on lstat:
lstat() is identical to stat(), except that if pathname is a symbolic
link, then it returns information about the link itself, not the file
that it refers to.
I don't exactly understand the implications of this, but maybe it relates to my issue?
So I decided to use regular stat() instead, and I still see the same problem.
different compilers:
I've tried two different compilers with warnings and sanitizers: g++ and clang++. Both exhibit the same problem.
does it need to be compiled with a C compiler?
I re-wrote the program in vanilla C (but still compiled it with g++/clang++).
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
printf("%d ",lstat(path_to_file, &fileInfo));
return S_ISDIR(fileInfo.st_mode);
}
int main(){
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
}
All of a sudden, the problem is gone. I start the program again and again quickly, but it always correctly reports that the directory is not there.
I switch back to the C++ code, and run my test again. Sure enough, occasional false positives.
Is it a system header?
I put the C++ headers into the C version. Program still works without problems.
Is it std::cout?
Maybe std::cout is slower, and that's why I'm seeing the problem... or maybe it's completely unrelated. Maybe using std::cout indirectly keeps something in the binary that's causing the problem. Or is std::cout doing something globally to my program's environment?
I'm shooting in the dark here.
I tried the following:
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <iostream>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
printf("%d ",lstat(path_to_file, &fileInfo));
return S_ISDIR(fileInfo.st_mode);
}
int main(){
std::cout << "test" << std::endl;
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
printf("%d\n",Is_Directory("folder"));
}
Aha!
$./main
test
-1 0
-1 0
-1 0
-1 0
-1 0
-1 0
$./main
test
-1 0
-1 0
-1 0
-1 0
-1 0
-1 0
$./main
test
-1 1
-1 0
-1 0
-1 0
-1 0
-1 0
$./main
test
-1 0
-1 0
-1 0
-1 0
-1 0
-1 0
$./main
test
-1 1
-1 0
-1 0
-1 0
-1 0
-1 0
$./main
test
-1 0
-1 0
-1 0
-1 0
-1 0
-1 0
Now it's only the first check that sometimes returns true.
It's like std::cout is somehow messing up S_ISDIR, but after S_ISDIR is called, it does not mess up the next call to S_ISDIR.
investigate source:
I found the source code for S_ISDIR in /usr/include/sys:
/* Test macros for file types. */
#define __S_ISTYPE(mode, mask) (((mode) & __S_IFMT) == (mask))
#define S_ISDIR(mode) __S_ISTYPE((mode), __S_IFDIR)
S_ISDIR seems to be nothing but a helper, and whether or not the directory exists has already been decided by stat(). (Again, I've tried both stat and lstat. Am I supposed to be using fstat? I don't think so. I've found other examples online where people are using S_ISDIR in the same way as my example code.)
Again, it doesn't show the symptoms when I put the code into an infinite loop of both checking and printing with std::cout. That leads me to believe the problem only occurs at the start of the program, but I guess that doesn't seem true either, because if you look at my original output, it went:
$./main
-1 false
-1 false
-1 true
-1 true
-1 true
-1 true
operating system / hard drive / system libraries / compilers:
Is there something wrong with my machine?
No, I'm on Ubuntu 16.04.1 LTS. I went and tried this on a different machine, CentOS 6.5, with an older version of g++. Same results.
So my code is just bad.
isolate the problem:
I've simplified the issue.
This program will sometimes return an error.
#include <sys/stat.h>
#include <iostream>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
stat(path_to_file, &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
int main(){
std::cout << std::endl;
return Is_Directory("folder");
}
This program will never return an error.
#include <sys/stat.h>
#include <iostream>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
stat(path_to_file, &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
int main(){
return Is_Directory("folder");
}
Why would flushing a buffer result in a directory sometimes existing?
Actually, if I only flush the buffer, the problem goes away.
This program will never return an error.
#include <sys/stat.h>
#include <iostream>
bool Is_Directory(const char* path_to_file){
struct stat fileInfo;
stat(path_to_file, &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
int main(){
std::cout.flush();
return Is_Directory("folder");
}
Well, that's probably because it had nothing to flush.
As long as I flush at least one character, we have our problem again.
Here is the real MCVE:
#include <sys/stat.h>
#include <iostream>
int main(){
std::cout << std::endl;
struct stat fileInfo;
stat("f", &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
Again, an infinite loop does not work.
This program will never return (assuming it gets lucky on the first try):
#include <sys/stat.h>
#include <iostream>
int main(){
while (true){
std::cout << std::endl;
struct stat fileInfo;
stat("f", &fileInfo);
if(S_ISDIR(fileInfo.st_mode)) return 0;
}
}
So the problem arises when restarting processes as well as flushing?
I dumped the assembly, but it doesn't mean much to me.
g++ -std=c++1z -g -c a.cpp
objdump -d -M intel -S a.o > a.s
a.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
#include <sys/stat.h>
#include <iostream>
int main(){
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: 48 81 ec a0 00 00 00 sub rsp,0xa0
b: 64 48 8b 04 25 28 00 mov rax,QWORD PTR fs:0x28
12: 00 00
14: 48 89 45 f8 mov QWORD PTR [rbp-0x8],rax
18: 31 c0 xor eax,eax
std::cout << std::endl;
1a: be 00 00 00 00 mov esi,0x0
1f: bf 00 00 00 00 mov edi,0x0
24: e8 00 00 00 00 call 29 <main+0x29>
struct stat fileInfo;
stat("f", &fileInfo);
29: 48 8d 85 60 ff ff ff lea rax,[rbp-0xa0]
30: 48 89 c6 mov rsi,rax
33: bf 00 00 00 00 mov edi,0x0
38: e8 00 00 00 00 call 3d <main+0x3d>
return S_ISDIR(fileInfo.st_mode);
3d: 8b 85 78 ff ff ff mov eax,DWORD PTR [rbp-0x88]
43: 25 00 f0 00 00 and eax,0xf000
48: 3d 00 40 00 00 cmp eax,0x4000
4d: 0f 94 c0 sete al
50: 0f b6 c0 movzx eax,al
53: 48 8b 55 f8 mov rdx,QWORD PTR [rbp-0x8]
57: 64 48 33 14 25 28 00 xor rdx,QWORD PTR fs:0x28
5e: 00 00
60: 74 05 je 67 <main+0x67>
62: e8 00 00 00 00 call 67 <main+0x67>
67: c9 leave
68: c3 ret
0000000000000069 <_Z41__static_initialization_and_destruction_0ii>:
69: 55 push rbp
6a: 48 89 e5 mov rbp,rsp
6d: 48 83 ec 10 sub rsp,0x10
71: 89 7d fc mov DWORD PTR [rbp-0x4],edi
74: 89 75 f8 mov DWORD PTR [rbp-0x8],esi
77: 83 7d fc 01 cmp DWORD PTR [rbp-0x4],0x1
7b: 75 27 jne a4 <_Z41__static_initialization_and_destruction_0ii+0x3b>
7d: 81 7d f8 ff ff 00 00 cmp DWORD PTR [rbp-0x8],0xffff
84: 75 1e jne a4 <_Z41__static_initialization_and_destruction_0ii+0x3b>
extern wostream wclog; /// Linked to standard error (buffered)
#endif
//@}
// For construction of filebuffers for cout, cin, cerr, clog et. al.
static ios_base::Init __ioinit;
86: bf 00 00 00 00 mov edi,0x0
8b: e8 00 00 00 00 call 90 <_Z41__static_initialization_and_destruction_0ii+0x27>
90: ba 00 00 00 00 mov edx,0x0
95: be 00 00 00 00 mov esi,0x0
9a: bf 00 00 00 00 mov edi,0x0
9f: e8 00 00 00 00 call a4 <_Z41__static_initialization_and_destruction_0ii+0x3b>
a4: 90 nop
a5: c9 leave
a6: c3 ret
00000000000000a7 <_GLOBAL__sub_I_main>:
a7: 55 push rbp
a8: 48 89 e5 mov rbp,rsp
ab: be ff ff 00 00 mov esi,0xffff
b0: bf 01 00 00 00 mov edi,0x1
b5: e8 af ff ff ff call 69 <_Z41__static_initialization_and_destruction_0ii>
ba: 5d pop rbp
bb: c3 ret
I tried following the stat source code, but got rather lost.
The C++ source code was a little easier to follow. Here is the flush function from /bits/ostream.tcc:
template<typename _CharT, typename _Traits>
basic_ostream<_CharT, _Traits>&
basic_ostream<_CharT, _Traits>::
flush()
{
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// DR 60. What is a formatted input function?
// basic_ostream::flush() is *not* an unformatted output function.
ios_base::iostate __err = ios_base::goodbit;
__try
{
if (this->rdbuf() && this->rdbuf()->pubsync() == -1)
__err |= ios_base::badbit;
}
__catch(__cxxabiv1::__forced_unwind&)
{
this->_M_setstate(ios_base::badbit);
__throw_exception_again;
}
__catch(...)
{ this->_M_setstate(ios_base::badbit); }
if (__err)
this->setstate(__err);
return *this;
}
It seems to call pubsync(), which led me to a sync() method in /ext/stdio_sync_filebuf.h:
sync()
{ return std::fflush(_M_file); }
virtual std::streampos
seekoff(std::streamoff __off, std::ios_base::seekdir __dir,
std::ios_base::openmode = std::ios_base::in | std::ios_base::out)
{
std::streampos __ret(std::streamoff(-1));
int __whence;
if (__dir == std::ios_base::beg)
__whence = SEEK_SET;
else if (__dir == std::ios_base::cur)
__whence = SEEK_CUR;
else
__whence = SEEK_END;
#ifdef _GLIBCXX_USE_LFS
if (!fseeko64(_M_file, __off, __whence))
__ret = std::streampos(ftello64(_M_file));
#else
if (!fseek(_M_file, __off, __whence))
__ret = std::streampos(std::ftell(_M_file));
#endif
return __ret;
}
virtual std::streampos
seekpos(std::streampos __pos,
std::ios_base::openmode __mode =
std::ios_base::in | std::ios_base::out)
{ return seekoff(std::streamoff(__pos), std::ios_base::beg, __mode); }
};
As far as I can tell, C++ is farming the work out to std::fflush.
After doing some more tests, I've discovered that fflush() from <iostream> exhibits the problem, but fflush() from <stdio.h> does not.
I attempted to trace backward from fflush(), but I think I hit the source code boundary.
This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int fflush (FILE *__stream);
__END_NAMESPACE_STD
#ifdef __USE_MISC
/* Faster versions when locking is not required.
This function is not part of POSIX and therefore no official
cancellation point. But due to similarity with an POSIX interface
or due to the implementation it is a cancellation point and
therefore not marked with __THROW. */
extern int fflush_unlocked (FILE *__stream);
#endif
So it must be what I'm linking with?
//exhibits the problem
#include <sys/stat.h>
#include <iostream>
int main(){
printf("\n");fflush(stdout);
struct stat fileInfo;
stat("f", &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
g++ -std=c++11 -o main a.cpp
ldd main
linux-vdso.so.1 => (0x00007ffdc878e000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1300c00000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1300837000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f130052d000)
/lib64/ld-linux-x86-64.so.2 (0x000055bace4bc000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1300316000)
//works correctly
#include <sys/stat.h>
#include <stdio.h>
int main(){
printf("\n");fflush(stdout);
struct stat fileInfo;
stat("f", &fileInfo);
return S_ISDIR(fileInfo.st_mode);
}
g++ -std=c++11 -o main a.cpp
ldd main
linux-vdso.so.1 => (0x00007ffd57f7c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f482dc6c000)
/lib64/ld-linux-x86-64.so.2 (0x000055828633a000)
I assume libstdc++.so.6 is not suitable for using S_ISDIR, but libc.so.6 is? Should I build the code that uses S_ISDIR separately and then link it with the C++ code? How would I be able to detect a problem like this sooner? I still don't understand what's happening. Am I trampling/observing the wrong memory because I linked the wrong libraries? How would you go about resolving this?
You can only analyze the struct stat data set by lstat() if the system call succeeds. If it fails, it returns -1 (and it has probably not modified the data in fileInfo at all, though the values are indeterminate). What you get in fileInfo.st_mode is garbage because the lstat() fails, so S_ISDIR() can return true or false at whim.
Thus, your first example shows that lstat() fails every time, so any analysis of the struct stat is futile; it hasn't been set to any determinate value, and any result is OK.
The same argument applies to all the example code, I believe.
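As a minimal sketch of the check described above (not necessarily how the asker would want to structure it), Is_Directory can test the return value of lstat() before looking at st_mode. Treating every failure as "not a directory" and reporting errno on std::cerr are assumptions made for illustration; a real program might want to distinguish ENOENT from other errors.

```cpp
#include <sys/stat.h>
#include <cerrno>
#include <cstring>
#include <iostream>

// Returns true only if lstat() succeeds AND the path is a directory.
// When lstat() fails, fileInfo is never inspected, so no garbage is read.
bool Is_Directory(const char* path_to_file) {
    struct stat fileInfo;
    if (lstat(path_to_file, &fileInfo) != 0) {
        // errno says why the call failed (ENOENT for a missing path).
        std::cerr << "lstat(" << path_to_file << "): "
                  << std::strerror(errno) << '\n';
        return false;
    }
    return S_ISDIR(fileInfo.st_mode);
}
```

With this version, a call such as Is_Directory("folder") on a missing path should report the error and return false every time, regardless of what earlier calls left on the stack.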
The difference between stat() and lstat() is that if the name provided is a symlink, the stat() system call refers to the file system object at the far end of the symlink (assuming there is one; it fails if the symlink points to a non-existent object), whereas the lstat() system call refers to the symlink itself. When the name is not a symlink, the two calls return the same information.