In Chapter 4 of the book "Advanced Programming in the UNIX Environment," which covers files and directories, there is a code sample that mimics the ftw() function and traverses a file hierarchy. It keeps a pointer into an absolute file path and uses a recursive function with a callback to traverse the directory, calling opendir() and readdir() in the process.
One exercise asks readers to use chdir() and relative file names instead of absolute paths to accomplish the same task, and to compare the running times of the two programs. I wrote a version using chdir() and did not notice any difference in time. Is this expected? I would have thought that the additional calls to chdir() would add some overhead. Is it perhaps a relatively cheap call? Any insight would be appreciated.
Here's the recursive function using absolute paths:

    static int                  /* we return whatever func() returns */
    dopath(Myfunc* func)
    {
        struct stat statbuf;
        struct dirent *dirp;
        DIR *dp;
        int ret;
        char *ptr;

        if (lstat(fullpath, &statbuf) < 0)      /* stat error */
            return(func(fullpath, &statbuf, FTW_NS));
        if (S_ISDIR(statbuf.st_mode) == 0)      /* not a directory */
            return(func(fullpath, &statbuf, FTW_F));

        /*
         * It's a directory. First call func() for the directory,
         * then process each filename in the directory.
         */
        if ((ret = func(fullpath, &statbuf, FTW_D)) != 0)
            return(ret);

        ptr = fullpath + strlen(fullpath);      /* point to end of fullpath */
        *ptr++ = '/';
        *ptr = 0;

        if ((dp = opendir(fullpath)) == NULL)   /* can't read directory */
            return(func(fullpath, &statbuf, FTW_DNR));

        while ((dirp = readdir(dp)) != NULL) {
            if (strcmp(dirp->d_name, ".") == 0 ||
                strcmp(dirp->d_name, "..") == 0)
                continue;                       /* ignore dot and dot-dot */
            strcpy(ptr, dirp->d_name);          /* append name after slash */
            if ((ret = dopath(func)) != 0)      /* recursive */
                break;                          /* time to leave */
        }
        ptr[-1] = 0;        /* erase everything from slash onwards */

        if (closedir(dp) < 0)
            err_ret("can't close directory %s", fullpath);
        return(ret);
    }

And here's the function with my changes:

    static int                  /* we return whatever func() returns */
    dopath(Myfunc* func, char* path)
    {
        struct stat statbuf;
        struct dirent *dirp;
        DIR *dp;
        int ret;

        if (lstat(path, &statbuf) < 0)          /* stat error */
            return(func(path, &statbuf, FTW_NS));
        if (S_ISDIR(statbuf.st_mode) == 0)      /* not a directory */
            return(func(path, &statbuf, FTW_F));

        /*
         * It's a directory. First call func() for the directory,
         * then process each filename in the directory.
         */
        if ((ret = func(path, &statbuf, FTW_D)) != 0)
            return(ret);

        if (chdir(path) < 0)                    /* can't enter directory */
            return(func(path, &statbuf, FTW_DNR));

        if ((dp = opendir(".")) == NULL) {      /* can't read directory */
            if (chdir("..") < 0)                /* undo the chdir() above first */
                err_ret("can't go up directory");
            return(func(path, &statbuf, FTW_DNR));
        }

        while ((dirp = readdir(dp)) != NULL) {
            if (strcmp(dirp->d_name, ".") == 0 ||
                strcmp(dirp->d_name, "..") == 0)
                continue;                       /* ignore dot and dot-dot */
            if ((ret = dopath(func, dirp->d_name)) != 0)    /* recursive */
                break;                          /* time to leave */
        }

        if (chdir("..") < 0)
            err_ret("can't go up directory");
        if (closedir(dp) < 0)
            err_ret("can't close directory %s", path);  /* path, not the old global fullpath */
        return(ret);
    }

I don't think you should expect a substantial difference in running time between the absolute path version and the chdir() version. Rather, the pros and cons of the two versions are as follows:

- The full pathname version can fail on very deep directory structures, because the length of the full pathname eventually exceeds PATH_MAX. The chdir() version does not have this problem.
- The chdir() version manipulates the pwd, which is generally considered bad practice if you can avoid it: the working directory is shared by the whole process, so changing it is not thread-safe, and the end user might expect it to be left alone. For example, filenames given on the command line and used by a different part of the program might be relative to what the user thought the pwd was, which breaks when you change it.
- The chdir() version might go out of control when backing up to a higher directory (chdir("..")) if special care is not taken and the directory structure changes while it is being traversed. Then again, the full pathname version might break in a different way under these circumstances.

The openat() family of functions, available on modern POSIX systems, offers the best of both worlds. If these functions are available, openat() together with fdopendir(), fstatat(), etc. makes for a really nice implementation of directory walking.
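
To make that last suggestion concrete, here is a minimal sketch of a walker built on fstatat(), openat() and fdopendir(). It is not the book's code: the names walk_at, Visit and count_entry are made up for illustration, the callback signature differs from the book's Myfunc, and error handling is simplified so that any failing call aborts the walk with -1 instead of reporting FTW_NS or FTW_DNR. Every entry is resolved relative to an open directory descriptor, so no long pathname is ever built and the working directory is never changed.

```c
#define _XOPEN_SOURCE 700   /* ask for openat(), fdopendir(), fstatat() */
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical callback type: gets the parent directory fd and the entry name. */
typedef int Visit(int dirfd, const char *name, const struct stat *sb, void *arg);

/*
 * Visit `name`, resolved relative to the open directory `dirfd`
 * (pass AT_FDCWD to resolve relative to the current directory),
 * then recurse into it if it is a directory.
 * Returns 0 on success, -1 on a system call failure, or the first
 * nonzero value returned by the callback.
 */
static int
walk_at(int dirfd, const char *name, Visit *visit, void *arg)
{
    struct stat sb;
    struct dirent *dirp;
    DIR *dp;
    int fd, ret;

    assert(visit != NULL);

    /* Like lstat(), but relative to dirfd. */
    if (fstatat(dirfd, name, &sb, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    if ((ret = visit(dirfd, name, &sb, arg)) != 0 || !S_ISDIR(sb.st_mode))
        return ret;

    /* O_NOFOLLOW: fail if name was replaced by a symlink after fstatat(). */
    fd = openat(dirfd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (fd < 0)
        return -1;
    if ((dp = fdopendir(fd)) == NULL) {
        close(fd);
        return -1;
    }

    while ((dirp = readdir(dp)) != NULL) {
        if (strcmp(dirp->d_name, ".") == 0 ||
            strcmp(dirp->d_name, "..") == 0)
            continue;                   /* ignore dot and dot-dot */
        if ((ret = walk_at(fd, dirp->d_name, visit, arg)) != 0)
            break;                      /* stop on first error */
    }

    closedir(dp);                       /* also closes fd */
    return ret;
}

/* Example callback (hypothetical): count every entry that is visited. */
static int
count_entry(int dirfd, const char *name, const struct stat *sb, void *arg)
{
    (void)dirfd; (void)name; (void)sb;
    ++*(int *)arg;
    return 0;
}
```

A call such as walk_at(AT_FDCWD, ".", count_entry, &n) with an int n set to 0 would count everything below the current directory. There is no "going back up" step at all: returning from the recursion simply continues reading the parent's still-open directory stream. The trade-off is that one file descriptor stays open per directory level, so an extremely deep tree can run into the process's open file limit.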