
popen() implementation, fd leaks, and vfork()


The POSIX specification of popen(), which glibc implements, states:

The popen() function shall ensure that any streams from previous popen() calls that remain open in the parent process are closed in the new child process.

Why? If the purpose is to avoid fd leaks, why not just close all open fds?

The glibc implementation of popen() uses fork(). Although there are dup2() and close() calls between fork() and exec(), is it possible to replace fork() with vfork() to improve performance?

Is the Linux implementation of popen() based on fork() rather than vfork()? Why (or why not)?

I'm going to write a bidirectional version of popen() that returns two FILE*: one for reading and one for writing. How do I implement it correctly? It should be thread safe and must not leak fds. Ideally it should also be fast.


Solution

  • vfork(2) is obsolete (it was removed in POSIX.1-2008), and fork(2) is quite efficient because it uses copy-on-write techniques.

    popen(3) cannot close all open file descriptors, because it does not know which ones exist or which are relevant to the command. Imagine a program that obtains a socket and passes its file descriptor as an argument to the popen-ed command (or simply calls popen("time cat /etc/issue >&9; date","r")). See also fcntl(2) with FD_CLOEXEC, open(2) with O_CLOEXEC, and execve(2).
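    As a rough illustration of the close-on-exec flags mentioned above, here is a minimal sketch. The helper names set_cloexec and open_private are hypothetical (they are not libc functions); only open(2), fcntl(2), O_CLOEXEC and FD_CLOEXEC come from the system API.

```c
#define _POSIX_C_SOURCE 200809L   /* makes O_CLOEXEC visible under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark an already-open descriptor close-on-exec.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
int set_cloexec(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);          /* read the descriptor flags */
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Hypothetical helper: open a file whose descriptor is closed
 * automatically by any later execve(2), with no window in which
 * another thread's fork()+exec() could inherit it. */
int open_private(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}
```

    Setting the flag atomically at open time (O_CLOEXEC) is generally preferable in threaded programs; the separate fcntl() call is mainly for descriptors obtained from code you do not control.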

    File descriptors are a scarce, process-wide resource, and it is your responsibility to manage them correctly. You should know which fds must be closed in your child process before execve. If you know which program is execve-d and which fds it needs, you can close all other fds (or most of them, perhaps with for (int i=STDERR_FILENO+1; i<64; i++) (void) close(i);) before execve.
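    The close loop above could be wrapped in a small helper, sketched here. The name close_fds_between and the returned count are assumptions made for illustration; the upper bound (64 in the loop above) is an arbitrary guess that must fit your program.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor in [lowfd, maxfd).
 * Failures (typically EBADF for slots that were not open) are ignored.
 * Returns how many descriptors were actually closed. */
int close_fds_between(int lowfd, int maxfd)
{
    assert(lowfd >= 0 && lowfd <= maxfd);
    int closed = 0;
    for (int fd = lowfd; fd < maxfd; fd++)
        if (close(fd) == 0)
            closed++;
    return closed;
}
```

    In the child, between fork() and execve(), one would call something like close_fds_between(STDERR_FILENO + 1, 64). Only close(2) is used, which is async-signal-safe, so the helper is usable after fork() in a multi-threaded parent.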

    If you are coding a reusable library, document its policy regarding file descriptors (and any other process-wide resources), and probably set FD_CLOEXEC on any file descriptor the library obtains for its own internal use (as opposed to one passed in explicitly as an argument or data).

    It looks like you are reinventing p2open (so you probably need to understand the implementation details of FILE in your C standard library, or else use fdopen(3) with care and caution); you might find an existing implementation of it. Beware: a process using this probably needs some event loop (e.g. built on poll(2)) to avoid a potential deadlock (for instance with both parent and child processes blocked on reading).
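    For what it is worth, a bidirectional popen built on fdopen(3) might look roughly like the sketch below. It is not how glibc implements popen(); the names popen2 and pclose2 are hypothetical, and it assumes Linux (pipe2(2) with O_CLOEXEC) and that descriptors 0 and 1 are already open in the parent. Creating the pipes with O_CLOEXEC means a concurrent fork()+exec() in another thread does not inherit them.

```c
#define _GNU_SOURCE               /* pipe2() is Linux-specific (assumption) */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmd via /bin/sh with its stdin and stdout
 * connected to two streams. Returns the child's pid, or -1 on error. */
pid_t popen2(const char *cmd, FILE **to_child, FILE **from_child)
{
    int in[2], out[2];            /* in: parent -> child, out: child -> parent */
    assert(cmd && to_child && from_child);

    if (pipe2(in, O_CLOEXEC) == -1)
        return -1;
    if (pipe2(out, O_CLOEXEC) == -1) {
        close(in[0]); close(in[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on.
         * dup2() creates fds 0 and 1 without FD_CLOEXEC; the four
         * original pipe ends are closed automatically by exec. */
        if (dup2(in[0], STDIN_FILENO) == -1 ||
            dup2(out[1], STDOUT_FILENO) == -1)
            _exit(127);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);               /* exec failed */
    }

    /* Parent: drop the child's ends, wrap ours in stdio streams. */
    close(in[0]);
    close(out[1]);
    *to_child = fdopen(in[1], "w");
    *from_child = fdopen(out[0], "r");
    if (*to_child == NULL || *from_child == NULL) {
        if (*to_child) fclose(*to_child); else close(in[1]);
        if (*from_child) fclose(*from_child); else close(out[0]);
        waitpid(pid, NULL, 0);    /* child sees EOF on stdin; reap it */
        return -1;
    }
    return pid;
}

/* Hypothetical counterpart of pclose(): close whichever streams are
 * still open (pass NULL for one already closed) and reap the child.
 * Returns the wait status, or -1 on error. */
int pclose2(pid_t pid, FILE *to_child, FILE *from_child)
{
    int status;
    if (to_child) fclose(to_child);
    if (from_child) fclose(from_child);
    while (waitpid(pid, &status, 0) == -1)
        if (errno != EINTR)
            return -1;
    return status;
}
```

    A caller would typically write a request, fclose() the write stream so the child sees end-of-file, read the reply, then call pclose2(pid, NULL, from_child). This simple write-then-read pattern is only safe for small amounts of data; for anything larger than the pipe buffer, the poll(2)-based loop mentioned above is needed to avoid the deadlock.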

    Did you consider using some existing event loop infrastructure (e.g. libevent, libev, GLib from GTK, etc.)?

    BTW, Linux has several free software implementations of the C standard library. GNU libc is quite common, but there are also musl-libc and several others. Study the source code of your libc.