I'm working on a platform running MontaVista Linux 3.1.
I have a C++ application which, for esoteric reasons I won't go into, has to remount the JFFS2 flash file system quite regularly between read-only and read-write.
When you perform an int mount(...) call, specified in sys/mount.h, to set the file system read-write, the jffs2_gcd_mtd0 garbage collector process gets kicked off, as you would expect. However, when you repeat the mount call to go back to read-only, jffs2_gcd_mtd0 gets killed and becomes a defunct process.
After a few minutes, we end up with a shed load of defunct jffs2_gcd_mtd0 processes which, no matter what we do, we can't get rid of.
I can replicate the problem with the following test app:
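#include <sys/mount.h>
#include <unistd.h>

int main()
{
    while(true)
    {
        // Remount read-write: this is when jffs2_gcd_mtd0 gets started.
        mount("/dev/mtdblock/0", "flash", "", MS_REMOUNT|MS_POSIXACL|MS_ACTIVE|MS_NOUSER|0XEC0000, "");
        sleep(1);
        // Remount read-only: jffs2_gcd_mtd0 is killed and left defunct.
        mount("/dev/mtdblock/0", "flash", "", MS_RDONLY|MS_REMOUNT|MS_POSIXACL|MS_ACTIVE|MS_NOUSER|0XEC0000, "");
        sleep(1);
    }
}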
I have tried various methods to reap the defunct processes: setting signal(SIGCHLD, SIG_IGN) (doesn't work); calling wait(int*) after the switch to read-only (fails, with errno set to 10, "No child processes"); calling kill(0, SIGCHLD) (doesn't work).
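For anyone trying to reproduce the wait() result, here is a minimal sketch of a non-blocking reap loop. The helper name reap_exited_children is made up for illustration, and waitpid() with WNOHANG stands in for the plain wait() call I used. ECHILD (10 on this platform) is what the wait family reports when it finds no children it is able to wait for, which matches what I see even while the defunct jffs2_gcd_mtd0 entries are visible.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <cerrno>

// Hypothetical helper: reaps every already-terminated child of the calling
// process without blocking. Returns the number of children reaped.
// *why_stopped is set to 0 if live children remain, otherwise to the errno
// from waitpid() (ECHILD when there is nothing left that can be waited for).
int reap_exited_children(int* why_stopped)
{
    int reaped = 0;
    for (;;)
    {
        int status = 0;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0)
        {
            ++reaped;   // one defunct child collected, look for more
            continue;
        }
        *why_stopped = (pid == 0) ? 0 : errno;
        return reaped;
    }
}
```

In the test app this would be called right after the read-only mount(); in my case it reaps nothing and reports ECHILD.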
Am I correct in assuming this is a bug in the mount implementation we have? Assuming it is a bug, how could I remove the defunct processes and stop the process ID table from filling up?
Some supplementary info: this problem doesn't seem to occur when I run the test app under strace. Now I'm getting really stumped!
As a workaround, I have found that calling mount() from within a pthread allows the defunct jffs2_gcd_mtd0 processes to be reaped.
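A minimal sketch of that workaround, assuming a short-lived thread is created and joined for each mount() call. The names RemountRequest, remount_worker and remount_in_thread are made up for illustration; only mount(), pthread_create() and pthread_join() are the real calls involved.

```cpp
#include <pthread.h>
#include <sys/mount.h>
#include <cerrno>
#include <cstddef>

// Arguments and outcome of one mount() call made on a worker thread.
struct RemountRequest
{
    const char*   source;
    const char*   target;
    unsigned long flags;
    int           result;       // return value of mount()
    int           saved_errno;  // errno as seen on the worker thread
};

// Thread entry point: performs the mount() call and records the outcome.
static void* remount_worker(void* arg)
{
    RemountRequest* req = static_cast<RemountRequest*>(arg);
    req->result = mount(req->source, req->target, "", req->flags, "");
    req->saved_errno = (req->result == 0) ? 0 : errno;
    return NULL;
}

// Runs mount() on its own pthread and joins it before returning.
// Returns 0 on success, or -1 with errno set.
int remount_in_thread(const char* source, const char* target, unsigned long flags)
{
    RemountRequest req = { source, target, flags, -1, 0 };
    pthread_t tid;

    int rc = pthread_create(&tid, NULL, remount_worker, &req);
    if (rc != 0) { errno = rc; return -1; }

    rc = pthread_join(tid, NULL);
    if (rc != 0) { errno = rc; return -1; }

    errno = req.saved_errno;
    return req.result;
}
```

In the test app above, each mount(...) line would become a call such as remount_in_thread("/dev/mtdblock/0", "flash", flags), with flags being the same flag combination as before.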
I believe this works via the following mechanism: when the thread is joined, the jffs2_gcd_mtd0 process it spawned is left without a parent. It is therefore inherited by init, which reaps it when it finishes.
If anyone would like to correct/expand on my explanation above, please do!