c fork vfork

fork vs vfork functionality in a C program


I am doing some C exercise for self-learning, and came across the following problem:

Part a:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv) {

    int a = 5, b = 8;
    int v;

    v = fork();
    if(v == 0) {
        // a = 10
        a = a + 5;
        // b = 10
        b = b + 2;
        exit(0);
    }
    // Parent code
    wait(NULL);
    printf("Value of v is %d.\n", v); // line a
    printf("Sum is %d.\n", a + b); // line b
    exit(0);

}

Part b:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv) {

    int a = 5, b = 8;
    int v;

    v = vfork();
    if(v == 0) {
        // a = 10
        a = a + 5;
        // b = 6
        b = b - 2;
        exit(0);
    }
    // Parent code
    wait(NULL);
    printf("Value of v is %d.\n", v); // line a
    printf("Sum is %d.\n", a + b); // line b
    exit(0);

}

We have to compare the outputs of line a and line b.

The output of part a is:

Value of v is 79525.
Sum is 13.

The output of part b is:

Value of v is 79517.
Sum is 16.

It appears that in part a the sum uses the initial values of a and b, whereas in part b the sum includes the changes made inside the child process.

My question is - why is this happening?

According to this post:

The basic difference between the two is that when a new process is created with vfork(), the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve(), at which point the parent process continues.

The phrase "the parent process is temporarily suspended" doesn't make much sense to me. Does this mean that in part b the parent waits until the child process has finished running before it continues (which would explain why the child's changes show up in the sum)?

The problem statement also assumes that "the process ID of the parent process maintained by the kernel is 2500, and that the new processes are created by the operating system before the child process is created."

By this definition, what would the value of v for both programs be?


Solution

  • the parent process is temporarily suspended

    Basically, the parent process will not run until the child calls either _exit or one of the exec functions. In your example this means the child will run and therefore perform the summation before the parent runs and does the prints.

    As for:

    My question is - why is this happening?

    First, your part b has undefined behavior because you are violating the vfork semantics. Undefined behavior means the standard places no requirements on what the program does, so you cannot rely on it behaving in any particular way. See this SO post on undefined behavior for more details (it includes some C++, but most of the ideas are the same). From the POSIX specs on vfork:

    The vfork() function has the same effect as fork(2), except that the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit(2) or one of the exec(3) family of functions.

    So your part b could really do anything. However, you will probably see fairly consistent output from it. This is because vfork does not create a new address space. Instead, the child process "borrows" the address space of the parent, usually with the intent that it will call one of the exec functions and load a new program image. In your part b the child instead writes to the parent's variables. After the child has called exit (which is also invalid; it should call _exit), a will most likely equal 10 and b will most likely equal 6 in the parent, so the sum is 16 as shown in part b. I say "most likely" because, as stated above, this program has undefined behavior.

    For part a, where fork is used, the child gets its own copy of the address space, so its modifications are not seen in the parent. The value printed is therefore 13 (5 + 8).

    Finally, with regard to the value of v: this seems to be something the question states just to make the expected output well defined. In the parent, v holds the PID of the child (in the child it is 0), so it can be any valid value returned by vfork or fork. It is not tied to 2500, which the problem gives as the PID of the parent.