Child process not reading from pipe, unless parent calls printf before dup2

The code below forks a child process and redirects the parent's stdout to a pipe. The child is supposed to read from the pipe, but it never does. Strangely, if the parent calls printf at least once before calling dup2, things seem to work. I suspect that is luck and not something to rely on, but an explanation would still be great. More importantly, why is the child unable to read?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    int fd[2]; 
    pid_t p;
   
    if(pipe(fd) == -1) 
        return -1;
    
    if((p=fork()) == -1)
        return -1;

    if(p==0){
        close(fd[1]);
        dup2(fd[0],0);
        fprintf(stderr,"Child starts\n");
        int x;
        scanf("%d",&x);                   // GETS STUCK HERE
        fprintf(stderr,"Child ends\n");  
        exit(0);
    }

    // printf(" ");        // THIS PRINTF SEEMS TO RESOLVE THE ISSUE?!  

    close(fd[0]);
    dup2(fd[1],1);
    printf("3\n");
    fprintf(stderr, "Parent waiting\n");
    wait(0);
    fprintf(stderr, "Parent ends\n");
}

Lots of questions have been asked on fork and pipe but I could not find an answer to my problem. Apologies if it is a duplicate question.

Solution

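The parent's stdout is not line buffered, because by the time the stream is first used it already refers to a pipe, so stdio makes it fully buffered. The data that printf("3\n"); outputs therefore stays in the stream's buffer and is not flushed to the pipe. The parent then blocks in wait(0) while the child blocks in scanf, so the buffer is never flushed and neither process makes progress.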

There are two ways to fix this (in the parent):

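add setlinebuf(stdout); immediately before that printf (setlinebuf is a BSD/glibc extension; the standard C equivalent is setvbuf(stdout, NULL, _IOLBF, 0);)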
add fflush(stdout); immediately after that printf
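
As a minimal sketch of the fflush variant, here is the same logic wrapped in a function so it can be called from a larger program. The name send_number_to_child, the saved_stdout bookkeeping and the use of the child's exit status to report what it read are assumptions made for illustration; they are not part of the original program.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends "3\n" to a forked child through a pipe that
 * temporarily replaces stdout. Returns the number the child read, or -1. */
int send_number_to_child(void)
{
    int fd[2];
    int status;
    int result = -1;
    pid_t p;

    if (pipe(fd) == -1)
        return -1;

    fflush(stdout);                  /* nothing pending may leak into the pipe or the child */
    int saved_stdout = dup(1);       /* lets us restore the caller's stdout afterwards */
    if (saved_stdout == -1)
        return -1;

    if ((p = fork()) == -1)
        return -1;

    if (p == 0) {                    /* child: read one integer from the pipe */
        int x = -1;
        close(fd[1]);
        close(saved_stdout);
        dup2(fd[0], 0);
        close(fd[0]);
        if (scanf("%d", &x) != 1)
            _exit(255);
        _exit(x);                    /* report the value through the exit status */
    }

    close(fd[0]);
    dup2(fd[1], 1);                  /* parent: stdout now refers to the pipe */
    close(fd[1]);

    printf("3\n");
    fflush(stdout);                  /* the fix: push the buffered bytes into the pipe */

    if (waitpid(p, &status, 0) != -1 && WIFEXITED(status))
        result = WEXITSTATUS(status);

    dup2(saved_stdout, 1);           /* put the original stdout back */
    close(saved_stdout);
    return result;
}
```

Calling send_number_to_child() from main should return 3. Removing the second fflush brings back the hang whenever stdout ends up fully buffered.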

UPDATE:

And adding the extra printf() fixes it why? I suppose that at that point, the underlying file descriptor for standard output is still a terminal so it puts the stream into line-buffered mode, and that doesn't change when the underlying file descriptor is changed to a pipe. – Jonathan Leffler

Yes, that is correct. If we change:

printf(" ");

Into:

setlinebuf(stdout);

Then, that also works.

That is, the printf to a real tty implicitly set line-buffered mode.

In practice (e.g. with glibc), the test for whether the output is a tty device is deferred until the first I/O operation on the stream (e.g. printf).

The dup2 is transparent to the stream so the stream's flags stay the same.

UPDATE:

Is it possible to handle it from the child side?

No. The unsent data sits in the parent's stdio buffer, which lives in the parent's memory, and the child has a separate memory space. The child's memory starts out as a copy of the parent's, but it is independent: changes to it do not affect the parent's memory (or vice versa).

More on this below. But remember that the child's scanf is waiting for a number, which is a digit string followed by a character that ends it. The terminator here is the newline.

For example, what if the parent execv's a different program whose output is to be read by the child? – cobra

We have to be careful with terminology.

"parent/child" is a concept of fork but not execv. So, when doing the execv in the parent we are not talking about the child (or any child). We are talking about the "target program" of the exec.

The target program has the same issue as before.

Streams are a userspace/application concept. The OS kernel has no knowledge of them. The kernel only knows about file descriptors: the things that come from open, pipe, socket, etc.

stdio streams are just a buffering mechanism for I/O. They exist only in a given process's memory. They are an "efficiency" mechanism to prevent excessive/repeated/wasteful syscalls (i.e. write calls) for small amounts of data.

The concept of "line buffering" is just a flag (and the action it takes) inside a stream struct. Output streams attached to a TTY default to line buffering: the buffer is flushed whenever a newline is written.

Other things like files (or pipes ;-) default to full ("standard") buffering with a fixed-size buffer (e.g. 4096 bytes). They are flushed only when the buffer fills up. "Flushing" means that the stdio layer writes to the underlying file descriptor (via the write syscall).

When an execv is done, the memory is completely replaced and loaded from the target program's executable file contents.

The target program gets control and its crt0.o initialization code runs. It must create stdin/stdout/stderr from scratch, without regard to what these may have been in the memory space before the execv (which no longer exists).

So, by the time the main function of the target program gets control, stdout is attached to the pipe.

It has the same buffering issue. In fact, the printf(" ") done before would have no effect on the state of the target's stdout stream.

The target would never see stdout (FD 1) as a TTY. It has already been set up as a pipe (so it gets full buffering). The only remedy would be for the target program itself to use setlinebuf or do periodic fflush calls.

If that program just outputs data to stdout and then terminates, the stdout stream (if the target even uses stdio at all) is flushed on exit.

Otherwise, the target program has the same issues as the original program.

The programmer is expected to know they are dealing with a pipe (because they set it up) and handle the buffering accordingly.