c macos pipe posix named-pipes

Why is a FIFO pipe on macOS ~8x slower than an anonymous pipe?


On an M1 Max, I created a named pipe (FIFO) with mkfifo and am testing write/read throughput with a simple C program and pv. The program writes 65536 bytes at a time to stdout. With ./writer | pv > /dev/null, I get ~8 GiB/s. With ./writer >> mypipe and pv mypipe > /dev/null, I get ~1 GiB/s. If I print the number of writes performed in each case, the ratio between the two is about the same 8x. I have not yet tested this on Linux, and I have not found any fcntl on macOS/Darwin that can change the buffer size of a FIFO.

What I'd like to know is:

This is the C program:

#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main() {
    const int size = 65536;
    char buf[size];
    memset(buf, 0, size);
    while (1) {
        if (write(1, buf, size) != size) {
            fprintf(stderr, "bad\n");
        }
    }
    return 0;
}

I've verified that the most I can write before an anonymous pipe blocks is 65536 bytes (M=0; while printf A; do >&2 printf "\r$((++M)) B"; done | sleep 999).


Solution

  • In macOS (14.4.1), FIFOs are ultimately backed by AF_UNIX sockets, and their limits are dictated by https://github.com/apple-oss-distributions/xnu/blob/94d3b452840153a99b38a3a9659680b2a006908e/bsd/kern/uipc_proto.c#L84-L92 and https://github.com/apple-oss-distributions/xnu/blob/94d3b452840153a99b38a3a9659680b2a006908e/bsd/kern/uipc_usrreq.c#L920-L933. At the time of writing, the send and receive buffer size of a stream socket (SOCK_STREAM) there is #define PIPSIZ 8192. Anonymous pipes are a different mechanism, and from testing they appear to have a buffer size of at least 65536 bytes, which is 8x PIPSIZ.

    From running dtrace on the writer side, it appears each write syscall does return 65536. However, I believe the read buffer is constrained to 8192, so the reader ends up paying the cost of context switching for every 8192 bytes: read gets 8192 bytes in the kernel, returns to user space, and is called again. So while the write side stays in the kernel writing those 65536 bytes (which, mind you, saves some context switching, so that's nice), the read side has no choice but to keep switching back and forth.

    As for how to increase this number or change some flags, I haven't been able to find a way, since the soreceive function that the uipc code calls doesn't seem to accept any user-set I/O flags. As a workaround, you could create a pipe pair in-process and fork so that the reader and writer share it, but that may be too contrived.

    Notes: