I have a multi-part question about Linux's read(2)/write(2) system calls:
1. Where exactly is the copy behavior, as described in the title, stated?
I've tried looking through the Linux Manual Page(2) but didn't find this explicitly stated. Yet, many discussions claim that the man page "clearly states" this behavior.
2. During a read(2)/write(2) call, does the copy that takes place in kernel space actually copy the contents of page table entries?
Textbooks on operating systems mention that memory management is done via page tables, which map memory to the file system. The page table is obviously a kernel-space object. These two concepts are often not linked: When discussing interfaces, people say 'read/write(2) involves a kernel-space copy,' and when discussing operating systems, they say 'memory is managed using page tables.'
3. As mentioned in the title, given that I have not found explicit information on this in textbooks on operating systems, I am curious: is the kernel-space copy during the read/write process a standard design, or is it something unique to Linux?
It is described in the manual for `sendfile`. For `read` and `write`, copying is an implementation detail: the programmer doesn't necessarily have to know about it. For `sendfile`, it is part of the rationale: it explains what makes it different from the already existing calls.
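
To make the difference concrete, here is a minimal sketch in C that assumes Linux and two already-open file descriptors. The helper names `copy_read_write` and `copy_sendfile` are made up for illustration. The first helper moves every byte through a user-space buffer; the second asks the kernel to move the data between the descriptors itself, which is the behavior the `sendfile` manual contrasts with `read` plus `write`.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy in_fd to out_fd with read/write.
 * Each chunk travels kernel -> buf (user space) -> kernel.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_read_write(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;               /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry */
            return -1;
        }
        /* write may accept fewer bytes than requested, so loop */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}

/* Hypothetical helper: copy up to count bytes with sendfile.
 * No user-space buffer is involved; in_fd must be a file that
 * supports mmap-like operations (a regular file, not a socket).
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_sendfile(int in_fd, int out_fd, size_t count)
{
    size_t left = count;

    while (left > 0) {
        /* NULL offset: read from, and advance, in_fd's file offset */
        ssize_t n = sendfile(out_fd, in_fd, NULL, left);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        left -= (size_t)n;
    }
    return (ssize_t)(count - left);
}
```

Both helpers produce the same bytes in the output file; the only difference is whether the data passes through the process's own memory on the way.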
In general, `read`/`write` can be implemented in one of three ways:
`read` - the kernel pre-reads in advance, and the `read` syscall just copies the data.
The manual allows for options 1 and 2, but in practice, option 1 is the most used, as it is usually faster. On Windows, WriteFile/ReadFile also allow option 3, but only if the program specifically requests it.
Page tables do not map anything to the filesystem - they map virtual addresses of the current process and the kernel to the physical addresses of RAM or device MMIO registers. A `write` might trigger a memory allocation of a file buffer, but otherwise, file operations have nothing to do with page tables.
There is another system call, `mmap`. It modifies the page tables so that the kernel file buffer appears in your process's memory. That way, you can directly modify the kernel-side buffers with normal memory reads and writes. The kernel can then order the disk controller to store them on the disk.
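
As a rough illustration of this, the sketch below (C, assuming Linux or another POSIX system) maps a file with `MAP_SHARED` and changes its first byte with an ordinary store instead of calling `write`. The function name `set_first_byte_via_mmap` is hypothetical, and the `msync` call is only there to ask for the write-back explicitly; without it the kernel would still flush the page at a time of its own choosing.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first byte of a non-empty file
 * through a shared memory mapping. Returns 0 on success, -1 on error. */
int set_first_byte_via_mmap(const char *path, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* MAP_SHARED: stores go to the kernel's file pages, not a private copy */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    p[0] = value;                   /* a plain memory write, no syscall */

    /* Ask the kernel to write the modified page back to storage now */
    int rc = msync(p, len, MS_SYNC);
    munmap(p, len);
    return rc;
}
```

After the call, a normal `read` of the same file returns the modified byte, because the mapping and `read` are served from the same kernel-side file pages.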