
Does truncating a file affect the cached contents?


Let's say I write some amount to some file A, then I close and reopen the file using open("A", O_TRUNC | O_WRONLY);. After this, I write the same amount again.

Will this be slower than a program that doesn't truncate? In other words, will truncating also remove the cache lines of the content of the file, meaning that the subsequent write operations will always be a cache miss?


Update:

As advised I've measured the difference, using the following code:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define M 1024*128 // 128 KB

// wall-clock time in seconds, based on gettimeofday()
double get_seconds(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(int argc, char *argv[]) {
    double start = get_seconds();

    int out = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
    int k;
    char buf[M];

    for(k=0; k<M; k++) {
        buf[k] = '0';
    }

    write(out, buf, sizeof(buf));
    close(out);

    printf("%f\n", get_seconds() - start);
    return 0;
}

With a script, I executed this code, and an identical version without the O_TRUNC, 1000 times in rapid succession, with different input filenames for each version. I get the following averaged results:

Filesize   Without O_TRUNC   With O_TRUNC   Slowdown (with/without)
32 KB      0.0000603 s       0.0001182 s    1.96X
64 KB      0.0001004 s       0.0001759 s    1.75X
128 KB     0.0001621 s       0.0002640 s    1.63X
256 KB     0.0002554 s       0.0004336 s    1.70X
512 KB     0.0004284 s       0.0007537 s    1.76X
1024 KB    0.0007371 s       0.0011899 s    1.61X

Thus, there is an observable difference: the version with O_TRUNC is consistently about 1.6 to 2 times slower.


Solution

  • Let's say I write some amount to some file A, then I close and reopen the file using open("A", O_TRUNC | O_WRONLY);. After this, I write the same amount again.

    Will this be slower than a program that doesn't truncate? In other words, will truncating also remove the cache lines of the content of the file, meaning that the subsequent write operations will always be a cache miss?

    Of course it will be slower: besides writing the data twice, the kernel must deallocate and then reallocate the file's blocks in between. The kernel does not know that you are about to write the same data again, so it cannot predict this and keep the buffers around for your next use. The cache of disk blocks is indexed by block device and block number, so when you need a block that is already cached you get it from memory instead of reading it from disk. But once you deallocate the blocks, they no longer belong to the file; the kernel is free to reallocate them to the same file or to a different one (and most of the time they are reallocated immediately, since the kernel manages the blocks of every process in one shared pool; it does not reserve a block for you just because you were the one who freed it).

    The cache is updated so that other processes using the same file see the changes immediately as they are made. But when you open the file the second time with the O_TRUNC flag, you make the kernel free all the blocks that were in use by the first set of operations. Those blocks were allocated, one write() call at a time, as you wrote the file the first time. Closing the file did nothing else to it.

    When you opened the file the second time, you told the kernel "I want a new, clean, empty file to start writing again." That forced it to free every block allocated to the file: all of them had to go back to the free-block list, and the indirect blocks, if any, had to be freed as well. This happens only on the filesystem's data structures, in memory, so as long as you don't exceed the capacity of the buffer cache nothing is actually written to disk. The flushing of blocks to disk doesn't stall the running process, except when it has to wait for free space in the buffer cache. With small files that will not happen, but if you create a big file, close it, and create another big file, it is quite improbable that you won't overrun the buffer cache, and then you will have to wait for buffer blocks to become available.
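
    For illustration only, the same block-freeing work can be triggered explicitly with ftruncate(2); the following is just a sketch (error handling kept to a minimum), assuming the file already contains data:

    #include <fcntl.h>
    #include <unistd.h>

    /* Sketch: opening without O_TRUNC and truncating explicitly makes the
     * kernel do the same block freeing that open(..., O_TRUNC) does at
     * open time. */
    int truncate_in_place(const char *path) {
        int fd = open(path, O_WRONLY);   /* the file keeps its blocks so far */
        if (fd < 0)
            return -1;
        if (ftruncate(fd, 0) < 0) {      /* here the kernel frees every data
                                            block and indirect block, just as
                                            O_TRUNC would have done */
            close(fd);
            return -1;
        }
        return fd;                       /* ready to be rewritten from offset 0 */
    }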

    Looking at your measurements, I have to say that there is no penalty for the data being the same or different. The difference in performance comes from the fact that with O_TRUNC you threw away all of the file's blocks only to allocate them again afterwards. Had you reopened the file and written it without truncating, the same blocks would simply have been overwritten in the cache: since the file was not truncated, there is no need to free disk blocks or to allocate new ones. Truncation exists to give a file entirely new contents; if you write over an existing file without truncating, the new data lands on the same blocks that were already in use (those blocks are still assigned to the file and are most probably still in the cache, making the deallocation/reallocation unnecessary), as the sketch below illustrates.
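
    A minimal sketch of that faster path (the helper name is mine, not from your code): reopen without O_TRUNC and overwrite from offset 0, so the existing, most likely cached, blocks are simply reused:

    #include <fcntl.h>
    #include <unistd.h>

    /* Sketch: rewrite a file in place without truncating it.  The file keeps
     * its blocks, so the kernel neither frees nor reallocates anything; the
     * write() just dirties pages that are most probably already in the cache. */
    ssize_t overwrite_in_place(const char *path, const char *buf, size_t len) {
        int fd = open(path, O_WRONLY);   /* no O_TRUNC: blocks stay allocated */
        if (fd < 0)
            return -1;
        /* open() leaves the offset at 0, so the new data overlays the old */
        ssize_t n = write(fd, buf, len);
        close(fd);
        return n;
    }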

    You can run another experiment whose result will probably fall between your two measurements: create a new, nonexistent file (leaving the old one untouched) and write the same content to it. You pay the penalty of allocating blocks, but not the penalty of freeing them, so the numbers should land between the two columns you have shown.
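
    A sketch of that middle case (the O_EXCL flag and the helper name are my own illustration): write the same buffer to a file that does not exist yet, so blocks are allocated but none have to be freed:

    #include <fcntl.h>
    #include <unistd.h>

    /* Sketch: the "fresh file" case -- allocation cost only, no freeing cost.
     * O_EXCL guarantees we really measure writing to a brand-new file. */
    ssize_t write_fresh_file(const char *newpath, const char *buf, size_t len) {
        int fd = open(newpath, O_WRONLY | O_CREAT | O_EXCL, 0666);
        if (fd < 0)                      /* fails if newpath already exists */
            return -1;
        ssize_t n = write(fd, buf, len); /* every block written is newly allocated */
        close(fd);
        return n;
    }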

    There is yet another experiment, one I was asked about here a long time ago: create a 10 GB file full of random data, make 150 hard links to it, and then remove the names one by one, timing each rm command. You will see that the first 149 are almost instantaneous, but the last one takes a while. Why? Because removing the last link requires the kernel to free all the blocks used by the file, and that takes time: the final unlink(2) system call takes much longer to execute. Freeing disk blocks can sometimes require a large amount of resources.
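
    A rough sketch of that experiment in C (the file name big.dat, the link count and the timing code are my own illustration; the big file is assumed to exist already):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/time.h>

    /* Sketch: remove every hard link of a big file and time each unlink().
     * Only the last one, which drops the link count to zero, has to free
     * the file's blocks, so it is the one that takes noticeably longer. */
    int main(void) {
        char name[64];
        int i;

        for (i = 0; i < 150; i++) {      /* create the 150 extra links */
            snprintf(name, sizeof(name), "link%03d", i);
            link("big.dat", name);
        }
        unlink("big.dat");               /* cheap: 150 links still remain */

        for (i = 0; i < 150; i++) {
            struct timeval t0, t1;
            snprintf(name, sizeof(name), "link%03d", i);
            gettimeofday(&t0, NULL);
            unlink(name);                /* the last one frees all the blocks */
            gettimeofday(&t1, NULL);
            printf("%s: %.6f s\n", name,
                   (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6);
        }
        return 0;
    }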

    Remember that system calls are executed by the process that requests them (running in kernel mode instead of user mode), so, in general, your process is the one that pays the cost of freeing the blocks (I don't know whether anybody has yet taken the time to offload the freeing of a file's blocks to a kernel thread or not).