c linux file

How can we tell whether there is a hole in a file in C?


From the man page:

lseek() allows the file offset to be set beyond the end of the file (but this does not change the size of the file). If data is later written at this point, subsequent reads of the data in the gap (a "hole") return null bytes ('\0') until data is actually written into the gap.

A read in the gap returns null bytes, but what if the file's actual data is also '\0'? How can I tell whether the null bytes come from a hole or are real '\0' data in the file?


Solution

  • "Hole" is perhaps a misnomer: there is no literal hole in the file. Let's take a "hole" to mean a region of a sparse file for which storage has not yet been allocated. There is no standard way (yet) to detect such unallocated storage in files. For all intents and purposes, the "holes" are part of the file and read back as zero bytes; they are simply not written to disk.

    On some systems, the lseek system call may support seek modes named SEEK_HOLE and SEEK_DATA. This is a nonstandard extension that is present in some versions of Linux, BSD, and Solaris.

    These move the file offset to the start of the next hole at or after the given offset (SEEK_HOLE), or to the next "real" data at or after the given offset (SEEK_DATA). If the offset is already inside a hole or inside data, respectively, it stays where it is.

    Some systems, notably Linux, also support an ioctl named FIEMAP (FS_IOC_FIEMAP) that maps logical file positions to physical extents, which also lets you detect unallocated storage.
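
As a rough illustration of the SEEK_DATA / SEEK_HOLE approach, here is a minimal sketch that assumes Linux with glibc (hence the _GNU_SOURCE define). The names hole_stats, scan_holes and make_sparse_tempfile are hypothetical helpers invented for this example, not part of any library. make_sparse_tempfile creates a hole the way the man page describes (seek past the end, then write), and scan_holes walks the file by alternating the two seek modes.

```c
#define _GNU_SOURCE /* assumption: glibc, which exposes SEEK_DATA / SEEK_HOLE this way */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
struct hole_stats {
    off_t data_bytes; /* bytes the filesystem reports as data */
    off_t hole_bytes; /* bytes the filesystem reports as holes */
    int   holes;      /* number of hole regions before end of file */
};

/* Walk the file by alternating SEEK_DATA and SEEK_HOLE.
 * Returns 0 on success, -1 on error. Note: this moves the file offset. */
int scan_holes(int fd, struct hole_stats *out)
{
    assert(fd >= 0 && out != NULL);
    memset(out, 0, sizeof *out);

    off_t end = lseek(fd, 0, SEEK_END);
    if (end < 0)
        return -1;

    off_t pos = 0;
    while (pos < end) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno != ENXIO)
                return -1;
            /* ENXIO: no data at or after pos, so the rest is a trailing hole. */
            out->hole_bytes += end - pos;
            out->holes++;
            break;
        }
        if (data > pos) { /* the range [pos, data) is a hole */
            out->hole_bytes += data - pos;
            out->holes++;
        }
        /* Every file has an implicit hole at EOF, so this finds an offset > data. */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole <= data)
            return -1; /* error, or the file changed underneath us */
        out->data_bytes += hole - data;
        pos = hole;
    }
    return 0;
}

/* Hypothetical demo helper: 4 KiB of data, a gap, then 4 KiB more at 1 MiB.
 * Returns an open descriptor for an already-unlinked temp file, or -1. */
int make_sparse_tempfile(void)
{
    char path[] = "/tmp/sparse-demo-XXXXXX";
    char block[4096];
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */
    memset(block, 'x', sizeof block);
    if (write(fd, block, sizeof block) != (ssize_t)sizeof block ||
        lseek(fd, 1 << 20, SEEK_SET) < 0 || /* seek past EOF: creates the gap */
        write(fd, block, sizeof block) != (ssize_t)sizeof block) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling scan_holes on the descriptor returned by make_sparse_tempfile would typically report one hole on a filesystem that supports sparse files. Keep in mind that the result reflects what the filesystem chooses to report: a filesystem is allowed to treat the whole file as data (with only the implicit hole at end of file), and zero bytes that were explicitly written usually show up as data, not as a hole. So this tells you about allocation, not about whether the bytes are zero.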