If I want to concatenate two very large files residing on the same filesystem (say ext3 or ext4), does Linux provide an API to do it programmatically by reading and modifying the inode direct/indirect pointers of the two files, and updating the file size and superblock values? If so, is there any documentation on the API/header files for that?
Note: I am aware of built-in Linux binaries like cat, tee, etc. which could be used, but my question is about achieving this programmatically.
Yes, depending on what "concatenate" means, how low level the code doing the work is, and which file system is involved.
Low level, impractical, difficult, especially for ext3 & ext4. Suppose we wish to do the equivalent of cat foo bar | sponge foo, but without anything but metadata being read or overwritten. In this case foo would have to be an exact blocksize multiple, and the trick would be to get the inodes and dir structure of both files, rm bar, unmount the file system, and tweak the relevant inode however you please (say with dd and some hex editor), in such a way as not to wreck anything else.
Depending on the file system, that might be difficult, and might require updating or modifying some other affected or obstructing data structure.
If foo is not an exact blocksize multiple there'd be garbage data in the middle of the concatenated file.
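To make the blocksize condition concrete, here is a minimal sketch that reports how many bytes of slack sit between the end of a file and the end of its last block. Those are the bytes that would show up as garbage in the middle of the result. The helper names slack_bytes and tail_slack are made up for illustration, and the sketch assumes that f_bsize from os.statvfs matches the file system's allocation block size, which is typically true for ext3/ext4 but is not guaranteed everywhere.

```python
import os


def slack_bytes(size, block_size):
    """Bytes between end-of-file and the end of the last block (0 if aligned)."""
    remainder = size % block_size
    return 0 if remainder == 0 else block_size - remainder


def tail_slack(path):
    """Return (block_size, slack) for the file at `path`.

    Assumption: statvfs().f_bsize equals the allocation block size of the
    file system holding `path`.
    """
    size = os.stat(path).st_size
    block_size = os.statvfs(path).f_bsize
    return block_size, slack_bytes(size, block_size)


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        block_size, slack = tail_slack(name)
        status = "aligned" if slack == 0 else f"{slack} bytes of slack"
        print(f"{name}: block size {block_size}, {status}")
```

If the slack for foo is nonzero, a pure metadata splice cannot give a byte-exact concatenation, so some data would have to be copied after all.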
Cheat. Use a file system with in-line deduplication. Btrfs should have that feature someday.