linux awk sed file-processing

How to remove certain lines from a large file (>5G) using Linux commands


I have very large files (> 5G), and I want to remove some lines by line number without moving (copying and pasting) the files.

I know this command works for a small file (my sed does not recognize the -i option):

sed "${line}d" file.txt > file.tmp && mv file.tmp file.txt

This command takes a relatively long time because of the file size. I just need to remove the first and last lines, but I would also like to know how to remove line number n, for example.


Solution

  • Because of the way files are stored on standard filesystems (NTFS, EXTFS, ...), you cannot remove parts of a file in-place.

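    The only things you can do in-place are appending data to the end of the file and overwriting existing bytes without changing the file's size.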

    Other operations must use a temporary file, or read the whole file into memory and write it back modified.

    EDIT: you can also "shrink" (truncate) a file using a C program (this works on Linux or Windows), which means you could remove the last line in-place (but still not the first line or any line in between).
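
To make the two cases concrete, here is a minimal shell sketch. It assumes GNU coreutils: the `truncate` command is used in place of a C program, and `tail` is assumed to read from the end of a regular file rather than scanning all of it. The helper names (`drop_first_line`, `drop_line_n`, `drop_last_line`) are made up for illustration. The first two still rewrite the whole file through a temporary file; only the last one works in-place.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Remove the first line. Needs a temporary file: every remaining byte
# has to be rewritten, so this is slow on a multi-gigabyte file.
drop_first_line() {
    tail -n +2 "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Remove line N (1-based). Also needs a temporary file.
drop_line_n() {
    awk -v n="$2" 'NR != n' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Remove the last line in-place by shrinking the file.
# Assumes GNU coreutils `truncate` is available.
drop_last_line() {
    # Byte length of the last line, including its trailing newline if any.
    last_len=$(tail -n 1 "$1" | wc -c)
    # "-s -N" reduces the file size by N bytes; no data is copied.
    truncate -s "-$((last_len))" "$1"
}
```

Usage would look like `drop_last_line file.txt` or `drop_line_n file.txt 42`. Note that the temporary-file helpers need enough free space for a second copy of the file while they run.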