bash awk csplit

Split file by extracting lines between two keywords


I have a file with the following lines:

string
string
string
MODEL 1
.
.
.
TER
string 
string
string
MODEL 2
.
.
.
TER

where there are 5000 such MODELs. I want to split this file such that each section beginning MODEL X and ending TER (shown with dots) is saved to its own file, and everything else is discarded. How can I do this? Possibly with awk or split?

I have checked a couple of other similar questions, but failed to apply the answers to my case.

Also note that I use Mac OS X.


Solution

  • You can do this with awk:

    awk '/^MODEL/{file="model" $2} file{print > file} /^TER/{close(file); file=""}' file
    

    How it works:

    /^MODEL/               # match lines starting with MODEL
    file="model" $2        # set file to "model" + the model number from column 2
    file{...}              # execute only if the file variable is set
    {print>file}           # print each record to file
    /^TER/                 # match lines starting with TER
    {close(file); file=""} # close file and reset file to ""
    

    Then verify as:

    cat model1
    MODEL 1
    .
    .
    .
    TER
    
    cat model2
    MODEL 2
    .
    .
    .
    TER
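
To try the command without touching real data, here is a minimal end-to-end sketch. The scratch directory, the file name input.txt, and the sample lines are assumptions made for illustration; only the awk program comes from the answer above, laid out over several lines with one comment per rule.

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to inspect and remove.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: two MODEL ... TER sections surrounded by lines to discard.
cat > input.txt <<'EOF'
string
MODEL 1
a
b
TER
string
string
MODEL 2
c
TER
EOF

# Same program as the one-liner above, one rule per line.
awk '
  /^MODEL/ { file = "model" $2 }        # section start: pick the output file name
  file     { print > file }             # inside a section: copy the line to that file
  /^TER/   { close(file); file = "" }   # section end: close the file and stop copying
' input.txt

# List what was produced (input.txt plus one file per MODEL section).
ls
```

Calling close(file) after each TER matters with 5000 sections: without it, awk keeps every output file open and may hit the limit on open file descriptors.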