I was wondering how to filter the following lines in AWK:
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
computer functions. "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for
the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP
structures. Similar to CONVERT. "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.
So I can get something like this:
DSL
FLIP
I am using the following AWK program:
BEGIN { RS = "\n\n\n" ; FS = " - " }
{ print $1 }
But what I get is just this:
DSL
Thanks in advance!
@JonathanLeffler gave you a good awk answer to your specific question. If you're going to be working with files in that format a lot, though, you may want to consider reformatting them so that records are separated by blank lines and each list item is on a single line, e.g.:
$ cat file
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
computer functions. "DSL/90 - A Digital Simulation Program for Continuous
System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for
the IBM 7090. Sammet 1969, p.632.
FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP
structures. Similar to CONVERT. "FLIP, A Format List Processor", W.
Teitelman, Memo MAC-M-263, MIT 1966.
$ awk '!/^[[:space:]]*$/{printf "%s%s", (NF==2 && /-[[:space:]]*$/ ? rs rs : (/^ +[[:digit:]]+\./ ? rs : "")), $0; rs="\n"} END{print ""}' file
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog computer functions. "DSL/90 - A Digital Simulation Program for Continuous System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for the IBM 7090. Sammet 1969, p.632.

FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP structures. Similar to CONVERT. "FLIP, A Format List Processor", W. Teitelman, Memo MAC-M-263, MIT 1966.
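If the list items in your file are not indented, the `^ +` test in the command above won't recognize them. A variant that keys on the leading number alone and joins wrapped lines with a single space might look like the sketch below. It assumes blank lines between records and that no wrapped continuation line happens to start with a number followed by a period; `sample.txt` and the shortened sample text are made up for illustration.

```shell
# Shortened, unindented sample input (hypothetical file name).
cat > sample.txt <<'EOF'
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog
computer functions.

FLIP -
1. Early assembly language on G-15.
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3. Formal LIst Processor. Early language for pattern-matching on LISP
structures.
EOF

reformatted=$(awk '
    /^[ \t]*$/ { next }                      # drop blank lines
    NF == 2 && /-[ \t]*$/ {                  # header line such as "DSL -"
        printf "%s%s", (seen ? "\n\n" : ""), $0
        seen = 1
        next
    }
    /^[ \t]*[0-9]+\./ {                      # start of a numbered item
        sub(/^[ \t]+/, "")
        printf "\n%s", $0
        next
    }
    {                                        # continuation of the previous item
        sub(/^[ \t]+/, "")
        printf " %s", $0
    }
    END { print "" }
' sample.txt)

printf '%s\n' "$reformatted"
```

The result has the same layout as the output above (header, one line per item, a blank line between records), so the paragraph-mode (`RS=""`) commands work on it unchanged.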
That way you can easily process the output to print or do whatever else you want, e.g.:
1) to print every header line plus its first bullet item:
$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} {print $1,$2}'
DSL -
1. Digital Simulation Language. Extensions to FORTRAN to simulate analog computer functions. "DSL/90 - A Digital Simulation Program for Continuous System Modelling", Proc SJCC 28, AFIPS (Spring 1966). Version: DSL/90 for the IBM 7090. Sammet 1969, p.632.

FLIP -
1. Early assembly language on G-15. Listed in CACM 2(5):16 (May 1959).
2) to print the header line plus the second bullet item of just the "FLIP" record:
$ awk '...' file | awk 'BEGIN{RS=""; ORS="\n\n"; FS=OFS="\n"} /^FLIP -/{print $1,$3}'
FLIP -
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
3) to print the header line plus a count of the bullet items for that header:
$ awk '...' file | awk 'BEGIN{RS=""; FS=OFS="\n"} {print $1 NF-1}'
DSL - 1
FLIP - 3
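As one more sketch along the same lines, here is a way to print just the names, which is what the question originally asked for. It assumes the reformatted text (blank-line-separated records, header on the first line of each) has been saved to a file; `reformatted.txt` and the shortened sample are made up for illustration.

```shell
# Shortened sample of the reformatted layout (hypothetical file name).
cat > reformatted.txt <<'EOF'
DSL -
1. Digital Simulation Language.

FLIP -
1. Early assembly language on G-15.
2. "FLIP User's Manual", G. Kahn, TR 5, INRIA 1981.
EOF

names=$(awk 'BEGIN { RS = ""; FS = "\n" }    # paragraph mode: one record per block, one field per line
    {
        name = $1
        sub(/[ \t]*-[ \t]*$/, "", name)      # strip the trailing " -" (and any trailing blanks)
        print name
    }' reformatted.txt)

printf '%s\n' "$names"
```

Only the final dash on the header line is removed, so a name that itself contains a hyphen is left intact.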
etc., etc.