csv, text-processing, format-conversion

Converting from field: value format to CSV


I have a file in the following format (well, sort of):

RECORD_SEPARATOR
foo: some foo value
bar: another value
baz: 123
RECORD_SEPARATOR
foo: another foo value
bar: yet another value
baz: 345
RECORD_SEPARATOR
foo: a third foo
RECORD_SEPARATOR
bar: a fourth bar
baz: 111

and so on. The key point here is that not all records have all fields present.

My question: What's a super-simple way to convert this data into CSV format? That is, in my example

foo,bar,baz
some foo value,another value,123
another foo value,yet another value,345
a third foo,,
,a fourth bar,111

Of course you can write an awk (or Perl, or Python) script for this, but I'm hoping there's something pre-existing, or some trick to make it a very short script.

Note: I'm looking for something that's Unix-command-line-oriented of course.


Solution

  • You can do this with Miller (http://johnkerl.org/miller/doc). Starting from this input, in which blank lines (rather than RECORD_SEPARATOR lines) separate the records

    foo: some foo value
    bar: another value
    baz: 123
    
    foo: another foo value
    bar: yet another value
    baz: 345
    
    foo: a third foo
    
    bar: a fourth bar
    baz: 111
    

    you can run

    mlr --x2p --ips ": " --barred cat then unsparsify --fill-with "" inputFile
    

    and get this pretty-printed output (Miller's pretty-print format shows empty values as -)

    +-------------------+-------------------+-----+
    | foo               | bar               | baz |
    +-------------------+-------------------+-----+
    | some foo value    | another value     | 123 |
    | another foo value | yet another value | 345 |
    | a third foo       | -                 | -   |
    | -                 | a fourth bar      | 111 |
    +-------------------+-------------------+-----+
    

    If you want a CSV, run

    mlr --x2c --ips ": " cat then unsparsify --fill-with "" inputFile
    

    and you will have

    foo,bar,baz
    some foo value,another value,123
    another foo value,yet another value,345
    a third foo,,
    ,a fourth bar,111
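
If Miller is not available, a short Python script can do the same job and read the RECORD_SEPARATOR lines directly. This is only a minimal sketch and not part of the Miller approach above: the function name convert and the script name kv2csv.py are made up for illustration, and it assumes every data line has the form name: value and that records are delimited by RECORD_SEPARATOR lines or blank lines. Columns are emitted in order of first appearance.

```python
import csv
import io
import sys


def convert(text, separator="RECORD_SEPARATOR"):
    """Return CSV text for 'name: value' records; absent fields stay empty.

    Hypothetical helper: assumes records are delimited by `separator`
    lines or by blank lines.
    """
    records = []    # one dict per record, in input order
    header = []     # field names, in order of first appearance
    current = None  # the record currently being read, if any

    for line in text.splitlines():
        # A separator line or a blank line closes the current record.
        if line.strip() in ("", separator):
            if current:
                records.append(current)
            current = None
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if current is None:
            current = {}
        current[name] = value
        if name not in header:
            header.append(name)

    # The last record has no trailing separator.
    if current:
        records.append(current)

    out = io.StringIO()
    # restval="" fills in fields that a record does not have.
    writer = csv.DictWriter(out, fieldnames=header, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 kv2csv.py inputFile > output.csv
    with open(sys.argv[1], encoding="utf-8") as handle:
        sys.stdout.write(convert(handle.read()))
```

The csv module takes care of quoting, so values that contain commas or double quotes still produce valid CSV.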