bash csv awk

How to add a field to a CSV after deriving its value from each line read


In a bash script, I'm trying to process a simple CSV with awk/cut/sed. The file looks like this:

id,version
84544,abcd v2.1.0-something
3439,abcd a82f1a
3,abcd 2.2.1-bar

Where

  1. abcd is constant
  2. the string after the space may include a single letter "v" prefix
  3. the string after the space may have a build suffix starting with a dash, and
  4. the string may instead be a 6 character hex string.

I'm trying to add a field to the output CSV that reduces the second field to just the stripped version number or the hex string.

I've figured out how to parse each line and extract the value I want using this:

while read line; do
    awk -F '[, ]' '{print $3}' | sed 's/^v//' | cut -d- -f-1;
done < input.csv > output.csv

What I can't figure out is how to assemble the output into a CSV like this:

id,version,simple_version
84544,abcd v2.1.0-something,2.1.0
3439,abcd a82f1a,a82f1a
3,abcd 2.2.1-bar,2.2.1

Any guidance on constructing the output?


Solution

    One awk idea that replaces all of OP's current code:

    awk '-F[, ]' '
    { split($3,a,"-")                                  # split 3rd field on "-" delimiter
      sub(/^v/,"",a[1])                                # strip a leading "v" from first split field
      print $0 "," (NR==1 ? "simple_version" : a[1])   # append new column: header name on line 1, else the stripped version
    }
    ' input.csv > output.csv
    
    #######
    # or as a one-liner:
    
    awk '-F[, ]' '{split($3,a,"-"); sub(/^v/,"",a[1]);print $0 "," (NR==1 ? "simple_version" : a[1])}' input.csv > output.csv
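
To see why `$3` holds the version token, it can help to print how the bracket-expression separator `[, ]` splits one data row. This is only an illustrative sketch; the helper name `show_fields` is made up for this example and is not part of the answer's code.

```shell
# Show how the regex field separator [, ] (comma OR space) splits a row.
# show_fields is a hypothetical helper used only for illustration.
show_fields() {
    awk -F'[, ]' '{ for (i = 1; i <= NF; i++) printf "$%d=%s\n", i, $i }'
}

# One sample data row taken from the question's input.
printf '%s\n' '84544,abcd v2.1.0-something' | show_fields
```

The comma ends field 1 and the space ends field 2, so the whole `v2.1.0-something` token lands in `$3`. The header row has only two fields, so `$3` is empty there, which is why the header is handled separately with `NR==1`.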
    

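    An alternative that eliminates the split() by adding "-" to the FS definition, so the build suffix falls into a separate field (this assumes the id and the constant "abcd" never contain a dash):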

    awk '-F[, -]' '
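    { f3=$3                                            # copy the 3rd field so $0 is left unmodified
      sub(/^v/,"",f3)                                  # strip a leading "v" from the copy
      print $0 "," (NR==1 ? "simple_version" : f3)     # append new column: header name on line 1, else the stripped version
    }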
    }
    ' input.csv > output.csv
    
    #######
    # or as a one-liner:
    
    awk '-F[, -]' '{f3=$3; sub(/^v/,"",f3); print $0 "," (NR==1 ? "simple_version" : f3)}' input.csv > output.csv
    

    Both of these generate:

    $ cat output.csv
    id,version,simple_version
    84544,abcd v2.1.0-something,2.1.0
    3439,abcd a82f1a,a82f1a
    3,abcd 2.2.1-bar,2.2.1
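
If you would rather keep the `while read` structure from the question, the same result can be sketched with shell parameter expansion alone. Note that in the question's loop, `read` consumes the first line and `awk` then reads the rest of stdin by itself, so the loop body effectively runs only once. The function name `add_simple_version` is made up for this sketch, and it assumes the exact layout shown in the question (one comma per row, no quoted fields).

```shell
# Sketch: append a simple_version column using only shell parameter expansion.
# Assumes rows look like "id,abcd <version>" with no quoted or embedded commas.
# add_simple_version is a hypothetical name chosen for this example.
add_simple_version() {
    local header id version simple
    # Pass the header row through with the new column name appended.
    IFS= read -r header && printf '%s,simple_version\n' "$header"
    while IFS=, read -r id version; do
        simple=${version#* }     # drop everything up to the first space ("abcd ")
        simple=${simple#v}       # drop an optional leading "v"
        simple=${simple%%-*}     # drop a build suffix starting at the first "-"
        printf '%s,%s,%s\n' "$id" "$version" "$simple"
    done
}

# Demo on the question's sample data; for a real file use:
#   add_simple_version < input.csv > output.csv
add_simple_version <<'EOF'
id,version
84544,abcd v2.1.0-something
3439,abcd a82f1a
3,abcd 2.2.1-bar
EOF
```

For large files the awk versions will generally be faster, since a shell loop processes one line per iteration.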
    
