In a bash script, I'm trying to process a simple CSV using awk/cut/sed. The input looks like this:
id,version
84544,abcd v2.1.0-something
3439,abcd a82f1a
3,abcd 2.2.1-bar
I'm trying to add a field to the output CSV that reduces the second field to just the stripped version number or the hex string.
I've figured out how to parse each line and strip out the data I'd like using this:
while read line; do
awk -F '[, ]' '{print $3}' | sed 's/^v//' | cut -d- -f-1;
done < input.csv > output.csv
I'm having trouble figuring out how to construct the output as a CSV like this:
id,version,simple_version
84544,abcd v2.1.0-something,2.1.0
3439,abcd a82f1a,a82f1a
3,abcd 2.2.1-bar,2.2.1
Any guidance on constructing the output?
Assumptions:
- abcd has no bearing on which/how rows are processed (otherwise OP will need to update the question with additional details and sample data)

NOTES:
- once awk is in the mix there's usually no need for sed and cut
- since awk is designed for line-by-line processing there's typically no need for a (bash) while loop

One awk idea that replaces all of OP's current code:
awk '-F[, ]' '
{ split($3,a,"-")                                  # split 3rd field on "-" delimiter
  sub(/^v/,"",a[1])                                # strip a leading "v" from first split field
  print $0 "," (NR==1 ? "simple_version" : a[1])   # append the header name (line 1) or the stripped version
}
' input.csv > output.csv
#######
# or as a one-liner:
awk '-F[, ]' '{split($3,a,"-"); sub(/^v/,"",a[1]);print $0 "," (NR==1 ? "simple_version" : a[1])}' input.csv > output.csv
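To see why $3 is the version token here, a quick sketch that prints how awk splits one sample row when the field separator is the bracket expression [, ] (comma or space). The fields variable is only an illustrative name, not part of the solution:

```shell
# Split one sample row on "," or " " and list each resulting field.
# "fields" is a hypothetical variable name used only for this demo.
fields=$(printf '%s\n' '84544,abcd v2.1.0-something' |
  awk '-F[, ]' '{ for (i = 1; i <= NF; i++) printf "$%d=%s\n", i, $i }')
printf '%s\n' "$fields"
# prints:
#   $1=84544
#   $2=abcd
#   $3=v2.1.0-something
```

On the header row the same separator yields only two fields, so $3 is empty there, which is why the header gets its column name through the NR==1 test instead.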
An alternative that eliminates the split() by expanding the FS definition:
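awk '-F[, -]' '
{ f3=$3                                            # with "-" in FS, 3rd field stops at the first "-"
  sub(/^v/,"",f3)                                  # strip a leading "v" from the copy
  print $0 "," (NR==1 ? "simple_version" : f3)     # append the header name (line 1) or the stripped version
}
' input.csv > output.csv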
#######
# or as a one-liner:
awk '-F[, -]' '{f3=$3; sub(/^v/,"",f3); print $0 "," (NR==1 ? "simple_version" : f3)}' input.csv > output.csv
Both of these generate:
$ cat output.csv
id,version,simple_version
84544,abcd v2.1.0-something,2.1.0
3439,abcd a82f1a,a82f1a
3,abcd 2.2.1-bar,2.2.1
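For anyone who wants to reproduce this end to end, here is a self-contained sketch of a third variant. It is not one of the two approaches above: it splits on commas only and trims a copy of the second field with three sub() calls. It assumes the second field is always one word, one space, then the version token, which may not hold for real data. The variable v is just an illustrative name.

```shell
# Recreate the sample input from the question.
cat > input.csv <<'EOF'
id,version
84544,abcd v2.1.0-something
3439,abcd a82f1a
3,abcd 2.2.1-bar
EOF

# Split on commas only and derive the new column from a copy of field 2.
# Assumption: field 2 looks like "<word> <version>[-suffix]".
awk -F',' '
NR==1 { print $0 ",simple_version"; next }   # header: append the new column name
{
    v = $2                   # work on a copy so the original field is left as-is
    sub(/^[^ ]* /, "", v)    # drop the leading word and the space after it
    sub(/^v/, "", v)         # strip a leading "v"
    sub(/-.*/, "", v)        # drop everything from the first "-" onward
    print $0 "," v
}
' input.csv > output.csv

cat output.csv
```

With the sample data this writes the same output.csv as shown above. The trade-off is that the field separator stays a plain comma, so a hyphen or space elsewhere in a row does not shift the field numbering.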
NOTES: