
Can awk deal with a CSV file that contains a comma inside a quoted field?


I am using awk to compute the sum of one column in a CSV file. The data format is something like this:

id, name, value
1, foo, 17
2, bar, 76
3, "I am the, question", 99

I was using this awk script to compute the sum:

awk -F, '{sum+=$3} END {print sum}'

Some of the values in the name field contain a comma, and this breaks my awk script. My question is: can awk solve this problem? If yes, how can I do that?

Thank you.


Solution

  • You can write a function in awk like the one below:

    $ awk 'function isnum(x) { return (x == x + 0) } BEGIN { print isnum("hello"), isnum("-42") }'
    0 1
    

    You can incorporate this function into your script and check whether the third field is numeric. If it is not, try the fourth field; if that is not numeric either, try the fifth, and so on until you reach a numeric value, then add that value to the sum. A loop over the fields will probably help here.
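The loop described above could be sketched roughly as follows. This is only a minimal sketch under a few assumptions that are not in the original answer: the header row is skipped with `NR > 1`, the sample rows from the question are fed in through `printf`, and the first numeric field at position 3 or later is taken to be the value. It would pick the wrong field if a quoted name contained a purely numeric piece between commas (for example "a, 5, b").

```shell
# Feed the sample rows from the question to awk and capture the result.
sum=$(printf '%s\n' \
    'id, name, value' \
    '1, foo, 17' \
    '2, bar, 76' \
    '3, "I am the, question", 99' |
awk -F, '
# True when x compares equal to its own numeric value.
function isnum(x) { return x == x + 0 }

# Skip the header row (assumption: the first line is always a header).
NR > 1 {
    # Start at field 3 and stop at the first field that looks numeric.
    for (i = 3; i <= NF; i++) {
        if (isnum($i)) {
            sum += $i
            break
        }
    }
}

END { print sum }
')

echo "$sum"
```

For the sample data this prints 192 (17 + 76 + 99): on the third data row the loop skips the two halves of the quoted name and lands on the fifth field.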