bashcsvawkdata-filtering

How to remove rows from a CSV with no data using AWK


I am working with a large csv in a linux shell that I narrowed down to 3 columns:
Species name, Latitude, and Longitude.

awk -F "\t" '{print $10,","$22,",",$23}' occurance.csv > three_col.csv


The file ends up looking like this:

      species         | Lat     | Long   |
----------------------|---------|---------
Leucoraja erinacea    | 41.0748 | 72.9461|
Brevoortia tyrannus   | 39.0748 | 70.9461|
Paralichthys dentatus |         | 73.2354|
Paralichthys dentatus |         |        |
Leucoraja erinacea    | 41.0748 |        |
Brevoortia tyrannus   |         |        |
Brevoortia tyrannus   |         |        |
Paralichthys dentatus | 39.0748 | 70.9461|
Brevoortia tyrannus   | 39.0748 | 70.9461|

However this is what I want it to Look: Notice all species with no lat or long data have been removed

      species         | Lat     | Long   |
----------------------|---------|---------
Leucoraja erinacea    | 41.0748 | 72.9461|
Brevoortia tyrannus   | 39.0748 | 70.9461|
Paralichthys dentatus | 39.0748 | 70.9461|
Brevoortia tyrannus   | 39.0748 | 70.9461|

I've been trying to remove rows that are lacking either Lat or Long data. Using a line like this:

awk -F "\t" BEGIN '{print $1,$2,$3}' END '{$2!=" " && $3!= " " }' three_col.csv > del_blanks.csv

but it results in this error even with small changes that I make trying to solve the problem

awk: line 1: syntax error at or near end of line

How can I get rid of these rows with missing data, is this something I need a "for" loop for?


Solution

  • Since I don't know what your occurance.csv file looks like, this is a shot in the dark:

    awk -F "\t" '$22 && $23 {print $10,","$22,",",$23}' occurance.csv > three_col.csv
    

    The expression $22 && $23 says: Both field 22 and field 23 must not be blank. It is a condition to filter out those lines which don't qualify. It is a shorthand for $22 != "" && $3 != "".