bash shell csv

Create CSV from specific columns in another CSV using shell scripting


I have a CSV file with several thousand lines, and I need to take some of the columns in that file to create another CSV file to use for import to a database.

My shell scripting is rusty. Can anyone point me in the right direction?

I have a bash script that reads the source file, but when I try to print the columns I want to a new file, it doesn't work.

while IFS=, read symbol tr_ven tr_date sec_type sec_name name
do
    echo "$name,$name,$symbol" >> output.csv
done < test.csv

Above is the code I have. Out of the 6 columns in the original file, I want to build a CSV with "column6, column6, column1".

The test CSV file is like this:

Symbol,Trading Venue,Trading Date,Security Type,Security Name,Company Name
AAAIF,Grey Market,22/01/2015,Fund,,Alternative Investment Trust
AAALF,Grey Market,22/01/2015,Ordinary Shares,,Aareal Bank AG
AAARF,Grey Market,22/01/2015,Ordinary Shares,,Aluar Aluminio Argentino S.A.I.C.

What am I doing wrong with my script? Or, is there an easier - and faster - way of doing this?

Edit

These are the real headers:

Symbol,US Trading Venue,Trading Date,OTC Tier,Caveat Emptor,Security Type,Security Class,Security Name,REG_SHO,Rule_3210,Country of Domicile,Company Name

I'm trying to get the last column, which is number 12, but it always comes up empty.


Solution

  • The snippet looks fine and works for me. Maybe the file contains some odd characters or comes from a DOS environment (use dos2unix to "clean" it!). With DOS line endings, the last field of every line carries a trailing carriage return, which can garble the output. Also, use read -r so that backslashes in the input are not treated as escape characters.
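
    As a rough sketch of both suggestions, the loop from the question could be wrapped in a function that uses read -r and strips a trailing carriage return from the last field. The function name extract_columns is made up for illustration, and the sketch assumes the six-column layout of test.csv with no quoted fields containing commas.

    ```shell
    # Hypothetical helper: print column 6, column 6, column 1 of a
    # comma-separated file, tolerating DOS (CRLF) line endings.
    # The body runs in a subshell, so the variables stay local to it.
    extract_columns() (
        cr=$(printf '\r')   # a literal carriage return
        while IFS=, read -r symbol tr_ven tr_date sec_type sec_name name; do
            name=${name%"$cr"}   # drop a trailing CR, if there is one
            printf '%s,%s,%s\n' "$name" "$name" "$symbol"
        done < "$1"
    )
    ```

    It would be called as extract_columns test.csv > output.csv. Note that read puts everything left on the line into the last variable, so for the 12-column file eleven variable names are needed before name.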

    But let's see how awk can solve this even faster:

    awk 'BEGIN{FS=OFS=","} {print $6,$6,$1}' test.csv >> output.csv
    

    Explanation