linux unix csv split

How do I split a CSV file into pieces with a specified number of rows?


I have a CSV file (around 10,000 rows, each with 300 columns) stored on a Linux server. I want to break it into 500 CSV files of 20 records each, with every output file carrying the same header row as the original CSV.

Is there a Linux command that can do this?


Solution

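  • Made it into a function. You can now call splitCsv <Filename> [chunkSize], where chunkSize is the number of data rows per output file (default 1000). For 20 records per file, pass 20 as the second argument.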

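    splitCsv() {
        # Usage: splitCsv <Filename> [chunkSize]
        local file=$1
        local chunk=${2:-1000}    # data rows per output file
        local header
        header=$(head -n 1 "$file")
        # Split everything after the header row into pieces of $chunk lines
        tail -n +2 "$file" | split -l "$chunk" - "${file}_split_"
        # Prepend the original header to each piece (in-place edit, GNU sed)
        for i in "${file}_split_"*; do
            sed -i -e "1i$header" "$i"
        done
    }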
    

    Found on: http://edmondscommerce.github.io/linux/linux-split-file-eg-csv-and-keep-header-row.html
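The same idea (remember the header, start a new file every N data rows, write the header first) can also be sketched as a single awk pass. This is only an alternative sketch, not part of the linked answer: the function name splitCsvAwk and the _part_NNN.csv naming are made up for illustration. Like the split version, it assumes no quoted field contains an embedded newline.

```shell
# Hypothetical helper (not from the original answer): one-pass split with awk.
# Usage: splitCsvAwk <Filename> [chunkSize]
splitCsvAwk() {
    local file=$1
    local chunk=${2:-1000}    # data rows per output file
    awk -v chunk="$chunk" -v prefix="${file%.csv}_part_" '
        NR == 1 { header = $0; next }        # remember the header row
        (NR - 2) % chunk == 0 {              # first data row of a new piece
            if (out != "") close(out)        # keep only one output file open
            out = sprintf("%s%03d.csv", prefix, ++n)
            print header > out               # every piece starts with the header
        }
        { print > out }                      # copy the current data row
    ' "$file"
}
```

For the file in the question, splitCsvAwk data.csv 20 (with data.csv standing in for the real filename) should write data_part_001.csv, data_part_002.csv, and so on, each holding the header plus up to 20 records. The output names end in .csv, which the split-based function does not do by default.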