powershellcsvcut

Get 3rd column and put it in another file for all CSV files with Powershell


I have a folder, with subfolders full of csv files. For each file I would like to get only the third column and save it in another file.

Example:

filedata1.csv

aa,bb,cc
cat,horse,"dog, bird"
4th,33,first and second

filedata1column3.csv

cc
dog, bird
first and second

what I tried but it turns out wrong:

param ( [String] $Filename  = '*.csv',
            [String] $SourceDir = 'C:\Data\',
    )
    
ForEach ($csvFile in Get-ChildItem -Path $SourceDir -Filter $Filename){
            (Import-Csv $csvFile)
            cut -d, -f 3 *.csv > *column3.csv

Solution

  • # ... parameter declaration and foreach loop omitted.
    # The following is what needs to go *inside the foreach loop*.
    
    # Import all rows.
    $rows = Import-Csv $csvFile
    # Determine the name of the third column
    $thirdColName = ($rows[0].psobject.Properties.Name)[2]
    
    # Determine a suitable output file path and name.
    # Here, the output file is placed in the same directory as the input file, 
    # with '_colummn3' appended to the latter's base name.
    $outFile =
      Join-Path $csvFile.DirectoryName ('{0}_column3.csv' -f $csvFile.BaseName)
    
    # Output the header column
    $thirdColName > $outFile
    # Append the values of the third column
    $rows.$thirdColName >> $outFile
    

    Note:


    To put it all together in a solution that:

    Place the code in a .ps1 file and invoke the latter (e.g. ./Script.ps1), optionally with arguments that override the defaults.

    param(
      $FileName = '*.csv', # File-name filter
      $SourceDir = $PWD,   # Input-files dir., default to current
      $OutDir = $PWD,      # Output-files dir., default to current
      $ColumnIndex = 3,    # 1-based index of the col. of interest
      # Suffix to append to the input file name's base name 
      # to form the output file name.
      $OutFileBaseNameSuffix = "_column$ColumnIndex"
    )
    
    foreach ($csvFile in Get-ChildItem -LiteralPath $SourceDir -Filter $Filename) {
    
      # If the output files are written to the same dir. as the input files,
      # exclude any output files from a previous run.
      if ($SourceDir -eq $OutDir -and $csvFile.BaseName -like "*$OutFileBaseNameSuffix") { continue }
    
      # Import all rows for the file at hand.
      $rows = Import-Csv $csvFile
      # Determine the name of the column with the index of interest.
      # Subtract -1 from $ColumnIndex, because array indices are *0*-based.
      $colName = ($rows[0].psobject.Properties.Name)[$ColumnIndex-1]
    
      # Determine the output file path and name.
      $outFile =
        Join-Path $OutDir ('{0}{1}.csv' -f $csvFile.BaseName, $OutFileBaseNameSuffix)
    
      # Write the entire file, as BOM-less UTF-8.
      $null =
        New-Item -Force $outFile -Value (
          $colName + "`n" + ($rows.$colName -join "`n") + "`n"
        )
    
    }