I have a folder, with subfolders full of csv files. For each file I would like to get only the third column and save it in another file.
Example:
filedata1.csv
aa,bb,cc
cat,horse,"dog, bird"
4th,33,first and second
filedata1column3.csv
cc
dog, bird
first and second
what I tried but it turns out wrong:
param ( [String] $Filename = '*.csv',
[String] $SourceDir = 'C:\Data\',
)
ForEach ($csvFile in Get-ChildItem -Path $SourceDir -Filter $Filename){
(Import-Csv $csvFile)
cut -d, -f 3 *.csv > *column3.csv
Use the intrinsic psobject
property to determine each file's column names from the objects that Import-Csv
parses the CSV data into, and extract the third name.
Then use member-access enumeration to extract all property values from the rows (objects), relying on PowerShell's ability to use expressions (variable references) as property names.
Use Join-Path
in conjunction with -f
, the format operator to determine a suitable output file path and name, based on the input file name.
# ... parameter declaration and foreach loop omitted.
# The following is what needs to go *inside the foreach loop*.
# Import all rows.
$rows = Import-Csv $csvFile
# Determine the name of the third column
$thirdColName = ($rows[0].psobject.Properties.Name)[2]
# Determine a suitable output file path and name.
# Here, the output file is placed in the same directory as the input file,
# with '_colummn3' appended to the latter's base name.
$outFile =
Join-Path $csvFile.DirectoryName ('{0}_column3.csv' -f $csvFile.BaseName)
# Output the header column
$thirdColName > $outFile
# Append the values of the third column
$rows.$thirdColName >> $outFile
Note:
In Windows PowerShell, >
and >>
create "Unicode" (UTF-16LE) files, whereas in PowerShell (Core) 7+ they create BOM-less UTF-8 files.
If you need a different encoding, use Set-Content
and Add-Content
with an -Encoding
argument, but note that in Windows PowerShell you cannot create BOM-less UTF-8 files that way.
If you do need BOM-less UTF-8 files in Windows PowerShell, use the workaround from this answer (use this instead of the >
and >>
statements, and use "`r`n"
instead of "`n"
if you need Windows-format CRLF newlines):
# Write the entire file as BOM-less UTF-8.
$null =
New-Item -Force $outFile -Value (
$thirdColName + "`n" + ($rows.$thirdColName -join "`n") + "`n"
)
To put it all together in a solution that:
Place the code in a .ps1
file and invoke the latter (e.g. ./Script.ps1
), optionally with arguments that override the defaults.
param(
$FileName = '*.csv', # File-name filter
$SourceDir = $PWD, # Input-files dir., default to current
$OutDir = $PWD, # Output-files dir., default to current
$ColumnIndex = 3, # 1-based index of the col. of interest
# Suffix to append to the input file name's base name
# to form the output file name.
$OutFileBaseNameSuffix = "_column$ColumnIndex"
)
foreach ($csvFile in Get-ChildItem -LiteralPath $SourceDir -Filter $Filename) {
# If the output files are written to the same dir. as the input files,
# exclude any output files from a previous run.
if ($SourceDir -eq $OutDir -and $csvFile.BaseName -like "*$OutFileBaseNameSuffix") { continue }
# Import all rows for the file at hand.
$rows = Import-Csv $csvFile
# Determine the name of the column with the index of interest.
# Subtract -1 from $ColumnIndex, because array indices are *0*-based.
$colName = ($rows[0].psobject.Properties.Name)[$ColumnIndex-1]
# Determine the output file path and name.
$outFile =
Join-Path $OutDir ('{0}{1}.csv' -f $csvFile.BaseName, $OutFileBaseNameSuffix)
# Write the entire file, as BOM-less UTF-8.
$null =
New-Item -Force $outFile -Value (
$colName + "`n" + ($rows.$colName -join "`n") + "`n"
)
}