I am trying to automate an administrative (local pc) excel task via PowerShell; to open an Excel file, read/load its content/data, then query the content/data like a database for various, specific "database-query" like-scenarios (e.g. where colA=X and colD=Y then count/or do something useful then print output to terminal screen).
I can open an Excel file on my pc, read the sheets (e.g. tab names) and their column names (A1:Z1).
My problem is trying to query specific column header/s, for specific data types and scenarios and print in plain-text
From my understanding, there are effectively two primary methods of opening a file (e.g. myExcel.xlsx) via PowerShell:
Using the modern Object Oriented approach of the COM object
$file_obj_handler = New-Object -ComObject Excel.Application
Using the old-school, less efficient approach of import-csv
$file_obj_handler = import-csv filepathname <extra parameters here>
Now, FYI I am not opposed to either approach and I have tried both. But my common issue is, whilst the relevant array is created, its content is encoded in some weird symbolic format (looks like Unicode/XML - rather than human plain text).
I checked the array size and type using the commands:
$file_obj_handler.Count #my array size was 4194
$file_obj_handler.GetType() #printed System.Object[]
I have tried using the in-line command parameter of (-encoding utf8) and such to no avail.
Is there a better way, ideally. to convert the array to plain-text when opening the object file handler via either method (1 or 2), rather than trying to circumvent/reformat the array after the excel file data has already been read/loaded?
I appreciate I may need to use a delimiter for the array elements (stage 2) but my primary ask (stage 1) is to first be able to read (string content in human format).
I do use the open file refs:
$workbook=$file_obj_handler.Workbooks.Open($excelFile)
But, the same problem occurs whichever way I loop through/print array contents - it's not printing in plain-text/human readable:
$workbook.Worksheets.Item(1)
$worksheets.Cells.Items(1,$i).Text
Find below how you can access a range in a excel sheet with Powershell
# Define path to Excel file
$excelFile = "d:\tmp\filename.xlsx"
# Create Excel COM object
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $false
$workbook = $excel.Workbooks.Open($excelFile)
$worksheet = $workbook.Worksheets.Item(1)
# Define range
$range = $worksheet.Range("A1:B9")
# Read range values into an array (this is a 2D array)
$rangeValues = $range.Value2
# Now you can loop through the array
for ($row = 1; $row -le $rangeValues.GetLength(0); $row++) {
for ($col = 1; $col -le $rangeValues.GetLength(1); $col++) {
Write-Host -NoNewline "$($rangeValues[$row, $col])`t"
}
Write-Host
}
$workbook.Close($false)
$excel.Quit()
Another approach could be to use ADO
# Define path to Excel file
$excelFile = "d:\tmp\filename.xlsx"
# Connection String for .xlsx (there is another for.xls )
$connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=$excelFile;Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1';"
# SQL-command to read data from Sheet1 and filter the column colName
$query = "SELECT * FROM [Sheet1$] WHERE [ColName]='Test'"
# ADO Connection
$connection = New-Object -ComObject ADODB.Connection
$recordset = New-Object -ComObject ADODB.Recordset
$connection.Open($connectionString)
$recordset.Open($query, $connection)
# Print Data
while (-not $recordset.EOF) {
$col1 = $recordset.Fields.Item(0).Value
$col2 = $recordset.Fields.Item(1).Value
Write-Host "$col1`t$col2"
$recordset.MoveNext()
}
# Closing
$recordset.Close()
$connection.Close()