excelpowershell

Open an Excel File, Read Specific Contents, Print the Output


I am trying to automate an administrative (local pc) excel task via PowerShell; to open an Excel file, read/load its content/data, then query the content/data like a database for various, specific "database-query" like-scenarios (e.g. where colA=X and colD=Y then count/or do something useful then print output to terminal screen).

I can open an Excel file on my pc, read the sheets (e.g. tab names) and their column names (A1:Z1).

My problem is trying to query specific column header/s, for specific data types and scenarios and print in plain-text

From my understanding, there are effectively two primary methods of opening a file (e.g. myExcel.xlsx) via PowerShell:

  1. Using the modern Object Oriented approach of the COM object

    $file_obj_handler = New-Object -ComObject Excel.Application

  2. Using the old-school, less efficient approach of import-csv

    $file_obj_handler = import-csv filepathname <extra parameters here>

Now, FYI I am not opposed to either approach and I have tried both. But my common issue is, whilst the relevant array is created, its content is encoded in some weird symbolic format (looks like Unicode/XML - rather than human plain text).

I checked the array size and type using the commands:

$file_obj_handler.Count #my array size was 4194
$file_obj_handler.GetType() #printed System.Object[]

I have tried using the in-line command parameter of (-encoding utf8) and such to no avail.

Is there a better way, ideally. to convert the array to plain-text when opening the object file handler via either method (1 or 2), rather than trying to circumvent/reformat the array after the excel file data has already been read/loaded?

I appreciate I may need to use a delimiter for the array elements (stage 2) but my primary ask (stage 1) is to first be able to read (string content in human format).

I do use the open file refs:

$workbook=$file_obj_handler.Workbooks.Open($excelFile)

But, the same problem occurs whichever way I loop through/print array contents - it's not printing in plain-text/human readable:

$workbook.Worksheets.Item(1)
$worksheets.Cells.Items(1,$i).Text

Solution

  • Find below how you can access a range in a excel sheet with Powershell

    # Define path to Excel file
    $excelFile = "d:\tmp\filename.xlsx"
    
    # Create Excel COM object
    $excel = New-Object -ComObject Excel.Application
    $excel.Visible = $false
    $excel.DisplayAlerts = $false
    
        $workbook = $excel.Workbooks.Open($excelFile)
        $worksheet = $workbook.Worksheets.Item(1)
    
        # Define range
        $range = $worksheet.Range("A1:B9")
    
        # Read range values into an array (this is a 2D array)
        $rangeValues = $range.Value2
    
        # Now you can loop through the array
        for ($row = 1; $row -le $rangeValues.GetLength(0); $row++) {
            for ($col = 1; $col -le $rangeValues.GetLength(1); $col++) {
                Write-Host -NoNewline "$($rangeValues[$row, $col])`t"
            }
            Write-Host
        }
    
        $workbook.Close($false)
    
    $excel.Quit()
    

    Another approach could be to use ADO

    # Define path to Excel file
    $excelFile = "d:\tmp\filename.xlsx"
    
    # Connection String for .xlsx (there is another for.xls )
    $connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=$excelFile;Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1';"
    
    # SQL-command to read data from Sheet1 and filter the column colName
    $query = "SELECT * FROM [Sheet1$] WHERE [ColName]='Test'"
    
    # ADO Connection
    $connection = New-Object -ComObject ADODB.Connection
    $recordset = New-Object -ComObject ADODB.Recordset
    
    $connection.Open($connectionString)
    $recordset.Open($query, $connection)
    
    # Print Data
    while (-not $recordset.EOF) {
        $col1 = $recordset.Fields.Item(0).Value
        $col2 = $recordset.Fields.Item(1).Value
        Write-Host "$col1`t$col2"
        $recordset.MoveNext()
    }
    
    # Closing
    $recordset.Close()
    $connection.Close()