rubyrubyxl

Parsing a specific section of an Excel Spreadsheet with RubyXL


I'm trying to get a table, for example A1:A9 to G1:G9.

My problem is, there are other elements in the sheet that I don't want to iterate through.

workbook = RubyXL::Parser.parse("excelFIle.xlsm")
worksheet = workbook[0]
data = worksheet.extract_data
# => [ [ x, x, x, x, x, x, x, x, x, y, y, y, y, y, y, y, y, y, y ],
#      [ x, x, x, x, x, x, x, x, x, y, y, y, y, y, y, y, y, y, y ],
#      ...
#    ]

Is there a way to only parse the "x" portion A1:A9 to G1:G9, or do I need to cut it manually from data?


Solution

  • The most straightforward way to do what you're asking for is to just slice each row:

    data[0..8].map {|row| row[0..6] }
    

    If you want to calculate the row and column ranges dynamically based on cell references (A1 et al), you have to do a little more legwork (this is untested, but you get the idea):

    top_left_ref = 'A1'
    bottom_right_ref = 'G9'
    
    # Convert the cell references to row and column indexes
    row_start, col_start = RubyXL::Reference.ref2ind(top_left_ref)     # => [ 0, 0 ]
    row_end,   col_end   = RubyXL::Reference.ref2ind(bottom_right_ref) # => [ 8, 6 ]
    
    row_range = row_start..row_end # => 0..8
    col_range = col_start..col_end # => 0..6
    
    puts data[row_range].map {|row| row[col_range] }
    

    You could, of course, turn those last three lines into a one-liner.

    Update:

    Looking more closely at the RubyXL::Reference documentation, it looks like we can pass the cell references directly to Reference#initialize, which basically does exactly what the first half-dozen lines above do:

    ref = RubyXL::Reference.new('A1:G9')
    puts data[ref.row_range].map {|row| row[ref.col_range] }
    

    So that's neat.