I am facing some problems reading excel files created with the Axlsx module via RubyXL. The specific code is this
require 'axlsx'
require 'RubyXL'
#Create the Excel doc with Axlsx with two sheets named My Sheet1 and My Sheet 2
Axlsx::Package.new do |p|
p.workbook.add_worksheet(:name => "My Sheet1") do |sheet|
sheet.add_row ["Simple Pie Chart"]
%w(first second third).each { |label| sheet.add_row [label, rand(24)+1] }
end
p.workbook.add_worksheet(:name => "My Sheet 2") do |sheet|
sheet.add_row ["Simple Pie Chart"]
%w(first second third).each { |label| sheet.add_row [label, rand(24)+1] }
end
p.serialize('simple.xlsx')
end
#Parse the above created Excel via RubyXL and print the names of the two sheets. This comes up blank
workbook = RubyXL::Parser.parse("simple.xlsx")
workbook.worksheets.each do |worksheet|
puts "Worksheet is #{worksheet.sheet_name}"
end
The problem is that the worksheet names come up blank. If I after having created with axlsx open up the created Excel with MS Excel and Save it, then the RubyXL parser is able to read the sheet names correctly. It seems like MS Excel is fixing the document somehow.
Am I missing something when creating the Excel via Axlsx?
I am using axlsx version 1.3.6 and rubyXL 1.2.10.
I also tried adding p.use_shared_strings = true before serializing with axlsx.
So instead of this in parser.rb of rubyXL.
#sheet_names, dimensions
def Parser.create_matrix(wb,i, files)
sheet_names = files['app'].css('TitlesOfParts vt|vector vt|lpstr').children
sheet = Worksheet.new(wb,sheet_names[i].to_s,[])
I replaced it with
#sheet_names, dimensions
def Parser.create_matrix(wb,i, files)
sheet_names = []
files['workbook'].css('sheet').each do |sheet|
sheet_names.push sheet.attribute('name').to_s
end
sheet = Worksheet.new(wb,sheet_names[i].to_s,[])
Is that the right fix?
If you look at the XML files of a axlsx
document versus the XML files of a standard Excel workbook you'll see that they a bit different. I think RubyXL
is not parsing the workbook properly since the workbook is missing some nodes/attributes that a normal Excel file would have:
workbook.xml (normal Excel version)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<fileVersion appName="xl" lastEdited="5" lowestEdited="5" rupBuild="9303"/>
<workbookPr defaultThemeVersion="124226"/>
<bookViews>
<workbookView xWindow="870" yWindow="840" windowWidth="22455" windowHeight="9630"/>
</bookViews>
<sheets>
<sheet name="My Sheet1" sheetId="1" r:id="rId1"/>
<sheet name="My Sheet 2" sheetId="2" r:id="rId2"/>
</sheets>
<calcPr calcId="0"/>
</workbook>
workbook.xml (axlsx version)
<?xml version="1.0" encoding="UTF-8"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<workbookPr date1904="false"/>
<sheets>
<sheet name="My Sheet1" sheetId="1" r:id="rId1"/>
<sheet name="My Sheet 2" sheetId="2" r:id="rId2"/>
</sheets>
</workbook>
If you parse the document with the Ruby standard library win32ole
, it gives the correct worksheet names, so this must be an issue with the way RubyXL is parsing the document.
require 'win32ole'
ex = WIN32OLE.new("Excel.Application")
wb = ex.workbooks.open("simple.xlsx")
wb.worksheets.each do |ws|
puts ws.name
end
Output:
My Sheet1
My Sheet 2
=> nil