I am looking for a method to copy Excel data from an other file. I have a 'file_1.xlsx', and I would like to create a 'file_1_copy.xlsx' that does not exist. The copy has to be exactly the same that the original file.
Is there some method in Python to copy an Excel file completely?
I tried to write code that does this manually:
import openpyxl
from openpyxl.styles import Font
wb = openpyxl.load_workbook('test excel.xlsx')
ws = wb.active
new_wb = openpyxl.Workbook()
new_ws = new_wb.active
for row in ws.iter_rows():
for cell in row:
new_cell = new_ws.cell(row=cell.row, column=cell.col_idx, value=cell.value)
if cell.has_style:
new_cell.font = Font(name=cell.font.name, size=cell.font.size, bold=cell.font.bold, italic=cell.font.italic, color=cell.font.color)
new_wb.save('test copie excel.xlsx')
In the image above, on the left is the original file. On the right, the copy.
The issue is that the text in the first cell of the copy is supposed to be in 2 colors: red and blue, and highlighted in yellow. In the copy, I got only one color, all the text became red, and is not highlighted anymore.
Excel cells normally have only one font. If there is text in rich text format in Excel cells, then the cell content itself is rich text cell content. But rich text cell content is not read while load workbook file per default for performance reasons. So one must set rich_text=True
in load_workbook('test excel.xlsx', rich_text=True)
if needed.
The color is the fill color of the cell. To keep this, the fill must also be copied.
Btw: There is copy
from copy
which can be used to copy whole cell.font
and cell.fill
to avoid the need to copy each single font- and fill-property.
And the column widths are worksheet dimensions settings, which needs to be set for the new_ws
too if they shall be the same.
from openpyxl import load_workbook, Workbook
from copy import copy
wb = load_workbook('test excel.xlsx', rich_text=True)
ws = wb.active
new_wb = Workbook()
new_ws = new_wb.active
for row in ws.iter_rows():
for cell in row:
new_cell = new_ws.cell(row=cell.row, column=cell.col_idx, value=cell.value)
new_cell.font = copy(cell.font)
new_cell.fill = copy(cell.fill)
for column_name in ws.column_dimensions:
new_ws.column_dimensions[column_name].width = ws.column_dimensions[column_name].width
new_wb.save('test copie excel.xlsx')
That is for cell- and worksheet-properties you have now. if the need is copying all possible cell- and worksheet-properties, then the code gets very long and complex.
Above code works using openpyxl version 3.1.2. Version 3.1.3 seems to have bugs while copying rich text from shared-strings into inline-strings when text runs contain spaces only. Then the XML for the text should be <t xml:space="preserve"> </t>
but is <t> </t>
using Version 3.1.3. That leads to an error while opening the file in Excel.
But would not using another approach more convenient?
Open the test excel.xlsx
using wb = openpyxl.load_workbook('test excel.xlsx', rich_text=True)
. Then do the needed changing in wb
. After all use wb.save('test copie excel.xlsx')
to save the file under a new file name. No need for new_wb
and new_ws
at all.