pythonexcelautomationcopy-pastexlwings

How do I copy-paste a worksheet from one excel workbook to another using python?


In Excel, I can manually:

and I have an identical copy of the sheet from one book to the other, down to formatting, colour, borders, formulae, dimensions, the works.

I need to automate this process over thousands of target workbooks.

I've tried using openpyxl, but all the solutions I've found explicitly copy individual cells with just a few properties instead of the whole sheet. For example, column widths and row heights are missed by most implementations. Also, Excel features like collapse rows...

Are there python libraries, (or libraries for other languages really) that make this possible?


Solution

  • This code sample uses Xlwings module to basically duplicate the steps you would perform in Excel to copy/paste sheets to a new workbook as you have detailed in your Post.
    Xlwings as it notes, is like Excel VBA in Python. Given that it does open the Excel application to perform the steps Xlwings may be perceived as slow for this reason. However in comparison to other methods is possibly ultimately faster.
    Using Openpyxl would require a lot of cell by cell copying of values and style data. Its speed would depend on the amount of data in the Sheets, the more data and more style settings the more cells and setting for the cell to copy (Note the Openpyxl api 'copy worksheet' works only within the same workbook).

    The original workbook, 'source_wb', retains the original sheet(s) as that workbook is not changed (or saved) in this code sample.
    You can of course remove the sheets from the source workbook as they are copied, and save the source workbook at the end if that is required. Or just delete the source workbook if it's no longer needed once all required sheets are copied.

    All Workbooks & Sheets copy.
    All Workbooks and their Sheets in the source workbook are copied to the 'new' workbook.
    The 'new' workbook is created if it doesn't already exist otherwise the Sheets are just added in relation to the position of the default sheet or a selected sheet.
    Sheets can be copied to 'before' or 'after' an existing sheet. In the code sample a single default sheet is used and all sheets copied before this sheet so each new sheet should be in the same order as they are read from the source workbook. A new created workbook will have a single default sheet named 'Sheet1' by default (I rename it so it's less likely to cause the renaming of a copied sheet, see note below). You can add a new Sheet just for this positioning if the source workbook already exists with sheets you don't want moved.
    A criteria check could also be included so only those Sheets that match are copied from the source if so desired.

    Note Excel will rename sheets with a number if a copied sheet has the same name as an existing sheet, e.g. copying 'Sheet1' when it already exists, would cause it to be named 'Sheet1 (2)'. So if you want these to be named differently you'll need to check the names as Sheets are copied.

    import os
    import xlwings as xw
    from os.path import exists
    
    wb_target = 'NewWorkBook.xlsx'
    excel_dir = '<path to>/<ExcelFiles>/'
    
    with xw.App(visible=False) as app:
        ### Open target workbook or create it if not already exists
        if not exists(wb_target):
            wb_target = xw.Book()
            ws_target_default = wb_target.sheets.active
            # A name that should not clash with sheets in the source workbooks
            ws_target_default.name = "tobedeleted"
        else:
            wb_target = xw.Book(wb_target)
    
            ### Select active Sheet as default 
            ws_target_default = wb_target.sheets.active
            ### Or use an existing specified Sheet e.g. 'Existing1'
            # ws_target_default = wb_target.sheets('Existing1')
            ### Or create a new Sheet (called 'tobedeleted') as the last sheet in the workbook
            # ws_target_default = wb_target.sheets.add('tobedeleted', after=wb_target.sheet_names[len(wb_target.sheet_names)-1])
    
        for excel_file in os.listdir(excel_dir):
            if excel_file.endswith(".xlsx"):
                excel_path = os.path.join(excel_dir, excel_file)
    
                ### Open source Workbook
                wb_source = xw.Book(excel_path)
    
                for sheet in wb_source.sheets:
                    ### Select and move each Sheet in source workbook to new workbook
                    ws_source = wb_source.sheets[sheet]
                    # use 'before' to maintain sheet order
                    ws_source.copy(before=ws_target_default)  
    
                wb_source.close()  # Close the source workbook after Sheets copied
    
        ### Delete the "tobedeleted" default Sheet if its created
        if ws_target_default.name == "tobedeleted":
            ws_target_default.delete()  # Remove the position Sheet if exists
    
        ### Save the new workbook
        wb_target.save('NewWorkBook.xlsx')