pythonexcelopenpyxl

Update formulas in excel using Python


I am trying to update formulas in an existing excel which has data and tables but I am failing to update the formulas so that they would update the data for example:

Okay since I get some answers but not exactly what I'm trying to achieve I will try to get more context here:

I am trying to make a unique list from a table like so:

=UNIQUE(Table1[Names])

And then I would use that data for a validation list. Meaning that I would create one DV for the names in unique formula. Let's say I place the formula in C1. A and B columns are Names and Age:

dropdown_range = "'Data'!C1#"
dv = DataValidation(type="list", formula1=dropdown_range, showDropDown=False)
target_sheet.add_data_validation(dv)
dv.add(target_sheet['D1'])

Then I want to use the dropdown (DV) selection in another formula which would check the selection:

=VSTACK("", UNIQUE(FILTER(Table1[Age], Table1[Names]=D1)))

The logic would be this but the formulas are much more complex.

The full code:

from openpyxl import Workbook
from openpyxl.worksheet.datavalidation import DataValidation
from openpyxl.worksheet.table import Table, TableStyleInfo

wb = Workbook()
ws = wb.active

data = [
    ["Names", "Age"],
    ["Alice", 30],
    ["Bob", 25],
    ["Charlie", 35],
    ["Alice", 30],
    ["David", 40],
    ["Bob", 25],
]

for row in data:
    ws.append(row)

table_range = f"A1:B{len(data)}"
table = Table(displayName="Table1", ref=table_range)

style = TableStyleInfo(name="TableStyleMedium9", showFirstColumn=True, showRowStripes=True, showColumnStripes=True)
table.tableStyleInfo = style

ws.add_table(table)

ws["C1"] = "=UNIQUE(Table1[Names])"

dropdown_range = "'Sheet'!C1#"
dv = DataValidation(type="list", formula1=dropdown_range, showDropDown=True)
ws.add_data_validation(dv)

dv.add(ws["D1"])

ws["E1"] = "=VSTACK(\"\", UNIQUE(FILTER(Table1[Age], Table1[Names]=D1)))"

file_path = 'example_with_dropdown.xlsx'
wb.save(file_path)

Is there a way to get this done with some other module rather than openpyxl? After openpyxl does the majority of the work with the excel?

OKAY.. Since this took too much time I just created a mapping of all possible dependent values for each selection.


Solution

  • To add the UNIQUE function you need add it as a Array Formula.
    You could enter the formula to a single cell but if the unique values are more than one cell it will only show the first unique value. An Array allows every value to be assigned a cell and the range should be no bigger than the 'Names' range of the Table of course.

    I have written the following code to create a workbook adding the UNIQUE formula as an example. The workbook adds its own Sheets, test data and is saved with the name 'Example.xlsx'

    The code will insert an Excel Table, 'Table1' using your Header name and then applies the UNIQUE formula as you have written it against the Table in Column E under the Header 'Unique'.

    I have also included the DV you are attempting to add. I have moved it to Column F, Cell F2 in this example since the table overwrites 'B3'

    If the formula/function is unknown as seems to be the case with 'UNIQUE', prefix with _xlfn, see formula variable in code.

    import openpyxl
    from openpyxl.styles import Alignment
    from openpyxl.worksheet.datavalidation import DataValidation
    from openpyxl.worksheet.formula import ArrayFormula
    from openpyxl.worksheet.table import Table
    
    # Workbook and Worksheets
    wb = openpyxl.Workbook()
    target_sheet = wb.active
    wb.create_sheet('Data')
    ws = wb.worksheets[1]
    
    # Add some data to the Sheets
    # 'Data' Sheet
    for a in range(1, 11):
        ws[f"A{a}"].value = a
    
    # Table data on target_sheet, ('Sheet') 
    rows = [
        ('Names', 'ColumnA', 'ColumnB'),
        ('Tom', 0.51, 'Value1'),
        ('Fred', 0.26, 'Value2'),
        ('Tom', 0.07, 'Value1'),
        ('Vicky', 0.07, 'Value3'),
    ]
    for r in rows:
        target_sheet.append(r)
    
    # Create Table
    tbl = Table(displayName="Table1", ref="A1:C5")
    target_sheet.add_table(tbl)
    
    # Add the Array Formula
    formula = "=_xlfn.UNIQUE(Table1[Names])"
    target_sheet["E2"].value = ArrayFormula('E2:E5', formula)
    
    # Data Validation
    dropdown_range = "'Data'!A1:A10"
    dv = DataValidation(type="list", formula1=dropdown_range, showDropDown=False)
    
    target_sheet.add_data_validation(dv)
    dv.add(target_sheet['F2'])  # Add DV to cell 'F2'
    
    # Some cell formatting
    for x in ['A', 'B', 'C']:
        target_sheet.column_dimensions[x].width = 12
    
    target_sheet['E1'].value = 'Unique'
    target_sheet['E1'].alignment = Alignment(horizontal='center')
    
    target_sheet['F1'].value = 'DV'
    target_sheet['F1'].alignment = Alignment(horizontal='center')
    
    # Save workbook
    wb.save('Example.xlsx')
    

    Example target_sheet
    Showing the formula, this is the same for each cell E2 - E5
    Example Sheet Unique Formula

    Showing the DV dropdown. Data shown is from the 2nd Sheet 'Data' cells A1 - A10
    Example Sheet DV

    EDIT
    The stated requirements in your updated Post is covered in the below examples.

    Example 2
    Below I have updated the example;
    As noted in my comment on your Post, Openpyxl does not evaluate formulas so the UNIQUE formula will not be filled until the workbook is opened in Excel.

    The Array formula range is likely to be bigger than it needs to be and thus contains '#N/A' values for the unused cells, which then get inserted into the DV since it also has to cover the same range as the Array given you cannot know at the time what cells contain #N/A .

    Example code 2

    import openpyxl
    from openpyxl.styles import Alignment, Font
    from openpyxl.worksheet.datavalidation import DataValidation
    from openpyxl.worksheet.formula import ArrayFormula
    from openpyxl.worksheet.table import Table, TableStyleInfo
    
    wb = openpyxl.Workbook()
    ws = wb.active
    
    data = [
        ["Names", "Age"],
        ["Alice", 30],
        ["Bob", 25],
        ["Charlie", 35],
        ["Alice", 30],
        ["David", 40],
        ["Bob", 25],
    ]
    
    for row in data:
        ws.append(row)
    
    table_range = f"A1:B{len(data)}"
    table = Table(displayName="Table1", ref=table_range)
    
    style = TableStyleInfo(name="TableStyleMedium9", showFirstColumn=True, showRowStripes=True, showColumnStripes=True)
    table.tableStyleInfo = style
    
    ws.add_table(table)
    
    # Add the Array Formula
    formula = "=_xlfn.UNIQUE(Table1[Names])"
    ws["E2"].value = ArrayFormula('E2:E7', formula)
    
    dropdown_range = "'Sheet'!E2:E7"
    dv = DataValidation(type="list", formula1=dropdown_range, showDropDown=False)
    
    ws.add_data_validation(dv)
    dv.add(ws['F2'])
    
    # Some cell formatting
    for x in ['A', 'B']:
        ws.column_dimensions[x].width = 12
    
    ws['E1'].value = 'Unique'
    ws['E1'].font = Font(bold=True)
    ws['E1'].alignment = Alignment(horizontal='center')
    
    ws['F1'].value = 'DV'
    ws['F1'].font = Font(bold=True)
    ws['F1'].alignment = Alignment(horizontal='center')
    
    file_path = 'example_with_dropdown.xlsx'
    wb.save(file_path)
    

    Updated example Sheet 2
    New Example Sheet

    Example 3
    If you want to go the extra mile and use another module to clean up your DV a bit then here is the example using Xlwings to check the unique data range and setting the DV to the range of actual names only.
    Since I am using Xlwings to open the workbook in Excel (to update the Unique List I have used the same to insert the DV.

    However, if you still wanted to use Openpyxl to add the DV/other detail you would need to open the workbook in Excel and save it then re-open (load_workbook) the saved workbook using Openpyxl.
    It should not be necessary to re-open with data_only=True as only the first cell 'E2' should have the formula. The rest of the cells, in this case 'E3:E7' should have the name or '#N/A' as it's value.

    You can then get the max used row in the Unique list as done in this code and then use Openpyxl to update the DV entry.

    Example code 3

    import openpyxl
    import xlwings as xw
    from openpyxl.styles import Alignment, Font
    from openpyxl.worksheet.formula import ArrayFormula
    from openpyxl.worksheet.table import Table, TableStyleInfo
    from xlwings import constants
    
    
    file_path = 'example_with_dropdown.xlsx'
    
    wb = openpyxl.Workbook()
    ws = wb.active
    
    data = [
        ["Names", "Age"],
        ["Alice", 30],
        ["Bob", 25],
        ["Charlie", 35],
        ["Alice", 30],
        ["David", 40],
        ["Bob", 25],
    ]
    
    for row in data:
        ws.append(row)
    
    table_range = f"A1:B{len(data)}"
    table = Table(displayName="Table1", ref=table_range)
    
    style = TableStyleInfo(name="TableStyleMedium9", showFirstColumn=True, showRowStripes=True, showColumnStripes=True)
    table.tableStyleInfo = style
    
    ws.add_table(table)
    
    # Add the Array Formula
    formula = "=_xlfn.UNIQUE(Table1[Names])"
    formula_rng = 'E2:E7'
    ws["E2"].value = ArrayFormula(formula_rng, formula)
    
    # Some cell formatting
    for x in ['A', 'B']:
        ws.column_dimensions[x].width = 12
    
    ws['E1'].value = 'Unique'
    ws['E1'].font = Font(bold=True)
    ws['E1'].alignment = Alignment(horizontal='center')
    
    ws['F1'].value = 'DV'
    ws['F1'].font = Font(bold=True)
    ws['F1'].alignment = Alignment(horizontal='center')
    
    wb.save(file_path)
    
    # Open workbook in Xlwings to Update UNIQUE formula cells to check which cells have actual names
    with xw.App(visible=False) as app:
        wb2 = xw.Book(file_path)
        ws2 = wb2.sheets['Sheet']
    
        # Get the last row in the Unique list with a valid name.
        max_used_row = max([x.row for x in ws2.range(formula_rng) if x.value is not None])
    
        # Set the range of the DV to cover start to last row determined above
        dv_range = f'=E2:E{max_used_row}'
    
        # Add the DV to the Sheet
        ws2.range('F2').api.Validation.Add(
            Type=constants.DVType.xlValidateList,
            AlertStyle=constants.DVAlertStyle.xlValidAlertStop,
            Operator=constants.FormatConditionOperator.xlBetween,
            # Formula1="=$E$2:$E$5"
            Formula1 = dv_range
        )
        ws2.range('F2').IgnoreBlank = True
        ws2.range('F2').api.Validation.InCellDropdown = True
        ws2.range('F2').api.Validation.InputTitle = ""
        ws2.range('F2').api.Validation.ErrorTitle = ""
        ws2.range('F2').api.Validation.InputMessage = ""
        ws2.range('F2').api.Validation.ErrorMessage = ""
        ws2.range('F2').api.Validation.ShowInput = True
        ws2.range('F2').api.Validation.ShowError = True
    
        # Save workbook with updated DV
        wb2.save(file_path)
    

    Updated example Sheet 2
    Sheet with extra #N/A removed from the DV