python excel pandas styleframe

Read entire rows where a specific column has a background color in an Excel sheet using Python


I have an Excel sheet in which a few columns have a background color. I need to fetch all rows that have a background color in column B. I tried StyleFrame, but I was only able to pick specific cells from the sheet, not entire rows.

from styleframe import StyleFrame, utils, Styler

# usecols=[1] reads only column B, so the other columns never reach the StyleFrame
cols = [1]
data = StyleFrame.read_excel(excel_file, read_style=True, use_openpyxl_styles=False, usecols=cols)
print(data)
data.apply_style_by_index(data.index[1], Styler(bg_color='Yellow'), cols_to_style='Emp First Name')
print(data)

Here is a sample Excel sheet:

[image: sample Excel sheet]

I expect to get the whole rows (3, 4, 5, 7, 8, 10, 11, 12 and 13) that have a yellow background color in column B.


Solution

  • Here is the code:

    import openpyxl
    import pandas as pd

    # data_only=True returns cached cell values instead of formulas
    wb = openpyxl.load_workbook('file.xlsx', data_only=True)
    ws = wb['Sheet1']
    df = pd.DataFrame(columns=['Dept', 'FNAME', 'LNAME', '2021', '2022', '2023'])
    for row in ws:
        # row[1] is column B; "FFFFFF00" is the ARGB code for yellow
        if row[1].fill.start_color.index == "FFFFFF00":
            df.loc[len(df.index)] = [cell.value for cell in row[:6]]
    

    Only rows whose column B cell is yellow are picked up, and all six columns of each such row are appended to the DataFrame. You can then continue working with the DataFrame df.

    My Excel sheet data:

    [image: Excel data]

    ...and the output DataFrame:

    >>> df
      Dept   FNAME LNAME 2021 2022 2023
    0   D2   User2    L2  100  200  300
    1   D3   User3    L3  100  200  300
    2   D4   User4    L4  100  200  300
    3   D4   User8    L8  100  200  300
    4   D2  User10   L10  100  200  300
    5   D3  User11   L11  100  200  300
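
A possible variation on the loop above, as a minimal sketch: collect the matching rows in a list and build the DataFrame once, taking the column names from the first sheet row instead of hard-coding them. The helper name rows_with_fill, the YELLOW constant and the tiny in-memory workbook are hypothetical and only there to make the example self-contained. It also assumes the highlight is a solid fill whose color is stored as the ARGB string "FFFFFF00"; theme or indexed colors would not match this check.

```python
import openpyxl
import pandas as pd
from openpyxl.styles import PatternFill

YELLOW = "FFFFFF00"  # assumed ARGB code of the highlight color


def rows_with_fill(ws, col_idx=1, argb=YELLOW):
    """Return the values of every row whose cell at col_idx (0-based,
    so 1 is column B) has a solid fill with the given ARGB color."""
    matches = []
    for row in ws.iter_rows():
        fill = row[col_idx].fill
        if fill.fill_type == "solid" and fill.start_color.rgb == argb:
            matches.append([cell.value for cell in row])
    return matches


# Hypothetical demo data, built in memory instead of loading 'file.xlsx'
wb = openpyxl.Workbook()
ws = wb.active
ws.append(["Dept", "FNAME", "LNAME"])
ws.append(["D1", "User1", "L1"])
ws.append(["D2", "User2", "L2"])
ws.append(["D3", "User3", "L3"])

yellow = PatternFill(fill_type="solid", start_color=YELLOW, end_color=YELLOW)
ws["B3"].fill = yellow
ws["B4"].fill = yellow
ws["A2"].fill = yellow  # colored, but not in column B, so this row is skipped

# Use the first sheet row as column names
header = [cell.value for cell in ws[1]]
df = pd.DataFrame(rows_with_fill(ws), columns=header)
print(df)
```

Building the DataFrame from a list in one step avoids growing it row by row with df.loc, which tends to get slow on larger sheets. For a real file, replace the demo workbook with openpyxl.load_workbook('file.xlsx', data_only=True) as in the answer above.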