I have to work on an Excel sheet with python so I can later put it in a database.
My problem is, that I have quite a lot of rows/cells with striketrough. I want to read the excel normally and flag every row with one or more striketrough cells. I just found a solution for skipping these rows but I want to flag it.
I did find a way to print the strikethrough rows using openpyxl
but I wanted to use pandas because I have to add and cut a lot of columns later on and pandas is best for that.
It's hard to imagine how looks like your spreadsheet but here is an example to give you the general logic. Basically, we use Font.strikethrough
from openpyxl to check if a the font of a cell is striked or not.
from openpyxl import load_workbook
ws = load_workbook("file.xlsx")["Sheet1"]
data = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]
df = pd.DataFrame(data[1:], columns=data[0])
df["Flag"] = [
any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)
]
Output :
print(df)
col1 col2 col3 Flag
0 foo 1 test1 True
1 bar 2 test2 False
2 qux 3 test3 True
3 baz 4 test4 False
Input used :