In the context of ordereddict
, I'd like to loop through all of the sheets in my Excel file, read the content of the B2 cell from each sheet, and print it out. Later, I plan to set conditions based on the string value of that cell.
# Importing modules
import openpyxl as op
import pandas as pd
import numpy as np
import xlsxwriter
import openpyxl
from openpyxl import Workbook, load_workbook
# Defining the file path
file_path = r'C:/Users/machukovich/Desktop/stackoverflow.xlsx'
# Load workbook as openpyxl
reference_workbook = load_workbook(file_path)
# We will mantain the workbook open
wb = reference_workbook.active
# Loading the file into a dictionary of Dataframes
dict_of_df = pd.read_excel(file_path, sheet_name=None, skiprows=2)
# Writting the loop itself (it only prints Leonardo)
for sheet_name, df in dict_of_df.items():
student_name = wb['B2'].value
print(f"Student Name: {student_name}")
From https://openpyxl.readthedocs.io/en/stable/tutorial.html#create-a-workbook :
A workbook is always created with at least one worksheet. You can get it by using the Workbook.active property:
ws = wb.active
Info: This is set to 0 by default. Unless you modify its value, you will always get the first worksheet by using this method.
You never change reference_workbook.active
and wb
is always the first worksheet of the workbook. Then, additionally, you use pandas to load all worksheets by (correctly) passing None as sheet_name. You also iterate over them correctly, but inside the loop you always access wb
without actually using the pandas dataframe of the worksheet df
. So you basically mixed two solutions.
So you have two options:
1: Remove pd.load_excel and use openpyxl by iterating over the worksheets with something like
for sheet in reference_workbook:
print(sheet["B2"])
2: Use your OrderedDict with pandas dataframes by accessing df
instead of wb
. However, because you passed skiprows=2
you cut off B2 and the sheet will start at row 3.