pythonpandasipythonjupyter-notebookdataframe

How to read a .xlsx file using the pandas Library in iPython?


I want to read a .xlsx file using the Pandas Library of python and port the data to a postgreSQL table.

All I could do up until now is:

import pandas as pd
data = pd.ExcelFile("*File Name*")

Now I know that the step got executed successfully, but I want to know how i can parse the excel file that has been read so that I can understand how the data in the excel maps to the data in the variable data.
I learnt that data is a Dataframe object if I'm not wrong. So How do i parse this dataframe object to extract each line row by row.


Solution

  • I usually create a dictionary containing a DataFrame for every sheet:

    xl_file = pd.ExcelFile(file_name)
    
    dfs = {sheet_name: xl_file.parse(sheet_name) 
              for sheet_name in xl_file.sheet_names}
    

    Update: In pandas version 0.21.0+ you will get this behavior more cleanly by passing sheet_name=None to read_excel:

    dfs = pd.read_excel(file_name, sheet_name=None)
    

    In 0.20 and prior, this was sheetname rather than sheet_name (this is now deprecated in favor of the above):

    dfs = pd.read_excel(file_name, sheetname=None)