How can Pandas .loc take three arguments?


I am looking at someone's code, and this is what they wrote:

from financetoolkit import Toolkit

API_KEY = "FINANCIAL_MODELING_PREP_API_KEY"

companies = Toolkit(["AAPL", "MSFT", "GOOGL", "AMZN"], api_key=API_KEY, start_date="2005-01-01")
income_statement_growth = companies.get_income_statement(growth=True)
display(income_statement_growth.loc[:, "Revenue", :])

Essentially, this code returns the revenue values for several companies from 2005 until the present day.

What I am confused about is income_statement_growth.loc[:, "Revenue", :]

Why are there three arguments? I don't understand what the third colon is doing in this code.

All the documentation I read about .loc states that it takes two arguments, one for the row and one for the column, so I am a bit confused how it is able to take three and what the function of the third colon is.


Solution

  • The return value of companies.get_income_statement(growth=True) is a pandas DataFrame with a multi-index. The columns are indexed by period ('2019', '2020', etc.) and the rows are indexed by a combination of company ticker and data item (e.g. ('AAPL', 'Revenue')).

    You could access a single element like this:

    print(income_statement_growth['2020'][('AAPL', 'Revenue')])
    

    And to select the 'Revenue' for all tickers and all periods, you use .loc:

    revenues = income_statement_growth.loc[:, 'Revenue', :]
    

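    For a DataFrame with a simple index, you would normally see two arguments for .loc[], one for the rows and one for the columns. Here the rows carry a two-level multi-index, so the call reads as three selectors: all tickers, the 'Revenue' data item, and all periods. With only two arguments, .loc[:, 'Revenue'] would be taken as rows and columns, and pandas would look for a column named 'Revenue'.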
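
To try this without an API key, here is a minimal sketch on a toy DataFrame. It assumes the Toolkit output has the shape described above: rows keyed by (ticker, item) and columns keyed by period. The numbers and the level names ticker and item are made up for illustration and do not come from financetoolkit.

```python
import pandas as pd

# Toy stand-in for the Toolkit output (hypothetical values and level names):
# rows are (ticker, item) pairs, columns are periods.
rows = pd.MultiIndex.from_product(
    [["AAPL", "MSFT"], ["Net Income", "Revenue"]],
    names=["ticker", "item"],
)
df = pd.DataFrame(
    [[0.10, 0.20], [0.30, 0.40], [0.50, 0.60], [0.70, 0.80]],
    index=rows,
    columns=["2019", "2020"],
)

# The form from the question: all tickers, the "Revenue" item, all periods.
three_arg = df.loc[:, "Revenue", :]

# Two more explicit spellings of the same selection.
# IndexSlice builds the row key (all tickers, "Revenue"); the final ":" is the columns.
with_index_slice = df.loc[pd.IndexSlice[:, "Revenue"], :]
# xs takes a cross-section on the named row level.
with_xs = df.xs("Revenue", level="item")

print(three_arg)
print(with_index_slice)
print(with_xs)
```

All three select the same 'Revenue' rows. The pd.IndexSlice form keeps both row levels in the result, while xs drops the item level by default, which may be more convenient when only one data item is selected.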