python pandas

Dataframe ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()


I have a dataframe (reportingDatesDf) whose head looks like this:

           unique_stock_id reporting_type
date                                     
2015-01-28  BBG.MTAA.STM.S         2014:A
2015-01-28  BBG.MTAA.STM.S        2014:S2
2015-01-28  BBG.MTAA.STM.S        2014:Q4
2014-10-29  BBG.MTAA.STM.S        2014:C3
2014-10-29  BBG.MTAA.STM.S        2014:Q3

I am trying to reduce the dataframe so that it includes only the entries that fall between two dates, using the following line:

reportingDatesDf = reportingDatesDf[(reportingDatesDf.index >= startDate) and (reportingDatesDf.index <= endDate)]

The dataframe is created from a CSV using the following code:

def getReportingDatesData(rawStaticDataPath,startDate,endDate):
    pattern = 'ReportingDates'+ '.csv'
    staticPath = rawStaticDataPath
    
    with open(staticPath+pattern,'rt') as f:
        
         reportingDatesDf = pd.read_csv(f, 
                 header=None,
                 usecols=[0,1,2],
                 parse_dates=[1],
                 dayfirst=True,
                 index_col=[1],
                 names=['unique_stock_id','date','reporting_type'])       
         #print(reportingDatesDf.head())
         print('reportingDatesDf.index',reportingDatesDf)      
         reportingDatesDf = reportingDatesDf[(reportingDatesDf.index >= startDate) and (reportingDatesDf.index <= endDate)]

However, I get the error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Why has this happened? I am using similar code elsewhere which works.


Solution

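  • Python's and operator doesn't broadcast. It can't, because it has to short-circuit: it calls bool() on its left operand to decide whether the right operand needs evaluating, and there is no sensible way to short-circuit element by element. A boolean array with more than one element has no single truth value, which is exactly what the ValueError is reporting.

    If you need an elementwise and, use & instead. Keep each comparison in its own parentheses, because & binds more tightly than >= and <=.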
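
    As a rough illustration of the fix, here is a minimal sketch. The sample frame is rebuilt by hand from the head shown in the question, and the startDate and endDate values are assumptions chosen only for the demonstration, since the question does not give them.

```python
import pandas as pd

# Hypothetical sample data, copied from the head printed in the question.
reportingDatesDf = pd.DataFrame(
    {
        "unique_stock_id": ["BBG.MTAA.STM.S"] * 5,
        "reporting_type": ["2014:A", "2014:S2", "2014:Q4", "2014:C3", "2014:Q3"],
    },
    index=pd.DatetimeIndex(
        ["2015-01-28", "2015-01-28", "2015-01-28", "2014-10-29", "2014-10-29"],
        name="date",
    ),
)

# Assumed bounds; the question does not say what values were passed in.
startDate = pd.Timestamp("2014-10-01")
endDate = pd.Timestamp("2014-12-31")

# Elementwise AND: each comparison yields a boolean array, and & combines
# them position by position. The parentheses are required.
mask = (reportingDatesDf.index >= startDate) & (reportingDatesDf.index <= endDate)
filtered = reportingDatesDf[mask]
print(filtered)

# For comparison, the original `and` form asks for a single truth value
# of the left-hand array and fails.
try:
    (reportingDatesDf.index >= startDate) and (reportingDatesDf.index <= endDate)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

    With these assumed dates, only the two 2014-10-29 rows remain in filtered. The same rule applies to or and not, whose elementwise counterparts are | and ~.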