
"IndexError: .iloc requires numeric indexers, got [array([False, False, False, ..." in Python. Why is it failing?


I am working with the implementation of Advances in Financial Machine Learning to compute cross-validation scores in Python. My code is as follows:

cv = PurgedKFold(n_splits=10,
                 samples_info_sets=pd.Series(train['close_datetime'].values,
                                             index=train['opendatetime'].values),
                 pct_embargo=0.02)

scores = ml_cross_val_score(classifier=classifier,
                            X=X, y=y, cv_gen=cv)

The problem is that when I run the last statement, I get the following error:

IndexError: .iloc requires numeric indexers, got [array([False, False, False, ..., False, False, False])
 array([False, False, False, ..., False, False, False])
 array([False, False, False, ..., False, False, False]) ...
 array([False, False, False, ...,  True, False, False]) 8428
 array([False, False, False, ..., False, False,  True])]

Something is wrong with my code; perhaps the X and y dataframes are not in the format the cross-validator expects. Can anyone help me understand why this error is raised?


Solution

  • After some trial and error, I found the solution. The error occurs because Purged K-Fold needs the index values to be unique. If two index values ('opendatetime') are equal, an error is raised when the data is split into partitions.

    The solution is to check whether any rows share the same index value. If you change the index values of those duplicates so that they differ from one another, it works!
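The check and fix described above can be sketched roughly as follows. This is only an illustration on made-up timestamps, not the library's own code. The helper name `make_index_unique` and the one-microsecond offset are assumptions of mine; dropping or aggregating the duplicated rows may suit your data better. The sketch also shows a likely reason for the mixed output in the traceback: pandas' `Index.get_loc` returns an integer for a unique label but a boolean mask for a label that is duplicated in a non-monotonic index, and presumably the splitter collects those results before passing them to `.iloc`.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for train['opendatetime'] and train['close_datetime'].
open_times = pd.to_datetime(
    ["2020-01-01 10:00", "2020-01-01 09:00", "2020-01-01 10:00", "2020-01-01 11:00"]
)
close_times = open_times + pd.Timedelta(hours=1)
samples_info_sets = pd.Series(close_times, index=open_times)

# Step 1: check whether the index contains repeated labels.
has_duplicates = not samples_info_sets.index.is_unique
duplicated_labels = samples_info_sets.index[
    samples_info_sets.index.duplicated(keep=False)
].unique()

# Why duplicates matter: get_loc gives an int for a unique label,
# but a boolean mask for a label repeated in a non-monotonic index.
loc_unique = samples_info_sets.index.get_loc(pd.Timestamp("2020-01-01 11:00"))
loc_dup = samples_info_sets.index.get_loc(pd.Timestamp("2020-01-01 10:00"))


def make_index_unique(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Hypothetical helper: shift the k-th repeat of a timestamp by k microseconds.

    Assumes no other timestamp already sits at the shifted value.
    """
    occurrence = (
        pd.Series(np.arange(len(index)), index=index).groupby(level=0).cumcount()
    )
    return index + pd.to_timedelta(occurrence.to_numpy(), unit="us")


# Step 2: replace the index with the de-duplicated version.
fixed_index = make_index_unique(samples_info_sets.index)
samples_info_sets.index = fixed_index
```

If X and y share this index, they would presumably need the same adjusted index so that all three objects stay aligned.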