python pandas dataframe apriori

How to fix "DeprecationWarning: DataFrames with non-bool types result in worse computationalperformance..."


I have been trying to implement the Apriori algorithm in Python. There are several examples online; they all use similar methods and mostly the same example dataset. Reference link: https://www.kaggle.com/code/rockystats/apriori-algorithm-or-market-basket-analysis/notebook (starting from line [26]).

I have a different dataset that has the same structure as the example datasets online. I keep getting the following warning:

"DeprecationWarning: DataFrames with non-bool types result in worse computationalperformance and their support might be discontinued in the future.Please use a DataFrame with bool type"

Here is my code:

    import pandas as pd
    import numpy as np
    from mlxtend.frequent_patterns import apriori, association_rules

    df1 = pd.read_csv(r'C:\Users\USER\dataset', sep=';')

    df = df1.fillna(0)
    basket = pd.pivot_table(data=df, index='cust_id', columns='Product', values='quantity', aggfunc='count', fill_value=0.0)

    def convert_into_binary(x):
        if x > 0:
            return 1
        else:
            return 0

    basket_sets = basket.applymap(convert_into_binary)

    frequent_itemsets = apriori(basket_sets, min_support=0.07, use_colnames=True)

    print(frequent_itemsets)

    # association rule
    rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

    print(rules)

In addition, in the last step of my code, I get an empty dataframe; I can see the column headings of the dataset but the output is empty.

    Empty DataFrame
    Columns: [antecedents, consequents, antecedent support, consequent support, support, confidence, lift, leverage, conviction]
    Index: []

I am not sure whether the empty output is related to the warning above. I am new to Python and would really appreciate help with this issue.


Solution

  • I ran into the same issue even after converting my dataframe fields to 0 and 1.

    The fix was simply to make sure apriori receives a DataFrame of boolean type, so in your case you should run this:

        frequent_itemsets = apriori(basket_sets.astype('bool'), min_support=0.07, use_colnames=True)
    

    As for the empty DataFrame you get in the last step: try using a smaller min_support. With min_support=0.07, it may be that no itemset of two or more products is frequent enough, and then there is nothing to build rules from.
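
To make both points concrete, here is a minimal sketch that uses pandas only. The toy transactions are made up for illustration (the column names cust_id, Product and quantity are taken from the question; the values are not). It builds the boolean basket in one step by comparing against 0, as an alternative to applymap followed by astype('bool'), and then computes the support of each product and each pair of products, which may help in picking a min_support that is not too high for the data.

```python
import pandas as pd

# Hypothetical toy transactions standing in for the real CSV.
df = pd.DataFrame({
    "cust_id":  [1, 1, 2, 2, 3, 4],
    "Product":  ["milk", "bread", "milk", "eggs", "bread", "eggs"],
    "quantity": [2, 1, 1, 6, 3, 1],
})

# Same pivot as in the question: one row per customer, one column per product.
basket = pd.pivot_table(data=df, index="cust_id", columns="Product",
                        values="quantity", aggfunc="count", fill_value=0)

# Comparing against 0 yields a bool DataFrame directly.
basket_sets = basket > 0

# Support of a single product = share of rows in which it is True.
item_support = basket_sets.mean().sort_values(ascending=False)

# Support of each pair = co-occurrence count divided by the number of rows.
# The diagonal repeats the single-product supports.
counts = basket_sets.astype(int)
pair_support = counts.T.dot(counts) / len(counts)

print(basket_sets.dtypes)
print(item_support)
print(pair_support)
```

If the largest off-diagonal value in pair_support is below the chosen min_support, apriori can only return single-product itemsets, and association_rules will then have no pairs to turn into rules. The resulting basket_sets can be passed to apriori as in the answer above.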