python-3.x mlxtend fpgrowth

How to interpret the results of mlxtend's association rules


I am using mlxtend to find association rules. Here is the code:

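from mlxtend.frequent_patterns import apriori, association_rules

# dum_data: one-hot encoded transaction DataFrame (one column per item)
df = apriori(dum_data, min_support=0.4, use_colnames=True)
rules = association_rules(df, metric="lift", min_threshold=1)
rules2 = rules[(rules['lift'] >= 1) & (rules['confidence'] >= 0.7)]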

Output:

antecedents            consequents       antecedentsupport  consequentsupport  support  confidence  lift   leverage  conviction
frozenset({'C'})       frozenset({'B'})  0.63               0.705              0.45     0.726       1.030  0.013     1.077
frozenset({'A'})       frozenset({'B'})  0.98               0.705              0.69     0.70        1.003  0.0007    1.00081
frozenset({'A', 'C'})  frozenset({'B'})  0.63               0.705              0.45     0.72        1.030  0.013     1.0776

I set min_support=0.4. What is the difference between antecedentsupport, consequentsupport and support?

What do lift and leverage mean? How do I judge whether a value is good or bad?

I think I understand confidence: for the first rule in the output, it is how often C and B occurred together. Is that correct?


Solution

  • Let's take the third rule ({A,C} => {B}) as an example:

    support = support of {A, B, C}. Support is the number of transactions that contain all three of A, B and C, divided by the total number of transactions.

    antecedentsupport = support of what precedes the =>, i.e. the support of {A,C}.

    consequentsupport = support of what comes after the =>, i.e. the support of {B}.

    confidence = how likely it is that a transaction containing {A,C} also contains {B}. Think of it as the conditional probability p(B given {A,C}), i.e. support / antecedentsupport.

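    Lift: the definition can be found, for example, on Wikipedia. Lift is the observed support of the rule divided by the support you would expect if {A,C} and {B} were independent, i.e. support / (antecedentsupport × consequentsupport). If lift < 1, then {A,C} and {B} occur together less often than expected. If lift > 1, they occur together more often than expected.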

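    Leverage is roughly the same idea. It also compares the observed co-occurrence with the expected one, but as a difference rather than a ratio: support − antecedentsupport × consequentsupport. A leverage of 0 corresponds to independence.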

    What makes a good lift or leverage is subjective, but I'd suggest looking for a lift > 1. When it comes to judging rules, I would look more at confidence.
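To make the definitions above concrete, here is a minimal sketch in plain Python that computes the same metrics by hand for the rule {A,C} => {B}. The toy transactions, the helper names `support` and `rule_metrics`, and the dictionary keys are all made up for illustration; they are not part of mlxtend, which computes these values for you. The last lines cross-check the first row of the output in the question ({C} => {B}) using the printed confidence and supports.

```python
# Hypothetical toy data: each transaction is the set of items it contains.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "B"},
    {"A", "B"},
    {"A"},
    {"B", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Metrics for the rule antecedent => consequent (illustrative helper)."""
    ant = support(antecedent, transactions)
    con = support(consequent, transactions)
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / ant  # assumes the antecedent occurs at least once
    return {
        "antecedent_support": ant,
        "consequent_support": con,
        "support": both,
        "confidence": confidence,
        "lift": confidence / con,       # same as both / (ant * con)
        "leverage": both - ant * con,   # observed minus expected support
    }


metrics = rule_metrics({"A", "C"}, {"B"}, transactions)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Cross-check against the first rule in the question ({C} => {B}),
# using the values printed in the table.
ant, con, conf = 0.63, 0.705, 0.726
lift_check = conf / con                  # table reports 1.030
leverage_check = ant * conf - ant * con  # table reports 0.013
print(round(lift_check, 3), round(leverage_check, 3))
```

In this toy data the rule has a lift below 1 and a negative leverage, so {A,C} and {B} co-occur less often than independence would predict. In the cross-check, the joint support is taken as antecedent support times confidence rather than the printed 0.45, because the printed support appears to be rounded to two decimals.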