spss spss-modeler

Overlapping Nodes in CHAID (Decision Tree) in SPSS Modeler


I occasionally encounter nodes in CHAID models (in SPSS Modeler) that seem to have overlapping values, such as:

[Image: screenshot of the CHAID tree]

Above, the split is on a continuous variable (the field "Fulfillment in: Working at a job..." is a Likert-scaled item). I'm unclear about how to interpret the nodes. For example, Node 4 is labeled <= 5.000, but Node 5 is labeled 5.000, 6.000, so the value 5 appears to belong to both. I notice that the labels use brackets, but I don't know what they represent.

Or is this because I've configured the build options incorrectly? They are currently set to:

Thank you in advance for any guidance.


Solution

  • There aren't any overlaps. SPSS uses standard interval notation for ranges of values, the same notation you might find in a calculus course. A round parenthesis indicates that the interval does not include that endpoint, while a square bracket indicates that the endpoint lies within the interval.

    So, the middle node of the tree is marked "(6, 7]", which means values greater than 6 and up to and including 7. Since the variable values are integers, only cases with a value of exactly 7 fall into that node. For a Likert-scaled item such as this one, you may want to tell SPSS to treat the variable as ordinal rather than continuous.
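
    To make the bracket rule concrete, here is a minimal Python sketch of how such labels could be read. It is not SPSS Modeler code: the helper names `parse_interval`, `contains`, and `members` are hypothetical, and the 1 to 10 integer scale is an assumption made only for illustration.

```python
import re

# Matches labels such as "(6, 7]" or "(5.000, 6.000]".
_NUMBER = r"[-+]?\d+(?:\.\d+)?"
_LABEL = re.compile(
    r"\s*([\[(])\s*(" + _NUMBER + r")\s*,\s*(" + _NUMBER + r")\s*([\])])\s*"
)


def parse_interval(label):
    """Return (low, high, low_closed, high_closed) for an interval label."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not an interval label: {label!r}")
    left, low, high, right = match.groups()
    # "[" and "]" include the endpoint; "(" and ")" exclude it.
    return float(low), float(high), left == "[", right == "]"


def contains(label, value):
    """Check whether value lies inside the interval described by label."""
    low, high, low_closed, high_closed = parse_interval(label)
    above_low = value >= low if low_closed else value > low
    below_high = value <= high if high_closed else value < high
    return above_low and below_high


# Assumed integer response scale (hypothetical; the real item may differ).
SCALE = range(1, 11)


def members(label):
    """List the scale values that fall inside the labeled interval."""
    return [v for v in SCALE if contains(label, v)]


print(members("(5.000, 6.000]"))  # [6]
print(members("(6, 7]"))          # [7]
```

    Under these assumptions, a case with a value of 5 satisfies "<= 5.000" but not "(5.000, 6.000]", so every case lands in exactly one node.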