R PMML class distribution


While trying to export an R classifier to PMML, using the pmml package, I noticed that the class distribution for a node in the tree is not exported.

PMML supports this with the ScoreDistribution element: http://www.dmg.org/v1-1/treemodel.html

Is there any way to get this information into the PMML? I want to read the PMML with another tool that depends on it.

I'm doing something like:

library(randomForest)
library(pmml)

iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE, proximity = TRUE)
pmml(iris.rf)

Solution

  • Can you provide some more information, such as which function you are trying to use?

    For example, if you are using the randomForest package, I believe it doesn't provide information about the score distribution, so neither can the PMML representation. However, if you are using the default values, the 'nodesize' parameter equals 1 for classification, which means a terminal node would have a ScoreDistribution such as:

    <ScoreDistribution value="predictedValue" probability="1.0"/>

    <ScoreDistribution value="AnyOtherTargetCategory" probability="0.0"/>

    If you are using the rpart tree model, the pmml function does output the score distribution information. Perhaps you can give us the exact commands you used?
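
To make the difference between the two models concrete, here is a minimal sketch, assuming the randomForest, rpart, pmml and XML packages are installed. It first looks at what a randomForest tree stores per terminal node (via getTree), then exports an rpart tree and checks the written file for ScoreDistribution elements. The variable names (iris.rp, pmml.file, has.dist and so on) are made up for illustration, and the exact PMML output may vary with the pmml package version.

```r
library(randomForest)
library(rpart)
library(pmml)
library(XML)   # provides saveXML() for the object returned by pmml()

set.seed(42)   # arbitrary seed, only so the forest is reproducible

# randomForest: inspect the first tree of the forest.
# With labelVar = TRUE, getTree() returns a data frame whose columns include
# "status" (-1 marks a terminal node) and "prediction" (the predicted class).
iris.rf <- randomForest(Species ~ ., data = iris)
tree1   <- getTree(iris.rf, k = 1, labelVar = TRUE)
leaves  <- tree1[tree1$status == -1, ]
print(head(leaves))   # one predicted class per leaf, no per-class counts

# rpart: fit a single classification tree and export it to PMML.
iris.rp   <- rpart(Species ~ ., data = iris)
pmml.file <- tempfile(fileext = ".xml")
saveXML(pmml(iris.rp), file = pmml.file)

# Check whether the exported file contains any ScoreDistribution element.
pmml.lines <- readLines(pmml.file)
has.dist   <- any(grepl("<ScoreDistribution", pmml.lines, fixed = TRUE))
print(has.dist)
```

If has.dist comes back TRUE for the rpart model, switching to rpart is one way to get node-level class distributions into the PMML, at the cost of replacing the forest with a single tree.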