While trying to export an R classifier to PMML, using the pmml package, I noticed that the class distribution for a node in the tree is not exported.
PMML supports this with the ScoreDistribution element: http://www.dmg.org/v1-1/treemodel.html
Is there any way to include this information in the PMML? I want to read the PMML with another tool that depends on it.
I'm doing something like:
library(randomForest)
library(pmml)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE,proximity=TRUE)
pmml(iris.rf)
Can you provide some more information, such as which function you are trying to use?
For example, if you are using the randomForest package, I believe it doesn't store the score distribution for each node, so the PMML representation can't include it either. However, if you are using the default values, the 'nodesize' parameter equals 1 for classification cases, which means each terminal node will have a ScoreDistribution such as:
<ScoreDistribution value="predictedValue" probability="1.0"/>
<ScoreDistribution value="anyOtherTargetCategory" probability="0.0"/>
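To see what randomForest keeps per node, you can inspect a single tree with getTree. This is only a rough sketch, assuming the iris model from the question; as far as I know the returned table has one predicted class per terminal node and no per-class counts.

```r
library(randomForest)

set.seed(1)  # only so the sketch is repeatable
iris.rf <- randomForest(Species ~ ., data = iris)

# Extract the first tree of the forest; labelVar = TRUE gives readable
# variable names and class labels instead of integer codes.
tree1 <- getTree(iris.rf, k = 1, labelVar = TRUE)

# Terminal nodes have status -1; they carry a single predicted class,
# with no class counts or probabilities next to it.
terminal <- tree1[tree1[, "status"] == -1, ]
head(terminal)
```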
If you are using the rpart tree model, the pmml function does output the score distribution information. Perhaps you can give us the exact commands you used?
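For comparison, here is a minimal sketch of the rpart route, assuming the same iris data as in the question. The names iris.rpart and pmml.text are just illustrative. It converts the tree to PMML and checks whether ScoreDistribution elements appear in the output.

```r
library(rpart)
library(pmml)

# Fit a single classification tree on iris
iris.rpart <- rpart(Species ~ ., data = iris, method = "class")

# Convert the fitted tree to PMML (an XML node from the XML package)
iris.pmml <- pmml(iris.rpart)

# Serialize to a string and check for ScoreDistribution elements
pmml.text <- XML::saveXML(iris.pmml)
has.dist <- grepl("ScoreDistribution", pmml.text, fixed = TRUE)
print(has.dist)
```

If the other tool only needs one tree with class distributions, this may be enough. It does not put distributions into a randomForest export.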