I have a randomForest model in R that I'd like to convert to PMML.
load("rf.RData")
r2pmml(rf, "file.pmml", compact=T)
gives the following result:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.jpmml.rexp.RExpParser.readIntVector(RExpParser.java:269)
    at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:88)
    at org.jpmml.rexp.RExpParser.readVector(RExpParser.java:329)
    at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:97)
    at org.jpmml.rexp.RExpParser.readVector(RExpParser.java:329)
    at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:97)
    at org.jpmml.rexp.RExpParser.parse(RExpParser.java:53)
    at com.r2pmml.Main.run(Main.java:83)
    at com.r2pmml.Main.main(Main.java:71)
Error in .convert(tempfile, file, converter, converter_classpath, verbose) :
  The R2PMML conversion application has failed (error code 1). The Java executable should have printed more information about the failure into its standard output and/or standard error streams
It looks like a memory problem. My laptop has 8 GB of RAM, the randomForest model is ~300 MB, R is version 4.3.2, and r2pmml is version 0.27.1.
I added
options(java.parameters = c("-Xms2G", "-Xmx8G"))
at the start of the R code to increase the available memory, and changed the code to
load("rf.RData")
decorate(rf, compact = F)
r2pmml(rf, "file.pmml", compact=T)
but it didn't change the outcome.
What now? Is there a way to convert the model on my laptop? If not, is there a simple (!) way to do it in the cloud?
Dump the model to a file on the local filesystem in R's built-in RDS data format. Then use the JPMML-R command-line application to perform the RDS-to-PMML conversion.
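The dump step can be sketched in R roughly as follows. This assumes, as in the question, that rf.RData contains an object named rf; the helper dump_to_rds is a hypothetical name introduced here for illustration and is not part of r2pmml or JPMML-R.

```r
# Hypothetical helper: pull one object out of an .RData file and write it as RDS.
dump_to_rds <- function(rdata_path, object_name, rds_path) {
  env <- new.env()
  load(rdata_path, envir = env)  # load into a private environment
  stopifnot(exists(object_name, envir = env, inherits = FALSE))
  saveRDS(get(object_name, envir = env), rds_path)
  invisible(rds_path)
}

# File and object names follow the question; adjust them to your setup.
if (file.exists("rf.RData")) {
  dump_to_rds("rf.RData", "rf", "RF.rds")
}
```

If the model is already in your workspace, a plain saveRDS(rf, "RF.rds") does the same job.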
You can adjust JPMML-R's memory usage using standard Java/JVM command-line options:
$ java -Xms2G -Xmx8G -jar pmml-rexp-example-executable-${version}.jar --rds-input RF.rds --pmml-output RF.pmml
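If you would rather stay inside an R session, the same command can be launched with base R's system2(). This is only a sketch: the JAR path is a placeholder (the real file name depends on the JPMML-R version you have), build_java_args is a hypothetical helper, and a java executable is assumed to be on the PATH.

```r
# Placeholder path: substitute the JPMML-R executable JAR you actually have.
jar_path <- "path/to/pmml-rexp-example-executable-VERSION.jar"

# Hypothetical helper: assemble the argument vector.
# JVM heap options must come before -jar; application options come after the JAR.
build_java_args <- function(jar, rds_in, pmml_out, xms = "2G", xmx = "8G") {
  c(paste0("-Xms", xms), paste0("-Xmx", xmx),
    "-jar", jar,
    "--rds-input", rds_in,
    "--pmml-output", pmml_out)
}

args <- build_java_args(jar_path, "RF.rds", "RF.pmml")

# Only attempt the conversion when both input files are present.
if (file.exists(jar_path) && file.exists("RF.rds")) {
  status <- system2("java", args = args)  # returns the process exit status
  if (status != 0) stop("JPMML-R conversion failed with exit code ", status)
}
```

Paths that contain spaces would need to be wrapped in shQuote() before being passed to system2().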
Also, when dealing with large RF models, it is advisable to activate model compaction (i.e. compact = TRUE). However, the compaction pass runs after the standard conversion pass, so the in-memory model object still retains its original memory requirements (but the eventual PMML document is ~half the size).