We have an OpenCPU cloud server, installed on a RedHat machine with Apache 2.0 and rApache, which runs some quite memory- and processing-intensive calculations. Our app runs rather slowly (slower than on a less powerful laptop); we suspect this is because of how memory is allocated on the server.
For this reason we parallelized the app for the server (using the `parallel` package), but even though the server can normally run many (more than 20) parallel R jobs, our app can only run around 18.
In order to understand what is going on, my question is: when I call an R function through the OpenCPU web interface, which component of the server creates/spawns the R processes and manages their memory allocation? Is it `mod_R` (the rApache module), or the Apache server itself through some other module? Does the prefork MPM have an effect on this (based on this answer)? Which part of this work is done by OpenCPU?
I have read the OpenCPU documentation, the rApache documentation, and all the Stack Overflow questions on OpenCPU, but I couldn't work out how the R processes are managed in particular. Sorry if I missed something; I'd be really grateful if anybody could point me to the source of this information.
The slowness can be a result of your application requiring packages that are not preloaded, so they have to be loaded again for every request, over and over. To speed things up, try adding your packages to the `preload` list in `/etc/opencpu/server.conf`, or add R code to `/etc/opencpu/Rprofile` that loads the required packages and data; sketches of both follow.
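For instance, if I recall the format correctly, `/etc/opencpu/server.conf` is a JSON file whose `preload` field takes an array of package names; the packages below are placeholders for whatever your app actually uses:

```json
{
  "preload": ["Matrix", "data.table"]
}
```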
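And here is a minimal sketch of `/etc/opencpu/Rprofile`, assuming hypothetical package names and a hypothetical data path. This file runs once per worker at startup, so whatever it loads is paid for per worker rather than per request:

```r
# /etc/opencpu/Rprofile: executed by each OpenCPU worker when it starts,
# so packages/data loaded here are ready before any request arrives.
library(Matrix)                                   # placeholder package
library(data.table)                               # placeholder package
bigdata <- readRDS("/var/lib/myapp/bigdata.rds")  # hypothetical shared dataset
```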
Answering your question: OpenCPU runs inside rApache, which relies on the Apache prefork MPM to maintain a pool of `n` worker processes, each with its own embedded R interpreter. `n` is configurable in Apache using `StartServers`, `MinSpareServers`, `MaxSpareServers`, `MaxRequestWorkers`, and so on; a sketch of these settings follows below. Because each R worker uses a lot of resources, this shouldn't be set too high. Every worker loads the `preload` packages and runs `/etc/opencpu/Rprofile`, hence in total the pool uses `n` times the amount of memory it takes to load those things in R.
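As a sketch of where those knobs live (the numbers are illustrative, not recommendations; on RedHat the prefork settings usually sit in `/etc/httpd/conf/httpd.conf` or a file under `/etc/httpd/conf.d/`):

```apache
# Prefork MPM pool: each slot is an Apache child with an embedded R
# interpreter, so the cap should reflect how much RAM one R worker needs.
<IfModule mpm_prefork_module>
    StartServers          4    # workers launched at startup
    MinSpareServers       2    # keep at least this many idle
    MaxSpareServers       6    # reap idle workers beyond this
    MaxRequestWorkers    12    # hard cap on concurrent workers
                               # (called MaxClients before Apache 2.4)
    MaxRequestsPerChild 500    # recycle workers to bound memory growth
</IfModule>
```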