If you only have a few rules in a discrete event simulation, this is not critical. But if you have a lot of them and they can interfere with each other, you may want to track which rules are used and where.
Here is a simple example using eval(parse(...)) which shows that I lose a factor of 100 in speed. Assume you run a simulation and one of many rules is: select the states with time less than 5:
> a <- rnorm(100, 50, 10)
> print(summary(microbenchmark::microbenchmark(a[a < 5], times = 1000L, unit = "us")))
expr min lq mean median uq max neval
a[a < 5] 0.76 1.14 1.266745 1.141 1.52 11.404 1000
myfun <- function(a0) {
  # parse the rule text and evaluate it; 'myrule' and 'a' are found in the global environment
  return(eval(parse(text = myrule)))
}
> myrule <- "a < a0" # The rule could be read from a file.
print(summary(microbenchmark::microbenchmark(a[myfun(5)], times = 1000L, unit = "us")))
expr min lq mean median uq max neval
a[myfun(5)] 137.61 140.271 145.6047 141.411 142.932 343.644 1000
Note: I don't think that I need an extra rete package which can do the bookkeeping efficiently. But if there are other opinions, let me know...
Let's profile this:
Rprof()
for (i in 1:1e4) a[myfun(5)]
Rprof(NULL)
summaryRprof()
#$by.self
# self.time self.pct total.time total.pct
#"parse" 0.36 69.23 0.48 92.31
#"structure" 0.04 7.69 0.06 11.54
#"myfun" 0.02 3.85 0.52 100.00
#"eval" 0.02 3.85 0.50 96.15
#"stopifnot" 0.02 3.85 0.06 11.54
#"%in%" 0.02 3.85 0.02 3.85
#"anyNA" 0.02 3.85 0.02 3.85
#"sys.parent" 0.02 3.85 0.02 3.85
#
#$by.total
# total.time total.pct self.time self.pct
#"myfun" 0.52 100.00 0.02 3.85
#"eval" 0.50 96.15 0.02 3.85
#"parse" 0.48 92.31 0.36 69.23
#"srcfilecopy" 0.12 23.08 0.00 0.00
#"structure" 0.06 11.54 0.04 7.69
#"stopifnot" 0.06 11.54 0.02 3.85
#".POSIXct" 0.06 11.54 0.00 0.00
#"Sys.time" 0.06 11.54 0.00 0.00
#"%in%" 0.02 3.85 0.02 3.85
#"anyNA" 0.02 3.85 0.02 3.85
#"sys.parent" 0.02 3.85 0.02 3.85
#"match.call" 0.02 3.85 0.00 0.00
#"sys.function" 0.02 3.85 0.00 0.00
Most of the time is spent in parse. We can confirm this with a benchmark:
library(microbenchmark)
microbenchmark(a[myfun(5)], times = 1000L, unit = "us")
#Unit: microseconds
# expr min lq mean median uq max neval
# a[myfun(5)] 67.347 69.141 72.12806 69.909 70.933 160.303 1000
a0 <- 5
microbenchmark(parse(text = myrule), times = 1000L, unit = "us")
#Unit: microseconds
# expr min lq mean median uq max neval
# parse(text = myrule) 62.483 64.275 64.99432 64.787 65.299 132.903 1000
If reading the rules as text from a file is a hard requirement, I don't think there is a way to speed this up. Of course, you should not parse the same rule repeatedly, but I assume you know that.
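To illustrate that last point, here is a minimal sketch (the names parsed_rule and myfun2 are made up for illustration): parse each rule string once, up front, and only evaluate the stored expression during the simulation:
parsed_rule <- parse(text = myrule)[[1]]  # parse the rule text once, outside the loop
myfun2 <- function(a0) {
  eval(parsed_rule)  # only evaluation remains in the hot path
}
a[myfun2(5)]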
Edit, in response to a comment, to provide more explanation:
You should store your rules as quoted expressions (e.g., in a list, using saveRDS if you need them as a file):
myrule1 <- quote(a < a0)
myfun1 <- function(rule, a, a0) {eval(rule)}  # evaluate the quoted rule with the supplied a and a0
microbenchmark(a[myfun1(myrule1, a, 30)], times = 1000L, unit = "us")
#Unit: microseconds
# expr min lq mean median uq max neval
# a[myfun1(myrule1, a, 30)] 1.792 2.049 2.286815 2.304 2.305 30.217 1000
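If you need the rules on disk, a rough sketch of that could look as follows (the file name rules.rds and the rule names are made up): store a named list of quoted expressions with saveRDS and read it back with readRDS:
rules <- list(below_threshold = quote(a < a0),
              above_threshold = quote(a > a0))
saveRDS(rules, "rules.rds")    # store the quoted expressions as a file
rules <- readRDS("rules.rds")  # read them back for the simulation
a[myfun1(rules$below_threshold, a, 30)]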
For convenience, you could then make that list of expressions an S3 object and create a nice print method for it in order to get a better overview.
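A possible sketch of that idea (the class name rulelist and the printed layout are only illustrative):
rules <- structure(list(below_threshold = quote(a < a0),
                        above_threshold = quote(a > a0)),
                   class = "rulelist")

print.rulelist <- function(x, ...) {
  cat("Rule set with", length(x), "rules:\n")
  for (nm in names(x)) cat("  ", nm, ": ", deparse(x[[nm]]), "\n", sep = "")
  invisible(x)
}

rules  # auto-printing now uses print.rulelist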