I need to summarise the total number of people alive, the total number of deaths, and some other cumulative indicators, in a deterministic compartmental model with parameters that vary each year. I’m using approxfun
for the time-varying parameters but when I add an indicator to the model, this changes the summary statistics of all the other indicators.
To illustrate my problem, let's consider a model with two states, Susceptible (S) and Infectious (I), where people enter the S state at a rate b and people in the I state die at a rate m that varies every year. Below the code:
library(deSolve)
dt <- 0.5
times <- seq(from = 0, to = 100, by = dt)
# mortality rates change every year - I generate random values from a uniform distribution to illustrate this
set.seed(657)
mort_year <- runif(n=max(times), min=0, max=0.1)
# I use approxfun to get the right mortality rate at the right time
mort <- approxfun(mort_year, method = "constant", rule = 2)
# Initial number of individuals in each state
S_0 = c(999)
I_0 = c(1)
si_initial_state_values <- c(S = S_0,
I = I_0)
# Parameter values
si_parameters <- c(beta = 0.08, b = 0.05)
# Model
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2]
# Total population
N <- S + I
# Force of infection
lambda <- beta * I / N
#mortality rate
m <- mort(time)
# Solving the differential equations
dS <- b*N -lambda * S
dI <- lambda * S - m*I
list(c(dS, dI))
})
}
output <- ode(y = si_initial_state_values,
time = times,
func = si_model,
parms = si_parameters)
N <- output[,2:3]
sum(N)
[1] 5960657
The total number of people alive is 5960657.
Now I add another state M in the model to have the total number of deaths:
# Initial number of individuals in each state
S_0 = c(999)
I_0 = c(1)
M_0 = c(0)
si_initial_state_values <- c(S = S_0,
I = I_0,
M = M_0)
# Parameter values
si_parameters <- c(beta = 0.08, b = 0.05)
# Model
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2]
M <- state[3]
# Total population
N <- S + I
# Force of infection
lambda <- beta * I / N
#mortality rate
m <- mort(time)
# Solving the differential equations
dS <- b*N -lambda * S
dI <- lambda * S - m*I
dM <- m*I
list(c(dS, dI, dM))
})
}
output <- ode(y = si_initial_state_values,
time = times,
func = si_model,
parms = si_parameters)
N <- output[,2:3]
sum(N)
[1] 5960676
The total number of people alive is now 5960676.
The problem does not occur if the parameters do not vary over time. So I don’t see what’s wrong here.
In this example the difference is not that big, but in my original model with several states and several cumulative indicators the difference is much bigger.
First of all, this is a really nice example. A numerical solution of an ODE is always an approximation, that integrates a system with a certain error. For fixed step solvers like euler
or rk4
, the error results from the step size, while for solvers with automatic step size like the default lsoda
, the allowed error can be controlled with the options atol
and rtol
.
If an additional state is added (e.g. M
) it contributes to the error calculation and the step size estimation, so it is more than just an "indicator variable".
The discrete forcing variable mort
introduces another challenge to the solver. In theory, an ODE system is continuous by definition, but the mort
introduces hard jumps. The automatic step size solvers are quite robust in this matter, but they need to do more work by reducing step size for each discrete jump. This influences the integration error.
I see two possibilities to reduce the overall error. One way would be to re-formulate the model as differential algebraic equation (DAE) and solve it with daspk
. An introductory example can be found in the R Journal article from Soeatert et al (2010) and more in the deSolve package vignettes.
As another method, one could use the model formulation as is and tweak the tolerance parameters to reduce the influence of the state variable M
(see below).
In addition, I applied a few other modifications to the code:
mort
to a constant value. Outcomment it for comparison between varying and constant forcing,m
as external diagnostic variable for the plot,atol
and rtol
, trying to exclude M
from the error calculation,diagnostics()
to show the number of steps. We see that the solver needs a lot more steps with a time-varying forcing.As said, it would also be an interesting exercise to compare this with a DAE formulation of the model.
library(deSolve)
dt <- 0.5
times <- seq(from = 0, to = 100, by = dt)
# mortality rates change every year
set.seed(657)
mort_year <- runif(n=max(times), min=0, max=0.1)
## uncomment this to set mort to a constant value
# mort_year <- rep(0.5, length(times))
mort <- approxfun(mort_year, method = "constant", rule = 2)
si_initial_state_values <- c(S = 999,
I = 1,
M = 0)
si_parameters <- c(beta = 0.08, b = 0.05)
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2]
M <- state[3]
# Total population
N <- S + I
# Force of infection
lambda <- beta * I / N
# mortality rate
m <- mort(time)
dS <- b * N -lambda * S
dI <- lambda * S - m * I
dM <- m * I
list(c(dS, dI, dM), m = m)
})
}
output <- ode(y = si_initial_state_values,
time = times,
func = si_model,
parms = si_parameters,
rtol = c(1e-6, 1e-6, 0),
atol = c(1e-6, 1e-6, Inf))
plot(output)
## look at the number of steps and of function evaluations
diagnostics(output)
## total number of S and I
sum(output[,2:3] )