I need to summarise the total number of people alive, the total number of deaths, and some other cumulative indicators, in a deterministic compartmental model with parameters that vary each year. I’m using approxfun for the time-varying parameters but when I add an indicator to the model, this changes the summary statistics of all the other indicators.
To illustrate my problem, let's consider a model with two states, Susceptible (S) and Infectious (I), where people enter the S state at a rate b and people in the I state die at a rate m that varies every year. Below the code:
library(deSolve)
dt <- 0.5
times <- seq(from = 0, to = 100, by = dt)
# mortality rates change every year - I generate random values from a uniform distribution to illustrate this
set.seed(657)
mort_year <- runif(n=max(times), min=0, max=0.1)
# I use approxfun to get the right mortality rate at the right time
mort <- approxfun(mort_year, method = "constant", rule = 2)
# Initial number of individuals in each state
S_0 = c(999)
I_0 = c(1)
si_initial_state_values <- c(S = S_0,
I = I_0)
# Parameter values
si_parameters <- c(beta = 0.08, b = 0.05)
# Model
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2]
# Total population
N <- S + I
# Force of infection
lambda <- beta * I / N
#mortality rate
m <- mort(time)
# Solving the differential equations
dS <- b*N -lambda * S
dI <- lambda * S - m*I
list(c(dS, dI))
})
}
output <- ode(y = si_initial_state_values,
time = times,
func = si_model,
parms = si_parameters)
N <- output[,2:3]
sum(N)
[1] 5960657
The total number of people alive is 5960657.
Now I add another state M in the model to have the total number of deaths:
# Initial number of individuals in each state
S_0 = c(999)
I_0 = c(1)
M_0 = c(0)
si_initial_state_values <- c(S = S_0,
I = I_0,
M = M_0)
# Parameter values
si_parameters <- c(beta = 0.08, b = 0.05)
# Model
si_model <- function(time, state, parameters) {
with(as.list(c(state, parameters)), {
S <- state[1]
I <- state[2]
M <- state[3]
# Total population
N <- S + I
# Force of infection
lambda <- beta * I / N
#mortality rate
m <- mort(time)
# Solving the differential equations
dS <- b*N -lambda * S
dI <- lambda * S - m*I
dM <- m*I
list(c(dS, dI, dM))
})
}
output <- ode(y = si_initial_state_values,
time = times,
func = si_model,
parms = si_parameters)
N <- output[,2:3]
sum(N)
[1] 5960676
The total number of people alive is now 5960676.
The problem does not occur if the parameters do not vary over time. So I don’t see what’s wrong here.
In this example the difference is not that big, but in my original model with several states and several cumulative indicators the difference is much bigger.
First of all, this is a really nice example. A numerical solution of an ODE is always an approximation, that integrates a system with a certain error. For fixed step solvers like
eulerorrk4, the error results from the step size, while for solvers with automatic step size like the defaultlsoda, the allowed error can be controlled with the optionsatolandrtol.If an additional state is added (e.g.
M) it contributes to the error calculation and the step size estimation, so it is more than just an "indicator variable".The discrete forcing variable
mortintroduces another challenge to the solver. In theory, an ODE system is continuous by definition, but themortintroduces hard jumps. The automatic step size solvers are quite robust in this matter, but they need to do more work by reducing step size for each discrete jump. This influences the integration error.I see two possibilities to reduce the overall error. One way would be to re-formulate the model as differential algebraic equation (DAE) and solve it with
daspk. An introductory example can be found in the R Journal article from Soeatert et al (2010) and more in the deSolve package vignettes.As another method, one could use the model formulation as is and tweak the tolerance parameters to reduce the influence of the state variable
M(see below).In addition, I applied a few other modifications to the code:
mortto a constant value. Outcomment it for comparison between varying and constant forcing,mas external diagnostic variable for the plot,atolandrtol, trying to excludeMfrom the error calculation,diagnostics()to show the number of steps. We see that the solver needs a lot more steps with a time-varying forcing.As said, it would also be an interesting exercise to compare this with a DAE formulation of the model.