I am trying to fit an exponential decay curve to some biological data using nls()
within a function (I want to do this multiple times).
I receive an error with the way I am specifying nls()
which I can't work out:
Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~
^
A small dummy dataset and the code I have written is below - any help would be greatly appreciated.
dat_small <- structure(list(actual_time = c(5, 9, 30, 59, 119, 171, 216),
activity = c(7158, 7386, 5496, 3884, 1502, 819, 409)), row.names = c(NA,
-7L), class = c("tbl_df", "tbl", "data.frame"))
curve_fit <- function(x, y, df) {
#browser()
# Mono-exponential model for activity using actual sampling time
mod <- lm(log(df[[y]]) ~ df[[x]], data = df) # get starting values from log-linear model
C0 <- as.numeric(exp(mod$coef[1])) # exponential of log-linear intercept
lambda1 <- as.numeric(abs(mod$coef[2])) # absolute value of slope of log-linear model
nls_mod_mono <- nls(df[[y]] ~ C0*exp(-lambda1*df[[x]]), start = c(C0 = C0, lambda1 = lambda1), data = df)
summary(nls_mod_mono) # estimate model
xNew <- seq(0, 240, length.out = 100) # new grid of times
yNew <- predict(nls_mod_mono, list(x = xNew)) # predict activity
dfNew <- data.frame(x = xNew, y = yNew)
p_mono <- ggplot(dfNew, aes(x = x, y = y)) +
geom_line() +
geom_point(data = df, aes(x = x, y = y), size = 3) +
xlab("Actual Sampling Time (mins)") + ylab("Activity (Bq)") +
theme_bw(base_size = 20)
mono_half_life <- 0.693/as.numeric(coef(nls_mod_mono)[2]) # half-life
mono_auc <- as.numeric(coef(nls_mod_mono)[1])/as.numeric(coef(nls_mod_mono)[2]) # AUC to infinity (can be estimated in mono case by simply dividing the intercept by lambda)
results <- c(p_mono, mono_half_life, mono_auc)
}
curve_fit("actual_time", "activity", dat_small)
Questions to SO should have minimal code focused on the problem at hand. Here the question is how to avoid the error in the nls
call so everything after that is not relevant and we omit it. (We do point out, however, that aes
in ggplot2 accepts unquoted variables and since x
and y
are character write it like this aes(.data[[x]], .data[[y]])
.)
1) An easy fix is probably just to avoid the problem by defining a data frame with fixed x and y column names and then use that.
curve_fit1 <- function(x, y, df) {
data <- data.frame(x = df[[x]], y = df[[y]])
co <- coef(lm(log(y) ~ x, data))
st <- c(C0 = exp(co[[1]]), lambda1 = -co[[2]])
nls(y ~ C0 * exp(-lambda1 * x), data = data, start = st)
}
curve_fit1("actual_time", "activity", dat_small)
giving
Nonlinear regression model
model: y ~ C0 * exp(-lambda1 * x)
data: data
C0 lambda1
8.003e+03 1.301e-02
residual sum-of-squares: 270020
Number of iterations to convergence: 4
Achieved convergence tolerance: 4.228e-06
2) Even easier is to use the NLS.expoDecay
self starting function in the statforbiology package in which case no starting value is needed. The output is as in (1).
library(statforbiology)
curve_fit2 <- function(x, y, df) {
data <- data.frame(x = df[[x]], y = df[[y]])
nls(y ~ NLS.expoDecay(x, C0, lambda1), data)
}
curve_fit2("actual_time", "activity", dat_small)
3) We could alternately use substitute
to substitute the variables into the formula. This can apply to either (1) or (2). We will show it as applied to (2).
library(statforbiology)
curve_fit3 <- function(x, y, df) {
fo <- substitute(y ~ NLS.expoDecay(x, C0, lambda1),
list(x = as.name(x), y = as.name(y)))
nls(fo, df)
}
curve_fit3("actual_time", "activity", dat_small)
giving
Nonlinear regression model
model: activity ~ NLS.expoDecay(actual_time, C0, lambda1)
data: df
C0 lambda1
8.003e+03 1.301e-02
residual sum-of-squares: 270020
Number of iterations to convergence: 4
Achieved convergence tolerance: 4.228e-06
4) Yet another altertnative is to use the drc package together with DRC.exponDecay from statforbiology. In this case the simple form of formula accepted allows us to use reformulate
to generate it and furthermore the drc
class object produced from drm
has coef
, plot
and other methods -- try methods(class = "drc")
.
library(drc)
library(statforbiology)
curve_fit4 <- function(x, y, df) {
drm(reformulate(x, y), data = df, fct = DRC.expoDecay())
}
res <- curve_fit4("actual_time", "activity", dat_small)
plot(res)