This question is related to the following but I am unable to make it work
Linear Regression model building and prediction by group in R
Fit a model on each group and evaluate it using data from all rows not in this group, R
Essentially I have a dataframe structured as below. This has been generated by analysing my different experimental conditions on a "multiplex" machine, essentially generating intensity readouts for multiple different "cytokines" of interest. As part of this test I used standard concentrations of each cytokine to generate standard curves (generated by fitting a 5 parameter model). In my data these standards are saved in the column "conditions" and are labeled as (standard1, standard2 etc.). I would then like to apply these models to the intensity data for my "experiment" conditions to generate predicted "value" for my experiments.
test <- data.frame(
condition = as.factor(c("standard1", "standard2", "standard3", "standard4", "standard5", "standard6", "standard1", "standard2", "standard3", "standard4", "standard5", "standard6", "experiment1", "experiment2", "experiment3")),
cytokine = as.factor(c("IL1", "IL1", "IL1", "IL1", "IL1", "IL1", "CXCL1", "CXCL1", "CXCL1", "CXCL1", "CXCL1", "CXCL1", "IL1", "IL1", "CXCL1")),
value = as.numeric(c(1000, 500, 200, 100, 50, 25, 1500, 1000, 400, 300, 50, 20, NA, NA, NA)),
intensity = as.numeric(c(1.00, 0.6, 0.3, 0.2, 0.1, 0.05, 0.95, 0.87, 0.50, 0.4, 0.2, 0.1, 0.5, 0.7, 0.1)))
My real data is much longer and has many more conditions and cytokines
I have tried the following and think it works to this point
# Make a function to fit the 5PL model
fit_5pl <- function(data) {
drc::drm(
value ~ intensity,
data = data,
fct = drc::LL.5(names = c("Slope", "Lower", "Upper", "EC50", "Asym"))
)
}
# generate model for each cytokine
test_fit <- test %>%
nest(-cytokine) %>%
mutate(fit = map(data, fit_5pl))
I then get stuck trying to apply these models to the experimental data. I have tried creating a loop and also emulating the work in the given examples but just cannot manage it!
Any suggestions, especially using tidyverse or tidymodels would be greatly appreciated! Happy to also include what I have tried but non of it has worked thus far. Many thanks!
Perhaps you are looking for something like this, with a helping hand from broom::augment
.
test |>
# make a variable that separates the standards from the experimental data
mutate(type = ifelse(str_detect(condition, "standard"), "standard", "experiment")) |>
nest(.by = c(cytokine, type)) |>
# we want 1 row per cytokine
pivot_wider(names_from = type, values_from = data) |>
mutate(
# fit the standard curve
fit = map(standard, fit_5pl),
# then predict the values for the experimental condition
predicted = map2(fit, experiment, \(m, nd) broom::augment(m, newdata = nd))
) |>
unnest(predicted) |>
select(cytokine, condition, intensity, .fitted)
# A tibble: 3 × 4 cytokine condition intensity .fitted <fct> <fct> <dbl> <dbl> 1 IL1 experiment1 0.5 389. 2 IL1 experiment2 0.7 621. 3 CXCL1 experiment3 0.1 41.7
Or, if you want to predict over all values, it's simpler:
test |>
nest(.by = c(cytokine)) |>
mutate(
# fit the standard curve
fit = map(data, fit_5pl),
# then predict the values
predicted = map2(fit, data, \(m, nd) broom::augment(m, newdata = nd))
) |>
unnest(predicted) |>
select(cytokine, condition, intensity, value, .fitted)
# A tibble: 15 × 5 cytokine condition intensity value .fitted <fct> <fct> <dbl> <dbl> <dbl> 1 IL1 standard1 1 1000 1000. 2 IL1 standard2 0.6 500 502. 3 IL1 standard3 0.3 200 191. 4 IL1 standard4 0.2 100 111. 5 IL1 standard5 0.1 50 47.3 6 IL1 standard6 0.05 25 24.3 7 IL1 experiment1 0.5 NA 389. 8 IL1 experiment2 0.7 NA 621. 9 CXCL1 standard1 0.95 1500 1389. 10 CXCL1 standard2 0.87 1000 1153. 11 CXCL1 standard3 0.5 400 369. 12 CXCL1 standard4 0.4 300 239. 13 CXCL1 standard5 0.2 50 77.7 14 CXCL1 standard6 0.1 20 41.7 15 CXCL1 experiment3 0.1 NA 41.7