Context: tidyverse
and dplyr
environment/work-flow.
I'd appreciate insights into how to resolve the following issue, which I have encountered while trying to work with collections of regression results.
This minimal reproducible shows the issue
mtcars %>%
gamlss(mpg ~ hp + wt + disp, data = .) %>%
model.frame()
The example below illustrates a broader context and works as expected (producing the images shown). It also works if all I do is change ~lm(...)
to be ~glm(...)
or ~gam(...)
:
library(tidyverse)
library(broom)
library(gamlss)
library(datasets)
mtcars %>%
nest(-am) %>%
mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")),
fit = map(data, ~lm(mpg ~ hp + wt + disp, data = .)),
results = map(fit, augment)) %>%
unnest(results) %>%
ggplot(aes(x = mpg, y = .fitted)) +
geom_abline(intercept = 0, slope = 1, alpha = .2) + # Line of perfect fit
geom_point() +
facet_grid(am ~ .) +
labs(x = "Miles Per Gallon", y = "Predicted Value") +
theme_bw()
However, If I try to use ~gamlss(...)
as follows:
mtcars %>%
nest(-am) %>%
mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")),
fit = map(data, ~gamlss(mpg ~ hp + wt + disp, data = .)),
results = map(fit, augment)) %>%
unnest(results) %>%
ggplot(aes(x = mpg, y = .fitted)) +
geom_abline(intercept = 0, slope = 1, alpha = .2) + # Line of perfect fit
geom_point() +
facet_grid(am ~ .) +
labs(x = "Miles Per Gallon", y = "Predicted Value") +
theme_bw()
I observe the following error:
GAMLSS-RS iteration 1: Global Deviance = 58.7658
GAMLSS-RS iteration 2: Global Deviance = 58.7658
GAMLSS-RS iteration 1: Global Deviance = 76.2281
GAMLSS-RS iteration 2: Global Deviance = 76.2281
******************************************************************
Family: c("NO", "Normal")
Call: gamlss(formula = mpg ~ hp + wt + disp, data = .)
Fitting method: RS()
------------------------------------------------------------------
Mu link function: identity
Mu Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 43.811721 3.387118 12.935 4.05e-07 ***
hp 0.001768 0.021357 0.083 0.93584
wt -6.982534 1.998827 -3.493 0.00679 **
disp -0.019569 0.021460 -0.912 0.38559
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
------------------------------------------------------------------
Sigma link function: log
Sigma Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.8413 0.1961 4.29 0.00105 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
------------------------------------------------------------------
No. of observations in the fit: 13
Degrees of Freedom for the fit: 5
Residual Deg. of Freedom: 8
at cycle: 2
Global Deviance: 58.76579
AIC: 68.76579
SBC: 71.59054
******************************************************************
Error in mutate_impl(.data, dots) :
Evaluation error: object '.' not found.
In addition: Warning messages:
1: Deprecated: please use `purrr::possibly()` instead
2: Deprecated: please use `purrr::possibly()` instead
3: Deprecated: please use `purrr::possibly()` instead
4: Deprecated: please use `purrr::possibly()` instead
5: Deprecated: please use `purrr::possibly()` instead
6: In summary.gamlss(model) :
summary: vcov has failed, option qr is used instead
15: stop(list(message = "Evaluation error: object '.' not found.",
call = mutate_impl(.data, dots), cppstack = NULL))
14: .Call(`_dplyr_mutate_impl`, df, dots)
13: mutate_impl(.data, dots)
12: mutate.tbl_df(tbl_df(.data), ...)
11: mutate(tbl_df(.data), ...)
10: as.data.frame(mutate(tbl_df(.data), ...))
9: mutate.data.frame(., am = factor(am, levels = c(0, 1), labels = c("automatic",
"manual")), fit = map(data, ~gamlss(mpg ~ hp + wt + disp,
data = .)), results = map(fit, augment))
8: mutate(., am = factor(am, levels = c(0, 1), labels = c("automatic",
"manual")), fit = map(data, ~gamlss(mpg ~ hp + wt + disp,
data = .)), results = map(fit, augment))
7: function_list[[i]](value)
6: freduce(value, `_function_list`)
5: `_fseq`(`_lhs`)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: eval(quote(`_fseq`(`_lhs`)), env, env)
2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1: mtcars %>% nest(-am) %>% mutate(am = factor(am, levels = c(0,
1), labels = c("automatic", "manual")), fit = map(data, ~gamlss(mpg ~
hp + wt + disp, data = .)), results = map(fit, augment)) %>%
unnest(results) %>% ggplot(aes(x = mpg, y = .fitted))
Does anyone have suggestion as to what needs to change to make this example work as expected?
I'd appreciate any insights into what is going wrong. Why it does not work. How to diagnose this sort of issue(s).
Building on RolnadASc's partial answer...
The intermediate datasets and the chart output of the original problem are reproduced by the following. An improvement from using this approach is that we do not require the intermediate step creating fit
.
library(tidyverse)
library(broom)
library(gamlss)
library(datasets)
model.frame.gamlss <- function(formula, what = c("mu", "sigma", "nu", "tau"), parameter = NULL, ...)
{
object <- formula
dots <- list(...)
what <- if (!is.null(parameter)) {
match.arg(parameter, choices = c("mu", "sigma", "nu", "tau"))
} else match.arg(what)
Call <- object$call
parform <- formula(object, what)
data <- if (!is.null(Call$data)) {
# problem here, as Call$data is .
# eval(Call$data)
# instead, this would work:
eval(Call$data, environment(formula$mu.terms))
# (there is no formula$terms, just mu.terms and sigma.terms)
} else {
environment(formula$terms)
}
Terms <- terms(parform)
mf <- model.frame(
Terms,
data,
xlev = object[[paste(what, "xlevels", sep = ".")]]
)
mf
}
aug_func <- function(df){
augment(gamlss(mpg ~ hp + wt + disp, data=df))
}
mtcars %>%
mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual"))) %>%
group_by(am) %>%
do(aug=aug_func(df=.)) %>%
unnest(aug) %>%
ggplot(aes(x = mpg, y = .fitted)) +
geom_abline(intercept = 0, slope = 1, alpha = .2) + # Line of perfect fit
geom_point() +
facet_grid(am ~ .) +
labs(x = "Miles Per Gallon gamlss", y = "Predicted Value gamlss") +
theme_bw()