Tags: r, function, formula, survey

svyby - how to pass different variables into the by formula inside a function?


I would like to pass a variable name into a function and have it used as part of a formula in the svyby function from the survey package. I want to do this because I need estimates for many different combinations of variables, with the variable names read in from a CSV file. Everything I've tried so far produces an error.

This is what I get when I use the variables directly in svyby (the result is what I want):

library(survey)
library(tidyverse)

animals <- c("cat", "dog")
colors <- c("yellow", "blue")

# Simulated data: random weights, two factor columns, and alternating years
df <- data.frame(wt = rbinom(1000, 3, 0.5),
                 a = sample(animals, 1000, replace = TRUE) %>% as.factor,
                 b = sample(colors, 1000, replace = TRUE) %>% as.factor,
                 year = rep(c(2022, 2023), 500))

df.design <- svydesign(
  ids = ~0,
  weights = ~wt,
  strata = NULL,
  data = df
)

counts <- svyby(~a,
                by = ~year + b,
                design = df.design,
                FUN = svytotal)

counts

            year      b acat adog  se.acat  se.adog
2022.blue   2022   blue  183  188 18.48926 18.08267
2023.blue   2023   blue  182  217 18.14401 20.20691
2022.yellow 2022 yellow  201  163 18.94619 17.39925
2023.yellow 2023 yellow  217  161 19.60345 16.65402

Now I'd like to pass the variables "analysis" and "demo" into a function that does the same thing. Writing it the way I intuitively expected to work gives an error:

analysis <- "a"
demo <- "b"

countsfunc <- function(analvar, demovar){
  svyby(~analvar,
    by = ~year + demovar,
    design = df.design,
    FUN = svytotal)
}

counts_try <- countsfunc(analysis, demo)

Error in model.frame.default(by, model.frame(design), na.action = na.pass) : 
  variable lengths differ (found for 'demovar')

I've tried a couple of different combinations of as.formula, bquote, etc., but to no avail. Here's an example of something else I've tried:

countsfunc2 <- function(analvar, demovar){
  byfm = as.formula(paste("~year +", demovar))
  svyby(~analvar,
        by = byfm,
        design = df.design,
        FUN = svytotal)
}

counts_try2 <- countsfunc2(analysis, demo)

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
 contrasts can be applied only to factors with 2 or more levels

Any ideas?


Solution

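  • You were almost there. The terms of a formula are not evaluated, so ~analvar refers to a variable literally named analvar rather than to the column whose name is stored in it. Build the outcome formula with as.formula() as well, just as you already do for the by formula: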

    analysis <- "a"
    demo <- "b"
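    countsfunc2 <- function(analvar, demovar){
      # Build both formulas from the variable names passed in as strings
      byfm <- as.formula(paste("~year +", demovar))
      outfm <- as.formula(paste("~", analvar))
      svyby(
        outfm,
        by = byfm,
        design = df.design,  # df.design is taken from the global environment
        FUN = svytotal)
    }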
    
    (counts_try2 <- countsfunc2(analysis, demo))
    
                year      b acat adog  se.acat  se.adog
    2022.blue   2022   blue  197  187 18.12511 17.72979
    2023.blue   2023   blue  198  189 19.10919 18.42874
    2022.yellow 2022 yellow  164  177 17.24533 17.77602
    2023.yellow 2023 yellow  182  157 18.41780 16.32843
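
Since the goal is to run many variable combinations read from a CSV, the same idea can be sketched with reformulate() from base R, which builds a one-sided formula from a character vector without pasting strings by hand. This is only a sketch under some assumptions: the names counts_by and combos are made up for illustration, the combinations table stands in for whatever read.csv() would return (its column names analysis and demo are hypothetical), and set.seed() is added only so the example is reproducible.

```r
library(survey)

set.seed(1)  # assumption: seed added only to make this sketch reproducible

# Same simulated data and design as in the question
df <- data.frame(
  wt   = rbinom(1000, 3, 0.5),
  a    = factor(sample(c("cat", "dog"), 1000, replace = TRUE)),
  b    = factor(sample(c("yellow", "blue"), 1000, replace = TRUE)),
  year = rep(c(2022, 2023), 500)
)
df.design <- svydesign(ids = ~0, weights = ~wt, data = df)

# Hypothetical helper: the design is passed in as an argument instead of
# being looked up in the global environment.
counts_by <- function(design, analvar, demovar) {
  svyby(
    reformulate(analvar),                  # e.g. ~a
    by = reformulate(c("year", demovar)),  # e.g. ~year + b
    design = design,
    FUN = svytotal
  )
}

# Stand-in for the combinations that would come from read.csv();
# the column names here are assumptions.
combos <- data.frame(
  analysis = c("a", "b"),
  demo     = c("b", "a"),
  stringsAsFactors = FALSE
)

# One svyby result per row of combos, named after the analysis variable
results <- Map(
  function(av, dv) counts_by(df.design, av, dv),
  combos$analysis,
  combos$demo
)

results[["a"]]
```

Passing the design as an argument keeps the helper usable with more than one survey design, and Map() returns a list, so each combination's table can be inspected or combined afterwards.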