r tidyverse gtsummary

'INDEX' is of length zero error in gtsummary::tbl_svysummary


I have created a survey object via tidyverse manipulations and the srvyr package that has the following characteristics:

> dat_svy
Stratified 1 - level Cluster Sampling design (with replacement)
With (306) clusters.
Called via srvyr
Sampling variables:
  - ids: region 
  - strata: year 
  - weights: weights 
Data variables: 
  - year (dbl), region (dbl), weights (dbl), suicide1 (dbl), suicide2 (dbl),
    suicide3 (dbl), suicide4 (dbl), white (dbl), college_degree (dbl), male (dbl),
    birth_cohort (dbl), not_married (dbl), no_children (dbl), no_siblings (dbl),
    lives_alone (dbl), no_church (dbl), socially_isolated (dbl), no_afterlife
    (dbl), unhappy (dbl)

When I call the tbl_svysummary function from the gtsummary package on it, I get the error: "Error in tapply(y, by, sum, na.rm = na.rm, default = 0L) : 'INDEX' is of length zero".

I can't find anything on this error anywhere. I don't have any factor or character variables to worry about - tbl_svysummary just refuses to work. What do I need to change or what am I doing wrong here?

Simple dput of the top hundred rows - the dataframe is around 72,000 responses in full:

dat_dput <- structure(list(year = c(1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 1972, 
1972, 1972, 1972, 1972, 1972), region = c(3, 3, 3, 3, 3, 3, 3, 
3, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 6, 6, 6, 6, 6, 6, 6, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
1, 1, 1, 1, 1, 1, 1, 1, 4), weights = c(0.663196271930943, 0.917370028585327, 
0.897412512251031, 1.06634082743438, 0.94432371066466, 0.526887241987567, 
0.526887241987567, 0.546578869586901, 0.283198307048893, 0.494322145606406, 
0.494322145606406, 0.514302430419033, 0.342881139044641, 0.737740988776798, 
0.325847822011128, 0.344758822112228, 0.69963910150137, 0.775624523191156, 
1.15320013722553, 2.11211800992106, 0.64273816062543, 0.64273816062543, 
0.69963910150137, 0.848913248628709, 1.16102275743865, 0.94432371066466, 
0.739610183680864, 0.897412512251031, 0.847172098766174, 0.842558088614268, 
0.842558088614268, 1.02792370661946, 1.63985954597833, 1.11309898130309, 
0.847172098766174, 0.847172098766174, 0.94432371066466, 1.44345531723148, 
0.626115859108271, 1.22820170398128, 0.900385775275192, 1.01223292017645, 
1.18963602327628, 1.22151246676606, 1.40871806128592, 1.02792370661946, 
1.5807802341143, 1.10937369168099, 0.735581995192448, 1.10333163542922, 
1.10333163542922, 0.919680751859048, 0.735581995192448, 1.02792370661946, 
2.34789863846679, 0.919680751859048, 1.11590271768198, 1.11590271768198, 
0.914145070725301, 0.76937467894537, 0.753400464602581, 1.15401876109887, 
0.956302790407163, 0.602587164960281, 0.753400464602581, 0.939180577180875, 
0.919680751859048, 0.897412512251031, 1.38819734003634, 0.878833896874097, 
0.735581995192448, 1.11590271768198, 0.917370028585327, 0.939180577180875, 
2.10778598425346, 1.09328067639143, 1.33873713023916, 1.16835272322354, 
2.10778598425346, 1.17475084745831, 1.4711639903849, 0.897412512251031, 
0.459990562003962, 1.02792370661946, 1.05389299212673, 0.878833896874097, 
1.15760792327927, 1.98973798274558, 0.878833896874097, 0.685909878201486, 
0.878833896874097, 1.3267263503737, 1.3267263503737, 0.7289825566487, 
1.08429379105826, 0.692768867456301, 1.09343284866041, 0.884517387650749, 
0.861435127509909, 0.842072856888276), suicide1 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), suicide2 = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), suicide3 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), suicide4 = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), white = c(1, 
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), college_degree = c(1, 
0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0), male = c(0, 1, 
0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 
0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 
0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 
0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 
0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0), birth_cohort = c(2, 
1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 
2, 2, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 1, 1, 1, 2, 
2, 1, 1, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2), not_married = c(1, 
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), no_children = c(1, 
0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 
1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0), no_siblings = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), lives_alone = c(1, 
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), no_church = c(0, 
0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), socially_isolated = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), no_afterlife = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_), unhappy = c(1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 
1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0)), row.names = c(NA, 100L), class = "data.frame")

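library(srvyr)
library(gtsummary)

# Build the survey design from the sample data
dat_svy <- dat_dput |> 
  as_survey_design(
    ids = region, 
    strata = year, 
    weights = weights, 
    nest = TRUE)

# This call raises the "'INDEX' is of length zero" error
tbl_svysummary(dat_svy)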

Solution

  • I am not sure exactly what is going on, but the issue arises when trying to summarize the weights column, which you rarely need to summarize in a descriptive table. (The survey package also throws a separate warning about the weights column, which could be related.)

    If you exclude the weights column from the summary, you get the expected table. (NOTE: Many columns in your data frame are entirely missing, hence the errors reported when summarizing those columns.)

    library(gtsummary)
    
    dat_svy <- dat_dput |> 
      srvyr::as_survey_design(
        ids = region, 
        strata = year, 
        weights = weights, 
        nest = TRUE)
    names(dat_dput)
    #>  [1] "year"              "region"            "weights"          
    #>  [4] "suicide1"          "suicide2"          "suicide3"         
    #>  [7] "suicide4"          "white"             "college_degree"   
    #> [10] "male"              "birth_cohort"      "not_married"      
    #> [13] "no_children"       "no_siblings"       "lives_alone"      
    #> [16] "no_church"         "socially_isolated" "no_afterlife"     
    #> [19] "unhappy"
    
    tbl_svysummary(
      dat_svy,
      include = -weights
    ) |> 
      as_kable()
    
    #> Sample size greater than population size: are weights correctly scaled?
    #> The following errors were returned during `as_kable()`:
    #> ✖ For variable `no_afterlife` and "median", "p25", and "p75" statistics: error
    #>   in evaluating the argument 'x' in selecting a method for function 't':
    #>   missing value where TRUE/FALSE needed
    
    | Characteristic    | N = 98       |
    |:------------------|:------------:|
    | year              |              |
    | 1972              | 98 (100%)    |
    | region            |              |
    | 1                 | 8 (8.2%)     |
    | 3                 | 64 (66%)     |
    | 4                 | 7 (6.9%)     |
    | 6                 | 8 (7.8%)     |
    | 7                 | 11 (11%)     |
    | suicide1          | NA (NA, NA)  |
    | Unknown           | 98           |
    | suicide2          | NA (NA, NA)  |
    | Unknown           | 98           |
    | suicide3          | NA (NA, NA)  |
    | Unknown           | 98           |
    | suicide4          | NA (NA, NA)  |
    | Unknown           | 98           |
    | white             | 93 (96%)     |
    | college_degree    | 16 (16%)     |
    | male              | 45 (46%)     |
    | birth_cohort      |              |
    | 1                 | 72 (73%)     |
    | 2                 | 26 (27%)     |
    | not_married       | 23 (24%)     |
    | no_children       | 30 (31%)     |
    | no_siblings       | 4 (4.2%)     |
    | lives_alone       | 4 (4.4%)     |
    | no_church         | 12 (13%)     |
    | socially_isolated |              |
    | 0                 | 98 (100%)    |
    | no_afterlife      | NA (NA, NA)  |
    | Unknown           | 98           |
    | unhappy           | 9 (9.6%)     |

    Created on 2024-12-02 with reprex v2.1.1
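
    If the remaining errors for the all-missing columns are a nuisance, one option is to work out which variables to include before calling tbl_svysummary. The following is only a sketch in base R, not something taken from the gtsummary documentation: `dat` is a small made-up stand-in for `dat_dput`, and the names `all_missing` and `keep` are introduced here purely for illustration.

```r
# Toy stand-in for dat_dput (values are made up for illustration)
dat <- data.frame(
  year         = c(1972, 1972, 1972, 1972),
  region       = c(3, 3, 7, 7),
  weights      = c(0.66, 0.92, 0.28, 0.49),
  suicide1     = NA_real_,   # recycled: every row is missing
  white        = c(1, 1, 0, 0),
  no_afterlife = NA_real_    # recycled: every row is missing
)

# Names of the columns in which every value is NA
all_missing <- names(dat)[vapply(dat, function(x) all(is.na(x)), logical(1))]

# Variables to summarize: everything except the weights column
# and the columns that carry no information at all
keep <- setdiff(names(dat), c(all_missing, "weights"))

print(all_missing)
print(keep)

# Assumed usage with the real data (not run here):
# tbl_svysummary(dat_svy, include = dplyr::all_of(keep))
```

    Whether to drop the all-missing columns or keep them in the table as "Unknown" rows is a presentation choice; on the full 72,000-row data those columns may well contain values, so the check is best run on the complete data frame rather than on the first hundred rows.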