QUESTION EDITED FOR CLARITY AND REPRODUCIBILITY
I am trying to summarize proportions of landcover classes within many buffers contained within a list. Although it appears to be a common problem, I have not found an appropriate solution:
I have a raster stack called hab_stack with discrete values 1-6 for each of 3 layers (each layer == year). I also have locational data with >800,000 locations called dat_sf. I have extracted hab_stack raster values within a 400 m buffer around each location.
I now have a large list with ~800,000 elements (not all hab classes 1-6 are represented in each element). So I tried to create a dummy dataframe called true_names containing all hab_stack values 1-6, with frequency/proportion == zero for classes not represented within the buffer, because I need to combine all proportions together. I have tried to accomplish this using an lapply looping structure but can't seem to get it quite right. Below is the full function and error:
sum_class <- lapply(values_hab, function(x){
  true_names <- data.frame(x = 1:6, Freq = 0)
  prop_df <- as.data.frame(prop.table(table(x))) %>%
    mutate(x = as.numeric(x))
  true_names %>%
    anti_join(prop_df, by = "x") %>%
    bind_rows(prop_df) %>%
    arrange(x)
})
Error in `mutate()`:
! Problem while computing `x = as.numeric(x)`.
x `x` must be size 0 or 1, not 1659.
Run `rlang::last_error()` to see where the error occurred.
When I dissect the function, the error arises from the table(values_hab) call: Error in table(values_hab) : all arguments must have the same length.
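For context, a minimal sketch of why calling table() on the whole list fails while calling it on one element works. The object vals is a hypothetical stand-in for values_hab, not the real data: when table() receives a single list, it treats each element as a separate factor to cross-tabulate, and that requires equal lengths.

```r
# Hypothetical stand-in for values_hab: list elements of unequal length
vals <- list(c(1, 1, 2, NA), c(3, 3, 3, 4, 5, 6))

# A single list argument is cross-tabulated element against element,
# so unequal lengths trigger "all arguments must have the same length"
res <- try(table(vals), silent = TRUE)
failed <- inherits(res, "try-error")

# Tabulating one element at a time avoids the problem
per_element <- lapply(vals, table)
```

This is consistent with the error only appearing when the function body is run against values_hab itself rather than against a single element such as values_hab[[1]].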
I think a hypothetical list could look something like this, where there are different numbers of NAs and not all classes are represented in each element; also, see a dataframe of my desired output below:
list <- list(c(1,1,1,2,2,2,3,3,4,4,4,NA,NA,NA,5,6),
             c(1,2,3,4,NA,NA,NA,NA,4,4,4,4,NA,5,1,1),
             c(5,5,5,5,5,1,2,2,2,2,NA,NA,NA,NA,NA,3))
desired_output <- data.frame(`1` = c(0.4, 0.5, 0.6, 0.5, 0.5, 0.3),
`2` = c(0.1, 0.1, 0.1, 0.1, 0.1, 0.2),
`3` = c(0.1, 0.1, 0.0, 0.1, 0.0, 0.3),
`4` = c(0.3, 0.2, 0.0, 0.1, 0.1, 0.1),
`5` = c(0.0, 0.1, 0.2, 0.2, 0.1, 0.0),
`6` = c(0.1, 0.0, 0.1, 0.0, 0.2, 0.1))
Any help is much appreciated.
It looks like my function works and this was a very easy fix. dplyr::mutate was recognizing x as the entire list, when in fact I wanted mutate applied to the vector x within each list element. R is still running in the background, but this should have taken care of it.
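library(dplyr)

sum_class_function <- function(x){
  # Template with every class 1-6 and a proportion of zero
  true_names <- data.frame(x = 1:6, Freq = 0)
  # Proportion of each class present in this buffer (table() drops NA cells)
  prop_df <- as.data.frame(prop.table(table(x)))
  # The class column is a factor: convert via character so the class labels,
  # not the factor's internal integer codes, are recovered
  prop_df$x <- as.numeric(as.character(prop_df$x))
  # Keep zero rows only for classes missing from prop_df, then combine
  temp <- true_names %>%
    anti_join(prop_df, by = "x") %>%
    bind_rows(prop_df) %>%
    arrange(x)
  return(temp)
}
# Apply the function to each list element separately
sum_class <- lapply(values_hab, sum_class_function)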
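As a possible alternative, here is a sketch in base R that gets the zero-filled proportions without the join, by fixing the factor levels before tabulating. The helper class_props and the object vals are hypothetical names introduced for illustration. It assumes each list element is a plain vector of class values and that NA cells should be left out of the denominator, which is also what table() does by default.

```r
# Hypothetical helper: proportion of each class 1-6 among the non-NA cells
class_props <- function(x, classes = 1:6) {
  # Fixed factor levels keep absent classes, so table() counts them as 0;
  # table() drops NA by default, so NA cells do not enter the denominator
  as.vector(prop.table(table(factor(x, levels = classes))))
}

# Example list from the question (renamed to avoid masking base::list)
vals <- list(
  c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, NA, NA, NA, 5, 6),
  c(1, 2, 3, 4, NA, NA, NA, NA, 4, 4, 4, 4, NA, 5, 1, 1),
  c(5, 5, 5, 5, 5, 1, 2, 2, 2, 2, NA, NA, NA, NA, NA, 3)
)

# One row per list element, one column per class
props <- do.call(rbind, lapply(vals, class_props))
colnames(props) <- as.character(1:6)
props_df <- as.data.frame(props)
```

This produces the wide layout of the desired output, with one row per buffer. An element that is entirely NA would give NaN proportions, so it may need separate handling. With ~800,000 elements, returning a plain numeric vector per element and binding once at the end is likely lighter than building a data frame inside every lapply call, though that is untested on the real data.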