
How do you make a dummy dataset in R?


How would I make a dataset in which "site", "season", "year", and "species name" are completely crossed? Every site was visited in each year and season, and each species could have been caught at any time and place, so 5 sites x 2 seasons x 2 years x 2 species = 40 rows. With a count column, the data frame should be 40 x 5. My attempt so far only produces 20 rows, and the variables are not crossed:

df <- data.frame(site = rep(c("1", "2", "3", "4", "5"), each = 2),
                 season = rep(c("dry", "wet"), each = 10), 
                 year = rep(c(2019, 2020), each = 10), 
                 species_name = rep(c("Sailfin molly", "Hardhead silverside"), each = 10),
                 num = sample(x = 0:15, size  = 20, replace = TRUE))

Solution

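  • You could use the expand.grid() function, which returns a data frame with one row for every combination of the vectors you give it:

    site <- c("1", "2", "3", "4", "5")
    season <- c("dry", "wet")
    year <- c(2019, 2020)
    species_name <- c("Sailfin molly", "Hardhead silverside")
    
    # Naming the arguments sets the column names directly, so no
    # colnames() call is needed. stringsAsFactors = FALSE keeps the
    # text columns as character instead of factor.
    df <- expand.grid(site = site, season = season, year = year,
                      species_name = species_name,
                      stringsAsFactors = FALSE)
    
    # One random count per row; nrow(df) is 5 x 2 x 2 x 2 = 40.
    df$num <- sample(x = 0:15, size = nrow(df), replace = TRUE)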
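If you want to confirm that the result really is fully crossed, a quick check along these lines may help. This is only a sketch in base R: it rebuilds the data frame with the same values as above, and the set.seed() call and the combo_counts name are my own additions for illustration, not part of the original answer.

```r
# Rebuild the crossed design with base R only.
set.seed(1)  # assumption: added only so the random counts are reproducible
df <- expand.grid(site = c("1", "2", "3", "4", "5"),
                  season = c("dry", "wet"),
                  year = c(2019, 2020),
                  species_name = c("Sailfin molly", "Hardhead silverside"),
                  stringsAsFactors = FALSE)
df$num <- sample(x = 0:15, size = nrow(df), replace = TRUE)

# Check 1: 5 x 2 x 2 x 2 = 40 rows, and 5 columns.
dim(df)
# 40 5

# Check 2: count the rows in each site/season/year/species cell.
# In a fully crossed design with no duplicates every cell holds exactly 1.
combo_counts <- table(df$site, df$season, df$year, df$species_name)
all(combo_counts == 1)
# TRUE
```

If you are already working in the tidyverse, tidyr::expand_grid() accepts the same named vectors and returns a tibble instead of a base data frame.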