I am trying to figure out how to create new columns based on other columns and insert another column's value as observations in the newly created columns. Not sure how to clearly explain this in words so I will show an example.
Basically my data looks like this:
code | spec | grid | month | day | depth | number |
---|---|---|---|---|---|---|
01 | 31 | 1 | 6 | 17 | 5 | 1 |
01 | 33 | 1 | 6 | 17 | 5 | 2 |
01 | 45 | 1 | 6 | 17 | 10 | 15 |
02 | 45 | 12 | 6 | 17 | 10 | 34 |
02 | 45 | 12 | 7 | 19 | 15 | 1 |
03 | 31 | 15 | 8 | 27 | 15 | 30 |
03 | 33 | 16 | 9 | 13 | 20 | 34 |
03 | 31 | 18 | 10 | 17 | 25 | 100 |
My dataset has 2,514 rows.
The 'spec' column is the species code. I would like to create new columns with the actual species name (for each species code) and its respective number as observations(rows).
For example (simplified/mock data for privacy purposes). Let's say the species code: 31 == mosquito, 33 == lion, 45 == fish....etc. I would like my data to end up like this:
code | mosquito | lion | fish | grid | month | day | depth |
---|---|---|---|---|---|---|---|
01 | 1 | 0 | 0 | 1 | 6 | 17 | 5 |
01 | 0 | 2 | 0 | 1 | 6 | 17 | 5 |
01 | 0 | 0 | 15 | 1 | 6 | 17 | 10 |
02 | 0 | 0 | 34 | 12 | 6 | 17 | 10 |
02 | 0 | 0 | 1 | 12 | 7 | 19 | 15 |
03 | 30 | 0 | 0 | 15 | 8 | 27 | 15 |
03 | 0 | 34 | 0 | 16 | 9 | 13 | 20 |
03 | 100 | 0 | 0 | 18 | 10 | 17 | 25 |
I have looked online for a few hours but I haven't been able to find a clear answer... I have found simpler solutions but not something that meets my question.
I have played around with case_when, if_else, and sapply but can't seem to make it work. I hope this question is clear. Happy to clarify further. Any suggestions? Thank you!
You could do:
library(tidyverse)
df %>%
mutate(spec = case_when(
spec == 31 ~ 'mosquito',
spec == 33 ~ 'lion',
spec == 45 ~ 'fish',
TRUE ~ NA_character_),
id = row_number()) %>%
pivot_wider(names_from = spec, values_from = number) %>%
mutate(across(everything(), ~ replace_na(.x, 0)))
#> # A tibble: 8 x 9
#> code grid month day depth id mosquito lion fish
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 6 17 5 1 1 0 0
#> 2 1 1 6 17 5 2 0 2 0
#> 3 1 1 6 17 10 3 0 0 15
#> 4 2 12 6 17 10 4 0 0 34
#> 5 2 12 7 19 15 5 0 0 1
#> 6 3 15 8 27 15 6 30 0 0
#> 7 3 16 9 13 20 7 0 34 0
#> 8 3 18 10 17 25 8 100 0 0
Note that you need to map your other species to the spec
number inside case_when
Created on 2022-04-10 by the reprex package (v2.0.1)
Data from question in reproducible format:
``` r
df <- structure(list(code = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), spec = c(31L,
33L, 45L, 45L, 45L, 31L, 33L, 31L), grid = c(1L, 1L, 1L, 12L,
12L, 15L, 16L, 18L), month = c(6L, 6L, 6L, 6L, 7L, 8L, 9L, 10L
), day = c(17L, 17L, 17L, 17L, 19L, 27L, 13L, 17L), depth = c(5L,
5L, 10L, 10L, 15L, 15L, 20L, 25L), number = c(1L, 2L, 15L, 34L,
1L, 30L, 34L, 100L)), class = "data.frame", row.names = c(NA,
-8L))
df
#> code spec grid month day depth number
#> 1 1 31 1 6 17 5 1
#> 2 1 33 1 6 17 5 2
#> 3 1 45 1 6 17 10 15
#> 4 2 45 12 6 17 10 34
#> 5 2 45 12 7 19 15 1
#> 6 3 31 15 8 27 15 30
#> 7 3 33 16 9 13 20 34
#> 8 3 31 18 10 17 25 100