
Grouping regions according to area codes in statistical program R


I want to create an area group (area_g) based on areacode. With the code below, every area_g becomes "A". Only codes 50110250, 50110253, and 50110310 should be "A", and codes 50110101 to 50110140 should be "B". What's wrong? This is the code I wrote. Thank you.

     AAA <- AAA %>%
  
      mutate(AAA, area_g = ifelse(areacode==50110250|50110253|50110310, "A",
                              ifelse(areacode==50110101:50110140, "B",
                               ifelse(areacode==50110256|50110253, "C",
                                ifelse(areacode==50130250|50130310, "D",
                                 ifelse(areacode==50130101:50130122, "E",
                                  ifelse(areacode==50130253|50130259|50130320, "F")))))))   

Solution

  • As mentioned in the comments:

    1. It's best to avoid nesting ifelse() calls; dplyr::case_when() is more legible and less error-prone.
    2. The %in% operator tests whether each element on the left-hand side is present in the vector on the right-hand side.
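
To see why every row ended up as "A" in the question's code, it may help to evaluate the first condition on its own. The sketch below uses a small, made-up `areacode` vector (an assumption; the real data is not shown in the question). Because `==` binds more tightly than `|`, the condition is read as `(areacode == 50110250) | 50110253 | 50110310`, and any non-zero number counts as TRUE. Similarly, `==` against a range such as `50110101:50110140` compares position by position rather than testing membership.

```r
# Hypothetical sample of area codes, chosen only for illustration
areacode <- c(50110250, 50110101, 50110140, 50130250)

# Parsed as (areacode == 50110250) | 50110253 | 50110310.
# Non-zero numbers are coerced to TRUE, so every element is TRUE.
always_true <- areacode == 50110250 | 50110253 | 50110310
print(always_true)

# %in% asks "is each code a member of this set?", which is what was intended
in_a <- areacode %in% c(50110250, 50110253, 50110310)
print(in_a)

# The same applies to ranges: use %in% instead of == with a:b
in_b <- areacode %in% 50110101:50110140
print(in_b)
```

Since the outermost test is TRUE for every row, ifelse() never reaches the nested branches, which would explain why only "A" appears.
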
    library(dplyr)
    # Simulate data
    AAA <- data.frame(areacode = 0:10)
    
    AAA |> 
      mutate(area_g = case_when(areacode == 0 ~ "A",
                                areacode %in% 1:3 ~ "B",
                                areacode == 4 ~ "C",
                                areacode %in% 5:7 ~ "D",
                                TRUE ~ "E"))
    
    #>    areacode area_g
    #> 1         0      A
    #> 2         1      B
    #> 3         2      B
    #> 4         3      B
    #> 5         4      C
    #> 6         5      D
    #> 7         6      D
    #> 8         7      D
    #> 9         8      E
    #> 10        9      E
    #> 11       10      E
    

    Created on 2022-06-24 by the reprex package (v2.0.1)
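
Applied to the area codes from the question, the same pattern might look like the sketch below. The sample rows are invented for illustration (the real `AAA` data frame is not shown), and the group definitions are copied from the question as written. Note that 50110253 is listed under both "A" and "C" in the question; case_when() uses the first matching condition, so it is labelled "A" here. Codes that match no group get NA, which is an assumption about the desired default.

```r
library(dplyr)

# Hypothetical sample rows; replace with the real AAA data frame
AAA <- data.frame(areacode = c(50110250, 50110253, 50110310,
                               50110101, 50110140, 50110256,
                               50130250, 50130101, 50130259, 99999999))

AAA <- AAA |>
  mutate(area_g = case_when(
    areacode %in% c(50110250, 50110253, 50110310) ~ "A",
    areacode %in% 50110101:50110140               ~ "B",
    areacode %in% c(50110256, 50110253)           ~ "C",  # 50110253 already matched "A"
    areacode %in% c(50130250, 50130310)           ~ "D",
    areacode %in% 50130101:50130122               ~ "E",
    areacode %in% c(50130253, 50130259, 50130320) ~ "F",
    TRUE                                          ~ NA_character_  # no group matched
  ))

print(AAA)
```
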