r dplyr r-haven

ifelse removes labels of a labelled variable


I'm fairly new to R and am trying to recode missing values (stored as -99) to NA. For some reason, this removes all the variable labels from my data frame.

My code is the following:

df <- df %>%
  mutate(across(everything(), ~ ifelse(. == -99, NA, .)))

Is there any way to work around this, or another function I could use instead? Thank you very much in advance!

Here is some of the data I'm using:

structure(list(yrbrn = structure(c(1965, 1952, 1952, 1969, 1980, 
1975, 1989, 2000, 2005, 1963, 2001, 1985, 2002, 1956, 1999, 1997, 
1953, 1991, 1993, 1966, 2004, 1977, 1964, 1991, 1970, 1990, 1946, 
1944, 1957, 2005, 1997, 1960, 1944, 1982, 1956, 1980, 1964, 1956, 
1957, 1957, 1949, 1997, 1948, -99, 2004, 1961, 1973, 1935, 1983, 
1964), label = "Year of birth", format.stata = "%10.0g", labels = c(`no answer` = -99, 
Refusal = NA, `Don't know` = NA, `No answer` = NA), class = c("haven_labelled", 
"vctrs_vctr", "double")), gndr = structure(c(1, -99, 1, 1, 1, 
1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 
1, 2, -99, 1, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 
2, 2, 2, 1), label = "Gender", format.stata = "%9.0g", labels = c(`no answer` = -99, 
Male = 1, Female = 2, `No answer` = NA), class = c("haven_labelled", 
"vctrs_vctr", "double"))), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"))

Solution

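  • Base R's ifelse() does not keep the labels of a labelled column. Its result takes its attributes from the test (here the plain logical vector . == -99), not from the yes or no values, so the haven_labelled class and the labels are dropped. You can use if_else() from dplyr instead: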

    df %>%
      mutate(across(everything(), ~ if_else(. == -99, NA, .)))
    
    # # A tibble: 50 × 2
    #        yrbrn        gndr
    #    <dbl+lbl>   <dbl+lbl>
    #  1      1965  1 [Male]  
    #  2      1952 NA         
    #  3      1952  1 [Male]  
    #  4      1969  1 [Male]  
    #  5      1980  1 [Male]  
    #  6      1975  1 [Male]  
    #  7      1989  1 [Male]  
    #  8      2000  2 [Female]
    #  9      2005  1 [Male]  
    # 10      1963  2 [Female]
    # # … with 40 more rows
    

    You can also use replace() from base R, which assigns into the existing vector and therefore keeps its attributes:

    df %>%
      mutate(across(everything(), ~ replace(.x, .x == -99, NA)))
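
To see the difference between the two base functions in isolation, here is a minimal sketch that uses only base R. It assumes a plain "label" attribute as a stand-in for a haven label (the vector gndr below is made up for illustration and is not a real haven_labelled column), so it shows the attribute handling only, not haven's printing or value labels.

```r
# Base R only; a plain "label" attribute stands in for a haven label (assumption).
gndr <- structure(c(1, -99, 2), label = "Gender")

# ifelse() builds its result from the logical test, so the attributes of `gndr` are lost.
via_ifelse <- ifelse(gndr == -99, NA, gndr)
attr(via_ifelse, "label")
#> NULL

# replace() assigns into a copy of `gndr`, so its attributes stay in place.
via_replace <- replace(gndr, gndr == -99, NA)
attr(via_replace, "label")
#> [1] "Gender"
```

In both cases the -99 becomes NA; only the replace() result still carries the label.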