I'm fairly new to R and trying to recode missing values (stored as -99) to NA. For some reason this removes all my variable labels from the data frame.
My code is the following:
df <- df %>%
  mutate(across(everything(), ~ ifelse(. == -99, NA, .)))
Is there any way to work around this, or possibly another command I could use? Thank you very much in advance!
Here is some of the data I'm using:
structure(list(yrbrn = structure(c(1965, 1952, 1952, 1969, 1980,
1975, 1989, 2000, 2005, 1963, 2001, 1985, 2002, 1956, 1999, 1997,
1953, 1991, 1993, 1966, 2004, 1977, 1964, 1991, 1970, 1990, 1946,
1944, 1957, 2005, 1997, 1960, 1944, 1982, 1956, 1980, 1964, 1956,
1957, 1957, 1949, 1997, 1948, -99, 2004, 1961, 1973, 1935, 1983,
1964), label = "Year of birth", format.stata = "%10.0g", labels = c(`no answer` = -99,
Refusal = NA, `Don't know` = NA, `No answer` = NA), class = c("haven_labelled",
"vctrs_vctr", "double")), gndr = structure(c(1, -99, 1, 1, 1,
1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1,
1, 2, -99, 1, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1,
2, 2, 2, 1), label = "Gender", format.stata = "%9.0g", labels = c(`no answer` = -99,
Male = 1, Female = 2, `No answer` = NA), class = c("haven_labelled",
"vctrs_vctr", "double"))), row.names = c(NA, -50L), class = c("tbl_df",
"tbl", "data.frame"))
It seems that ifelse from base is not able to keep the labels of a labelled column: its result takes its attributes from the test condition, not from the values being returned. You can use if_else from dplyr instead:
df %>%
  mutate(across(everything(), ~ if_else(. == -99, NA, .)))
# # A tibble: 50 × 2
# yrbrn gndr
# <dbl+lbl> <dbl+lbl>
# 1 1965 1 [Male]
# 2 1952 NA
# 3 1952 1 [Male]
# 4 1969 1 [Male]
# 5 1980 1 [Male]
# 6 1975 1 [Male]
# 7 1989 1 [Male]
# 8 2000 2 [Female]
# 9 2005 1 [Male]
# 10 1963 2 [Female]
# # … with 40 more rows
You can also use replace:
df %>%
  mutate(across(everything(), ~ replace(.x, .x == -99, NA)))
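To see the difference between ifelse and replace without any packages, here is a minimal base R sketch. It assumes a plain numeric vector with a hand-set "label" attribute as a stand-in for a haven_labelled column; the names x, dropped and kept are only for illustration, and a real labelled column carries more attributes than this.

```r
# Hypothetical stand-in for a labelled column: a numeric vector
# with a "label" attribute set by hand.
x <- c(1965, -99, 1980)
attr(x, "label") <- "Year of birth"

# ifelse() builds its result from the logical test, so the
# "label" attribute of x is not carried over.
dropped <- ifelse(x == -99, NA, x)

# replace() assigns into a copy of x, so x's attributes stay in place.
kept <- replace(x, x == -99, NA)

attr(dropped, "label")  # label is gone
attr(kept, "label")     # label is still there
```

Both calls turn -99 into NA; only the handling of the attributes differs, which is why swapping the function is enough to keep the labels.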