In haven docs, I've seen examples of how zap_labels() will strip value labels from variables. In each case in the docs, the variable used in the example was created using an R assignment operator (<-) to directly create a vector (e.g. the examples at https://haven.tidyverse.org/reference/zap_labels.html).
However, I'm trying to use zap_labels() on data I've imported using haven's read_sav(), and it doesn't seem to be working as I expect.
Code: (I'm on Windows 10):
I import a .sav file using haven like so:
June18 <- read_sav("C:/ ... filename.sav",
user_na = FALSE) %>%
as_factor()
The variable I'm exploring is V1Q1_W35
Attributes:
attributes(June18$V1Q1_W35)
Output:
$levels
[1] "Very fair" "Somewhat fair" "Not very fair" "Not fair at all" "Refused"
In the original .sav file, the value labels for V1Q1_W35 are mapped to numeric codes.
So, per my understanding, if I apply zap_labels() to V1Q1_W35, I should see the original numbers in the data, like 1, 2, 3, 4 and 99.
However, when I do the following, I still see value labels.
attributes(zap_labels(June18$V1Q1_W35))
Output:
$levels
[1] "Very fair" "Somewhat fair" "Not very fair" "Not fair at all" "Refused"
So my question is: in this situation (trying to see the different levels), what should I be doing to see the original numbers in the data instead of the value labels they are mapped to?
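This is because when importing your data, you convert it to a factor with as_factor(), which in this case keeps only the labels and discards the numbers. zap_labels() then has nothing to strip, because a factor no longer carries the haven value labels.
So you can either drop the as_factor() call when reading in your data and then apply zap_labels(), or convert the labelled variables to numeric right after import with as.numeric(). Note that as.numeric() on a column that is already a factor returns the level positions (1 to 5 here), not the original codes, so it has to be applied before any conversion to factor. You can of course also apply this only to a subset of columns where this makes sense.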
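To make the difference concrete, here is a minimal sketch. It does not read the original file (the path is not known here); instead it builds a stand-in for V1Q1_W35 with haven's labelled(), assuming the labels map to the codes 1, 2, 3, 4 and 99 in the order the levels were printed in the question. The object names x, codes, f, unchanged and positions are made up for illustration.

```r
library(haven)

# Hypothetical stand-in for V1Q1_W35 as read_sav() would return it without
# as_factor(): a labelled numeric vector. The code-to-label mapping is an
# assumption based on the question.
x <- labelled(
  c(1, 2, 3, 4, 99),
  labels = c(
    "Very fair" = 1,
    "Somewhat fair" = 2,
    "Not very fair" = 3,
    "Not fair at all" = 4,
    "Refused" = 99
  )
)

# On the labelled vector, zap_labels() drops the labels and keeps the codes.
codes <- zap_labels(x)
print(codes)

# After as_factor(), only the labels survive as factor levels.
f <- as_factor(x)

# zap_labels() leaves a factor untouched, so the levels are still there.
unchanged <- zap_labels(f)
print(attributes(unchanged))

# as.numeric() on the factor gives level positions, not the SPSS codes:
# "Refused" comes back as 5 instead of 99.
positions <- as.numeric(f)
print(positions)
```

With the real data, the same idea would be to call read_sav() without piping into as_factor(), and then run zap_labels() on June18$V1Q1_W35 (or on the whole data frame) before any conversion to factor.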