I have this data set. I posted a screenshot of the real data instead of code; sorry for the mess, I am new to R.
I want to convert the categorical column "13 Source" into dummy variables, summarized by "HH No". The desired result was shown in a second screenshot. I've tried to.dummy from varhandle and model.matrix, but both gave me a messy dataset. Could anybody help me with this? Thanks a million in advance.
There are a number of ways to make dummy variables from factors; here is one way to create a summary presence table.
Assume df is your data frame. You can start with xtabs, which will create a frequency table from your 2 columns.
Comparing the counts with > 0 gives TRUE where a count is greater than 0 and FALSE otherwise. Adding 0 at the end turns TRUE into the number 1 and FALSE into the number 0.
(xtabs(~ HH_No + Source, df) > 0) + 0
Output
     Source
HH_No Deep_well Rainwater
    1         1         1
    3         1         1
    4         0         1
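To see what each part of the one-liner contributes, the three steps can be split into separate assignments. This is only a sketch on a small made-up data frame; the name toy and its values are hypothetical and are not the asker's data.

```r
# Small made-up data frame (hypothetical, not the real data)
toy <- data.frame(
  HH_No  = c(1, 1, 2, 2, 2),
  Source = factor(
    c("Deep_well", "Rainwater", "Rainwater", "Rainwater", "Rainwater"),
    levels = c("Deep_well", "Rainwater")
  )
)

# Step 1: count how often each Source occurs within each household
counts <- xtabs(~ HH_No + Source, data = toy)

# Step 2: TRUE where a source occurs at least once, FALSE otherwise
present <- counts > 0

# Step 3: adding 0 coerces TRUE/FALSE to 1/0
dummies <- present + 0

print(counts)
print(dummies)
```

Household 2 never uses Deep_well in this toy data, so its count is 0 and its dummy value stays 0, while the three Rainwater rows collapse to a single 1.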
Data
df <- structure(list(HH_No = c(1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3,
3, 3, 4, 4), Source = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L), .Label = c("Deep_well",
"Rainwater"), class = "factor")), class = "data.frame", row.names = c(NA,
-16L))
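The xtabs result is a table rather than a data frame. If an ordinary data frame with HH_No as a regular column is needed, one possible follow-up is sketched below. It rebuilds the same 16 rows in a shorter form so that it runs on its own; the names dummies, wide and out are illustrative only.

```r
# Same 16 rows as the data above, written more compactly
df <- data.frame(
  HH_No  = rep(c(1, 3, 4), times = c(7, 7, 2)),
  Source = factor(
    c(rep("Rainwater", 4), rep("Deep_well", 3),   # household 1
      rep("Rainwater", 6), "Deep_well",           # household 3
      rep("Rainwater", 2)),                       # household 4
    levels = c("Deep_well", "Rainwater")
  )
)

# Presence table as in the answer
dummies <- (xtabs(~ HH_No + Source, df) > 0) + 0

# Keep the wide layout: one row per household, one column per source
wide <- as.data.frame.matrix(dummies)

# Move the household IDs from the row names into a proper column
out <- data.frame(HH_No = as.numeric(rownames(wide)), wide, row.names = NULL)

print(out)
```

as.data.frame.matrix is used here instead of as.data.frame because the latter would reshape a table into long format, with one row per HH_No and Source combination.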