In R, I used janitor::tabyl to produce a frequency table of my factor variable Mp.
data_mp <- janitor::tabyl(data, Mp, show_na = TRUE)
It gave me:
Mp | n | percent | valid_percent |
---|---|---|---|
FCA | 4848 | 5.66% | 6.38% |
FCA-TESLA | 6629 | 7.74% | 8.72% |
FCA ITALY SPA | 8700 | 10.16% | 11.44% |
FIAT GROUP AUTOMOBILES SPA | 451 | 0.53% | 0.59% |
FORD-VOLVO | 2780 | 3.25% | 3.66% |
HYUNDAI | 4609 | 5.38% | 6.06% |
MERCEDES-BENZ | 7366 | 8.60% | 9.69% |
TATA MOTORS JAGUAR LAND ROVER | 4832 | 5.64% | 6.36% |
VW-SAIC | 9289 | 10.85% | 12.22% |
VW GROUP PC | 26526 | 30.98% | 34.89% |
NA | 9606 | 11.22% | NA |
Now I would like to use the number of NA observations and their percentage.
I tried data_mp["NA", "n"], but it returns NA.
How do you return the values in the NA row of a factor column?
I managed to recover the number of observations with data_mp[is.na(data_mp$Mp), "n"], but I'm not sure this is the right way.
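NA is not a row name but a value in the column Mp, so data_mp["NA", "n"] finds no matching row. Instead, extract the column Mp with $ or [[, apply is.na to it (to find the NA row, rather than comparing with == "NA", because the value is a missing value and not the string "NA"), and use the result as the row index: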
data_mp[is.na(data_mp$Mp), "n"]
[1] 31
Example data used above:
data <- data.frame(Mp = sample(c(LETTERS[1:10], NA), 300, replace = TRUE))
data_mp <- janitor::tabyl(data, Mp, show_na = TRUE)
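The same logical index can be reused to pull the percentage as well as the count. Here is a minimal sketch, assuming the tabyl output has the columns n and percent as in the question; the seed and the names na_row, na_n and na_pct are made up for illustration. Note that tabyl stores percent as a proportion between 0 and 1 unless it has been formatted afterwards.

```r
library(janitor)

set.seed(42)  # assumed seed, only to make the sketch reproducible
data <- data.frame(Mp = factor(sample(c(LETTERS[1:10], NA), 300, replace = TRUE)))
data_mp <- tabyl(data, Mp, show_na = TRUE)

# Row names are just "1", "2", ...; "NA" is not among them,
# which is why data_mp["NA", "n"] does not find the row
rownames(data_mp)

# Logical row index: TRUE only for the row whose Mp value is missing
na_row <- is.na(data_mp$Mp)

na_n   <- data_mp[na_row, "n"]        # count of NA observations
na_pct <- data_mp[na_row, "percent"]  # share of all rows, on a 0-1 scale

na_n
na_pct
```

Both values can be cross-checked against the raw data with sum(is.na(data$Mp)) and mean(is.na(data$Mp)).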