Is there an R package that gives summary statistics of a numeric variable, including the percentage of missing values?
I have tried the built-in summary, Hmisc describe and psych describe, but none of those do:
> x <- rnorm(1000, 5, 0.6)
> x[sample(seq(1,length(x)),100)]<-NA
> summary(x)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
3.12726377 4.62901915 5.02423569 5.02075611 5.39955795 7.52357325 100
> Hmisc::describe(x)
x
n missing distinct Info Mean pMedian Gmd .05
900 100 900 1 5.021 5.02 0.6623 4.053
.10 .25 .50 .75 .90 .95
4.267 4.629 5.024 5.400 5.809 6.000
lowest : 3.12726 3.27157 3.37501 3.41268 3.41959
highest: 6.41032 6.44724 6.50692 6.54191 7.52357
> psych::describe(x)
vars n mean sd median trimmed mad min max range skew kurtosis se
X1 1 900 5.02 0.59 5.02 5.02 0.57 3.13 7.52 4.4 0.01 0.17 0.02
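Take a look at the skimr R package. Its output reports n_missing and complete_rate (the fraction of non-missing values), so the percentage of missing values is 100 * (1 - complete_rate):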
x <- rnorm(1000, 5, 0.6)
x[sample(seq(1,length(x)),100)] <- NA
skimr::skim(x)
Data summary
Name | x |
Number of rows | 1000 |
Number of columns | 1 |
_______________________ | |
Column type frequency: | |
numeric | 1 |
________________________ | |
Group variables | None |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
data | 100 | 0.9 | 5 | 0.61 | 3.03 | 4.6 | 4.98 | 5.41 | 6.78 | ▁▃▇▅▁ |
Created on 2024-12-21 with reprex v2.0.2
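If you would rather not add a dependency, the percentage can also be computed directly in base R. The following is only a minimal sketch: summary_with_missing is a hypothetical helper name made up for this example, not a function from skimr, Hmisc or psych, and the seed is arbitrary.

```r
# Hypothetical helper (not from any package): a few summary statistics
# plus the count and percentage of missing values, using base R only.
summary_with_missing <- function(x) {
  stopifnot(is.numeric(x))
  n_missing <- sum(is.na(x))
  c(
    n           = length(x),                   # total length, NAs included
    n_missing   = n_missing,                   # number of NA values
    pct_missing = 100 * n_missing / length(x), # share of NAs, in percent
    mean        = mean(x, na.rm = TRUE),
    sd          = sd(x, na.rm = TRUE),
    quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
  )
}

set.seed(1)  # arbitrary seed, only so the example is reproducible
x <- rnorm(1000, 5, 0.6)
x[sample(length(x), 100)] <- NA  # blank out 100 of the 1000 values

round(summary_with_missing(x), 2)
```

Because exactly 100 of the 1000 values are set to NA, pct_missing comes out as 10 here. For a one-off check, mean(is.na(x)) * 100 gives the same number without the helper.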