
How can I get a working corrplot for my data?


I am trying to draw a corrplot for my data, which contain a mix of binary, continuous and categorical variables. However, when I run my code, it keeps giving me errors. When I pass my data frame, called df2, the error is: Error in corrplot(df2) : The matrix is not in [-1, 1]!. How can I solve this?

When I compute the correlation, I also get only NAs for certain variables, even though they are numeric or integer columns.

Attached is an example of my data variables, where hh_code is the column used for identification.

How can I get the correlation between variables for my data in R? Thanks!


Solution
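
On the error itself: by default corrplot() expects a correlation matrix (values in [-1, 1]), not the raw data frame, so the usual pattern is to call cor() first. The following is a minimal sketch under stated assumptions: the data frame below is a hypothetical stand-in for df2 (the real data were not posted as text), and every column name except hh_code is made up. A column with zero variance has an undefined correlation, which is one possible reason for rows that contain only NA.

```r
# Hypothetical stand-in for df2; column names other than hh_code are made up.
df2 <- data.frame(
  hh_code  = 1:8,
  income   = c(12.5, 30.1, 22.4, NA, 41.0, 18.3, 27.7, 35.2),
  has_land = c(0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L),
  n_rooms  = c(2L, 4L, 3L, 2L, 5L, 2L, 3L, 4L),
  constant = rep(1L, 8),
  region   = factor(c("a", "b", "a", "c", "b", "c", "a", "b"))
)

# Keep numeric columns only and drop the identifier
num <- df2[vapply(df2, is.numeric, logical(1))]
num <- num[setdiff(names(num), "hh_code")]

# Drop zero-variance columns: their correlation is undefined (NA)
keep <- vapply(num, function(v) isTRUE(sd(v, na.rm = TRUE) > 0), logical(1))
num <- num[keep]

# Correlation matrix, ignoring missing values pair by pair
M <- cor(num, use = "pairwise.complete.obs")

# Plot only if the corrplot package is installed
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(M)
}
```

Categorical columns such as region are left out here on purpose, because a Pearson correlation is not defined for unordered factors.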

  • If you have a table in which some columns are numeric and others are categorical, you can use the function GGally::ggpairs to get an overview of the associations between these variables:

    library(GGally)
    #> Loading required package: ggplot2
    #> Registered S3 method overwritten by 'GGally':
    #>   method from   
    #>   +.gg   ggplot2
    data <- ggplot2::mpg[c(1,3,4,7,8)]
    ggpairs(data)
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    

    Created on 2022-05-17 by the reprex package (v2.0.0)

    If you need a little more rigour, you can use statistical tests to check for significant relationships between columns / variables (provided the assumptions of each test hold):

    covariate x                 outcome y                  test                   R function
    numeric                     numeric                    Pearson correlation    cor.test(x, y, method = "pearson")
    binary                      numeric                    t test                 t.test(y ~ x)
    ordinal (ordered factor)    ordinal (ordered factor)   Spearman correlation   cor.test(as.numeric(x), as.numeric(y), method = "spearman")
    categorical (many levels)   numeric                    ANOVA                  anova(lm(y ~ x))
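
As an illustration of the table, here is a minimal sketch on the built-in mtcars data set. It is used only as an example and is not the asker's data; which column plays which role is an assumption made for illustration.

```r
# mtcars ships with base R; the column choices below are illustrative only.
d <- mtcars
d$am  <- factor(d$am, labels = c("automatic", "manual"))  # binary
d$cyl <- factor(d$cyl)                                    # categorical, 3 levels
d$gear_ord <- factor(d$gear, ordered = TRUE)              # treated as ordinal
d$carb_ord <- factor(d$carb, ordered = TRUE)              # treated as ordinal

# numeric vs numeric: Pearson correlation
r_pearson <- cor.test(d$wt, d$mpg, method = "pearson")

# binary vs numeric: two-sample t test (formula interface: outcome ~ group)
t_res <- t.test(mpg ~ am, data = d)

# ordinal vs ordinal: Spearman rank correlation on the level codes;
# exact = FALSE avoids the warning about ties
r_spearman <- cor.test(as.numeric(d$gear_ord), as.numeric(d$carb_ord),
                       method = "spearman", exact = FALSE)

# categorical (many levels) vs numeric: one-way ANOVA
a_res <- anova(lm(mpg ~ cyl, data = d))

# Collect the p-values in one named vector
c(pearson  = r_pearson$p.value,
  t_test   = t_res$p.value,
  spearman = r_spearman$p.value,
  anova    = a_res[["Pr(>F)"]][1])
```

If many pairs of columns are tested this way, the p-values should be adjusted for multiple comparisons, for example with p.adjust().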