r janitor

How to remove rows that contain empty (blank) cells in R


I tried df <- remove_empty(df, which = c("rows", "cols"), cutoff = 1), but it did not remove anything, even though the data contains empty cells. I'm not sure what I did wrong or what I need to change so that the rows with empty cells are removed entirely.

The result is unchanged: the blank cells are still there, and the rows containing them have not been dropped.


Solution

  • Consider this example data.frame, which has empty strings (and some NA's).

    > dat
        V1   V2 V3 V4 V5
    1    A    A  1  6 11
    2    B    B  2  7 12
    3            3  8 13
    4            4  9 14
    5 <NA> <NA>  5 10 15
    

    First, replace the empty strings with NA and assign the result back to dat[] (the empty brackets keep dat a data.frame),

    > dat[] <- lapply(dat, \(x) replace(x, x  %in% "", NA))
    > dat
        V1   V2 V3 V4 V5
    1    A    A  1  6 11
    2    B    B  2  7 12
    3 <NA> <NA>  3  8 13
    4 <NA> <NA>  4  9 14
    5 <NA> <NA>  5 10 15
    

    then subset the rows with complete.cases.

    > dat <- dat[complete.cases(dat), ]
    > dat
      V1 V2 V3 V4 V5
    1  A  A  1  6 11
    2  B  B  2  7 12
    

    In one step (the native pipe |> and the \(x) shorthand need R 4.1 or later):

    > dat |> lapply(\(x) replace(x, x  %in% "", NA)) |> data.frame() |> {\(.) .[complete.cases(.), ]}()
      V1 V2 V3 V4 V5
    1  A  A  1  6 11
    2  B  B  2  7 12
    

    Data:

    dat <- structure(list(V1 = c("A", "B", "", "", NA), V2 = c("A", "B", 
    "", "", NA), V3 = c("1", "2", "3", "4", "5"), V4 = c("6", "7", 
    "8", "9", "10"), V5 = c("11", "12", "13", "14", "15")), class = "data.frame", row.names = c(NA, 
    -5L))
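
As for why remove_empty() left the data untouched: as far as I know, it only drops rows or columns that are entirely NA. An empty string "" is not NA, and rows 3 to 5 still hold values in V3 to V5, so nothing qualified. The steps above can also be wrapped in a small helper. This is only a sketch in base R: the name drop_blank_rows is made up for illustration, and treating whitespace-only strings as blank (via trimws) is an added assumption that goes beyond the steps above.

```r
# Hypothetical helper (not part of janitor or base R):
# turn blank strings into NA, then drop every row that has any NA.
drop_blank_rows <- function(df) {
  df[] <- lapply(df, function(x) {
    # Only character columns can hold "" here; factor columns are left as-is.
    # Assumption: whitespace-only strings such as " " also count as blank.
    if (is.character(x)) x[!is.na(x) & trimws(x) == ""] <- NA
    x
  })
  # Keep only the rows without any NA.
  df[complete.cases(df), , drop = FALSE]
}

# Small example in the spirit of the data above (note the " " in V1).
dat <- data.frame(
  V1 = c("A", "B", "", " ", NA),
  V2 = c("A", "B", "", "", NA),
  V3 = c("1", "2", "3", "4", "5"),
  stringsAsFactors = FALSE
)

res <- drop_blank_rows(dat)
print(res)
```

If a row should be dropped only when all of its cells are blank, complete.cases is the wrong filter; after the NA replacement, remove_empty(df, which = "rows") would presumably be the better fit.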