r dataframe replace

How to replace values in all columns in R


Just a quick question: how can I replace some values with others when those values can appear in any of the dataframe's columns? Functions like mapvalues and recode only work on a column that is specified, but my dataframe has 89 columns, so going column by column would be time-consuming.

For the sake of clarity, consider the following example, where I want to replace [NULL] with another value.

Example:

a <- c("NULL",2,"NULL")
b <- c(3, "NULL", 1)

df <- data.frame(a, b)
df

     a    b
1 NULL    3
2    2 NULL
3 NULL    1

The difference between the example and my case is that my dataset is 35383 x 89, and there is more than one value I want to replace.

Thank you in advance for your time.


Solution

  • This extends the comment by Ronak Shah. You can replace the NULLs with 0, or with any other value you like.

    For example, replace the NULLs with the mean of the respective columns:

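    # Loop over the columns and convert each one to numeric, because in your
    # case every column holds text. "NULL" cannot be parsed as a number, so it
    # becomes NA (R warns about the coercion). as.character() guards against
    # factor columns, which data.frame() created by default before R 4.0.
    
    for (i in colnames(df)){
      df[[i]] <- as.numeric(as.character(df[[i]]))
    }
    
    # Now replace the NAs with the mean of the column
    
    for (i in colnames(df)){
      df[[i]][is.na(df[[i]])] <- mean(df[[i]], na.rm=TRUE)
    }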
    
    

    You can do the same with the median. Let me know in the comments if you have any doubts.
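
Since the question mentions more than one value to replace, here is a minimal sketch of the same idea without explicit loops, using the median as the fill value. It assumes that every column should end up numeric. The vector `missing_codes` is a hypothetical name, and only "NULL" in it comes from the question; the other entries are illustrative placeholders.

```r
# Rebuild the example data from the question.
a <- c("NULL", 2, "NULL")
b <- c(3, "NULL", 1)
df <- data.frame(a, b, stringsAsFactors = FALSE)

# Hypothetical set of placeholder strings to treat as missing.
# Only "NULL" appears in the question; "N/A" and "-" are assumptions.
missing_codes <- c("NULL", "N/A", "-")

# Step 1: in every column, turn each placeholder into NA.
# Assigning to df[] keeps the data.frame structure while lapply()
# visits the columns one by one.
df[] <- lapply(df, function(col) {
  col[col %in% missing_codes] <- NA
  col
})

# Step 2: convert every column to numeric.
# Assumes the remaining values are numbers stored as text.
df[] <- lapply(df, function(col) as.numeric(as.character(col)))

# Step 3: fill each NA with the median of its own column.
df[] <- lapply(df, function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
})
```

On the example data this leaves column a as 2, 2, 2 and column b as 3, 2, 1. Replacing the placeholders with NA before converting avoids the coercion warning, and swapping `median` for `mean` or a constant changes only Step 3.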