rdataframeexcel-import

R. Problem removing "#NAME?" (from an excel import) in dataframe


I have a .csv import from excel that has formula hangups that I am trying to remove. A simple version of the data is below.

library(tidyverse)
df <- data.frame(
  species = letters[1:5],
  param1 = c("Place", "creek", "river", "#VALUE!", "desert"),
  param2 = c(-23.8, 43.23, "#NAME?", 45, 0.23),
  param3 = c(2.4, 2, 5.7, 0.00003, -2.5),
  stringsAsFactors = FALSE
) # This is a simplified version of the excel .csv import

df[df == "#VALUE!"] <- ""     # Removes excel cells where the formula left "#VALUE!"
df[df == "#NAME\\?"] <- ""   # This does not work

ndf <- df  # This is an attempt to reassign the columns to numeric
ndf
class(ndf$param2)
class(ndf$param3)

The main problem is that the data column Param2 with this left in it is assigned to character when it needs to be numeric, or the functions I have to run on it do not work.

I've tried many different things, however I always nothing seems to recognise the cell. How do I remove "#NAME?" across the df please?


Solution

  • You are doing an exact match (and not a regex match) so you don't need to escape special variables (like ?, !) differently. Try :

    df[df == "#VALUE!"] <- ""  
    df[df == "#NAME?"] <- NA
    df <- type.convert(df, as.is = TRUE)
    df
    #  species param1 param2   param3
    #1       a  Place -23.80  2.40000
    #2       b  creek  43.23  2.00000
    #3       c  river     NA  5.70000
    #4       d         45.00  0.00003
    #5       e desert   0.23 -2.50000
    
    str(df)
    #'data.frame':  5 obs. of  4 variables:
    # $ species: chr  "a" "b" "c" "d" ...
    # $ param1 : chr  "Place" "creek" "river" "" ...
    # $ param2 : num  -23.8 43.23 NA 45 0.23
    # $ param3 : num  2.4 2 5.7 0.00003 -2.5