r dataframe dput

Cannot manage to remove $ pattern in dataframe


I have a column in my dataset that looks like this (not the exact numbers):

Cost
50
75
$ 1,789,456
$ 1,200,923
690.3490200

The type of this column is character.

In order to do my computations, I want to remove the "," and the "$" and convert the column to numeric format.

df$cost<-gsub(",","",as.character(df$cost))

This one worked: I now have 1789456 instead of 1,789,456, etc. However, the code for the $ doesn't work:

df$cost<-gsub("$","",as.character(df$cost))

df$cost<-gsub("$ ","",as.character(df$cost))

No error message, but here's the output:

Cost
50
75
$ 1789456
$ 1200923
690.3490200

Here's what dput() gives me:

structure(list(head.df.cost..31. = structure(c(NA, 
NA, NA, NA, NA, NA, NA, NA, 15L, 14L, 14L, 14L, 14L, 14L, 13L, 
4L, 1L, 9L, 12L, 8L, 7L, 10L, 10L, 7L, 2L, 5L, 6L, 6L, 3L, 11L
), .Label = c("$ 1062498", "115.11", "236.49", "275.87", "30", 
"40", "49", "50", "575.64", "60", "631.19200000000001", "75", 
"SPONSORED", "$ 2542196",
"ND", "USD 2300"), class = "factor")), class = "data.frame", row.names = c(NA, 
-30L))

Solution

  • $ is the regex anchor for the end of the string (or line), so an unescaped "$" only matches an empty position and nothing gets removed. You need to escape it to use it as a literal. I'm not at a computer, but this should get you what you're looking for:

    gsub("[ ,$]+", "", df$cost, perl = TRUE)
    

    This should remove any run of one or more commas, spaces, or $ characters. You don't have to escape $ explicitly inside square brackets. If you only wanted to remove the $ signs, you could use the pattern "\\$".
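
For illustration, here is a minimal sketch of the whole cleanup on a few made-up values modelled on the question. The vector `cost` and the other variable names are assumptions for the example, not the asker's real data. It also shows `[$]` and `fixed = TRUE`, two other standard ways to match a literal dollar sign.

```r
# Sample values modelled on the question (illustrative, not the real data)
cost <- c("50", "75", "$ 1,789,456", "$ 1,200,923", "690.3490200", "ND")

# Unescaped "$" is the end-of-string anchor, so no character is removed
unchanged <- gsub("$", "", cost)

# Three equivalent ways to match a literal dollar sign
escaped   <- gsub("\\$", "", cost)
bracketed <- gsub("[$]", "", cost)
literal   <- gsub("$", "", cost, fixed = TRUE)

# Strip "$", "," and spaces in one pass, then convert to numeric.
# Entries that are not numbers (here "ND") become NA; as.numeric()
# would normally warn about that, so the warning is suppressed here.
cleaned  <- gsub("[ ,$]+", "", cost)
cost_num <- suppressWarnings(as.numeric(cleaned))
```

On the real column you would probably assign the result back, e.g. `df$cost <- as.numeric(gsub("[ ,$]+", "", as.character(df$cost)))`. Values such as "SPONSORED", "ND" or "USD 2300" from the dput() output are not numbers after the cleanup, so they will end up as NA.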