r string-substitution

Removing commas from strings and numbers


Using R, how do I remove commas when the value is a number, and replace commas with a space when the value is text?

  Company        | Sales   |
  -------------------------
  go, go, llc    |2,550.40 |
  tires & more   |  500    |
  l-m tech       |1,000.67 |

Sample data:

data = matrix(c('go, go,llc', 'tires & more', 'l-m technology',
 formatC(2550.40, format="f", big.mark=",", digits=2), 500, 
 formatC(1000.67, format="f", big.mark=",", digits=2)), 
 nrow=3, 
 ncol=2)

Expected output:

  Company      | Sales  |
  -----------------------
  go go llc    |2550.40 |
  tires & more |  500   |
  l-m tech     |1000.67 |

What I've tried:

data <- sapply(data, function(x){
           if (grepl("[[:punct:]]",x)){
              if (grepl("[[:digit:]]",x)){
                 x <- gsub(",","",x)
              }
              else{
                 x <- gsub(","," ",x)
              }
           }
        })

print(nrow(data)) # returns NULL

Solution

  • You can do this easily with a nested gsub:

    gsub(",", "", gsub("([a-zA-Z]),", "\\1 ", input))
    

    The inner gsub matches a letter followed by a comma and replaces the pair with the letter plus a space. The outer gsub then deletes any remaining commas, which are the thousands separators in the numbers.

    Applying it to your matrix:

        apply(data, 2, function(x) gsub(",", "", gsub("([a-zA-Z]),", "\\1 ", x)))
        #      [,1]             [,2]     
        # [1,] "go  go llc"     "2550.40"
        # [2,] "tires & more"   "500"    
        # [3,] "l-m technology" "1000.67"
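
    Note that "go, go,llc" ends up with two spaces after the first "go", because the comma is replaced by a space and the original space is kept. If you want the single-spaced form from the expected output, one possible sketch is to squeeze repeated spaces afterwards. The helper name clean_commas is made up for illustration, and the sample values are typed in as plain strings rather than built with formatC:

    ```r
    # Sample matrix from the question, with the Sales values written as literals.
    data <- matrix(c("go, go,llc", "tires & more", "l-m technology",
                     "2,550.40", "500", "1,000.67"),
                   nrow = 3, ncol = 2)

    # clean_commas is a hypothetical helper name, not a base R function.
    clean_commas <- function(x) {
      x <- gsub("([a-zA-Z]),", "\\1 ", x)   # comma right after a letter -> space
      x <- gsub(",", "", x, fixed = TRUE)   # every remaining comma -> removed
      x <- gsub(" +", " ", x)               # squeeze runs of spaces into one
      trimws(x)                             # drop leading/trailing whitespace
    }

    cleaned <- data
    cleaned[] <- clean_commas(data)   # assigning into `[]` keeps the 3 x 2 shape

    cleaned[1, 1]   # "go go llc"
    cleaned[, 2]    # "2550.40" "500" "1000.67"
    ```

    Because gsub is vectorised, the helper can take the whole matrix at once. Assigning into cleaned[] is just a defensive way to make sure the result stays a matrix, so nrow() and ncol() keep working.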