
Converting unit abbreviations to numbers


I have a dataset that abbreviates numerical values in a column. For example, 12M means 12 million and 1.2k means 1,200. M and k are the only abbreviations. How can I write code that allows R to sort these values from lowest to highest?

I've thought about using gsub to convert M to 000,000 etc., but that does not take the decimals into account (1.5M would then become 1.5000000).


Solution

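  • Find the suffix's position in the sequence of SI prefixes and raise 10 to three times that position:

        > 10 ** (3*as.integer(regexpr('T', 'KMGTPEZY')))
        [1] 1e+12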
    

    Then just multiply that power-of-ten by the decimal value you have.

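        > unit_to_power <- function(u) {
            # Position of the suffix in the prefix sequence (K = 1, M = 2, ...); -1 if not found
            pos <- as.integer(regexpr(toupper(u), 'KMGTPEZY', fixed = TRUE))
            # An empty or unrecognised suffix means no scaling, i.e. multiply by 1
            return (if (nzchar(u) && pos > 0) 10 ** (3 * pos) else 1)
        }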
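
To tie this back to the original question, here is a minimal sketch of the whole round trip: split each string into its numeric part and its suffix, scale by the matching power of ten, and sort by the result. The helper name `abbrev_to_number` and the sample vector `x` are made up for illustration, and the sketch assumes each value has at most one trailing letter. It uses `match()` rather than `regexpr()` so that it works on a whole column at once.

```r
# Hypothetical helper: convert strings such as "12M" or "1.2k" to plain numbers.
abbrev_to_number <- function(x) {
  # Numeric part: the string with a single trailing letter (if any) removed
  value <- as.numeric(sub("[A-Za-z]$", "", x))
  # Suffix: whatever remains after the leading non-letters, upper-cased
  suffix <- toupper(sub("^[^A-Za-z]*", "", x))
  # Position in the prefix sequence; 0 when the suffix is absent or unknown
  pos <- match(suffix, c("K", "M", "G", "T", "P", "E", "Z", "Y"), nomatch = 0)
  # 10^0 is 1, so values without a suffix pass through unchanged
  value * 10^(3 * pos)
}

# Assumed sample data, not taken from the question's dataset
x <- c("12M", "1.2k", "1.5M", "300", "40k")

# Sort the original strings by their numeric value
x[order(abbrev_to_number(x))]
# [1] "300"  "1.2k" "40k"  "1.5M" "12M"
```

Sorting with `order()` on the converted values keeps the original abbreviated strings intact; if the numbers themselves are needed, store the result of `abbrev_to_number()` in a new column instead.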