I have a file with a large range of non-standardised, mixed imperial and metric measurements, which I want to standardise and republish.
A sample of that range looks like this:
df <- data.frame(Measurements =c("1.25m", "2 Feet", "3 Inches", "5.5 cm"))
|Measurements|
|1.25m |
|2 Feet |
|3 Inches |
|5.5 cm |
which I want to look like this:
|Measurements|MM_Conversion|
|1.25m |1250mm |
|2 Feet |609.6mm |
|3 Inches |76.2mm |
|5.5 cm |55mm |
I can't use measurements::conv_unit or units::set_units directly, because they both seem to require numeric input values. Is there a straightforward way of doing this that parses both the numeric value and the unit string, and converts accordingly?
EDIT 1: I'm having an issue whereby conv_unit can't convert NA values. If the initial data frame were instead df <- data.frame(Measurements = c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm")), how would you get around it?
We can use extract from tidyr to separate the value and the unit, and then feed both into conv_unit using map2:
df <- data.frame(Measurements =c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm"))
library(tidyverse)
library(stringr)
library(measurements)
df %>%
  extract(Measurements, c("value", "unit"),
          regex = "^([\\d.]+)\\s*([[:alpha:]]+)$",
          remove = FALSE, convert = TRUE) %>%
  mutate(unit = str_replace_all(unit, c(Feet = "ft", Inches = "inch")),
         MM_Conversion = paste0(map2(value, unit, ~if(!is.na(.x)) conv_unit(.x, .y, "mm") else NA), "mm"))
Result:
Measurements value unit MM_Conversion
1 <NA> NA <NA> NAmm
2 1.25m 1.25 m 1250mm
3 2 Feet 2.00 ft 609.6mm
4 3 Inches 3.00 inch 76.2mm
5 5.5 cm 5.50 cm 55mm
Or use filter if NAs should not appear in the final output:
df %>%
  extract(Measurements, c("value", "unit"),
          regex = "^([\\d.]+)\\s*([[:alpha:]]+)$",
          remove = FALSE, convert = TRUE) %>%
  filter(!is.na(Measurements)) %>%
  mutate(unit = str_replace_all(unit, c(Feet = "ft", Inches = "inch")),
         MM_Conversion = paste0(map2(value, unit, ~conv_unit(.x, .y, "mm")), "mm"))
Result:
Measurements value unit MM_Conversion
1 1.25m 1.25 m 1250mm
2 2 Feet 2.00 ft 609.6mm
3 3 Inches 3.00 inch 76.2mm
4 5.5 cm 5.50 cm 55mm
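The first variant leaves the literal string "NAmm" in the missing row. If a genuine missing value is preferred there, one option is to convert first and paste the suffix only onto non-missing results. This is a minimal sketch, assuming value and unit have already been separated as above; the two vectors are typed in by hand for illustration rather than taken from df:

```r
library(measurements)

# Values and units as the extract() step would produce them (entered by hand here)
value <- c(NA, 1.25, 2, 3, 5.5)
unit  <- c(NA, "m", "ft", "inch", "cm")

# Convert element by element, skipping missing values
mm <- mapply(
  function(v, u) if (is.na(v)) NA_real_ else conv_unit(v, u, "mm"),
  value, unit
)

# Append the suffix only where a conversion exists, so NA stays NA
MM_Conversion <- ifelse(is.na(mm), NA_character_, paste0(mm, "mm"))
```

The same ifelse() call should also work inside mutate() once the map2() result has been turned into a numeric vector, for example with map2_dbl() and NA_real_ in the else branch.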
Notice how I manually abbreviated the original units to make conv_unit work. It would be one step fewer if the original units were already in abbreviated form.
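If the file contains many different unit spellings, the manual abbreviation step can be folded into a lookup table. The following is only a sketch in base R that does not use conv_unit at all: to_mm() and mm_per_unit are hypothetical names introduced here, and the list of spellings is an assumption that would need extending to cover whatever the real file contains.

```r
# Hypothetical lookup table: millimetres per unit, keyed by lower-case spelling
mm_per_unit <- c(mm = 1, cm = 10, m = 1000,
                 inch = 25.4, inches = 25.4,
                 ft = 304.8, foot = 304.8, feet = 304.8)

# Hypothetical helper: parse strings such as "2 Feet" and return millimetres
to_mm <- function(x) {
  x <- as.character(x)
  pattern <- "^[[:space:]]*([0-9.]+)[[:space:]]*([[:alpha:]]+)[[:space:]]*$"
  matched <- !is.na(x) & grepl(pattern, x)

  value <- rep(NA_real_, length(x))
  unit  <- rep(NA_character_, length(x))
  value[matched] <- as.numeric(sub(pattern, "\\1", x[matched]))
  unit[matched]  <- tolower(sub(pattern, "\\2", x[matched]))

  # Named-vector lookup: NA input and unknown units both give NA
  unname(value * mm_per_unit[unit])
}

df <- data.frame(Measurements = c(NA, "1.25m", "2 Feet", "3 Inches", "5.5 cm"))
df$MM <- to_mm(df$Measurements)
```

Because unknown spellings come back as NA instead of raising an error, checking df$Measurements[is.na(df$MM)] afterwards is a quick way to find units that are still missing from the table.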