r, data.table, tidyverse, vroom

Define decimal separator with vroom


I often work with CSV files that were saved under a German locale and therefore use a semicolon rather than a comma as the field separator. That part is easy to solve by setting the delimiter. However, unlike fread, for example, vroom does not seem to offer an argument for the decimal separator. As a result, numeric values that use , as the decimal mark are imported either as character columns or, wrongly, as very large numbers with the decimal separator dropped. Is there a way to set the decimal separator directly, similar to the dec argument in fread?

library(vroom)
library(data.table)
   
df <- data.table(row.num = 1:10
                 , V1 = rnorm(10,10,5)
                 , V2 = rnorm(10,100,30))

fwrite(df, file = "vroom_test.csv", sep = ";", dec = ",")

fread(input = "vroom_test.csv", sep = ";", dec = ",")

vroom(file = "vroom_test.csv", delim = ";")
# passing a custom locale with decimal_mark = "," does allow that
vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))

Solution

  • As already mentioned in the comments, the solution is straightforward: pass a locale() object to the locale argument of the vroom call and set decimal_mark (and, if needed, grouping_mark) there. The other options of locale() are listed in its documentation.

    vroom(file = "vroom_test.csv", delim = ";", locale = locale(grouping_mark = ".", decimal_mark = ",", encoding = "UTF-8"))
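To check that the locale is actually applied, a minimal sketch along the following lines may help. It assumes a small hand-made data frame and a temporary file (both hypothetical, not part of the original question), writes the file with base R using ; and a decimal comma, and then confirms that vroom returns a numeric column.

```r
library(vroom)

# Hypothetical example data; the values are chosen so they are easy to verify.
expected <- data.frame(id = 1:3, value = c(1.5, 2.25, 10.75))

# Write a German-style CSV: semicolon as delimiter, comma as decimal mark.
tmp <- tempfile(fileext = ".csv")
write.table(expected, file = tmp, sep = ";", dec = ",",
            row.names = FALSE, quote = FALSE)

# Read it back, telling vroom about the decimal and grouping marks.
parsed <- vroom(
  file = tmp,
  delim = ";",
  locale = locale(decimal_mark = ",", grouping_mark = ".")
)

# The value column should now be numeric rather than character.
str(parsed$value)

unlink(tmp)
```

If column type guessing is a concern, the same locale can be combined with an explicit col_types specification, so that the column types do not depend on what vroom infers from the data.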