I have trouble finding ways to import files efficiently that are formatted European style (a ,
comma for decimals and .
dot for thousands 12.300,35
) from google sheets via the library(googlesheets)
package. R will automatically read the commas as delimiters and not as decimal signs.
When downloading the sheet (via R), it automatically chooses a comma separator. Can I customize that?
Here's my data via a shared google sheets link.
My code:
library(googlesheets)
# Authenticate with your google sheets
sheets <- gs_ls()
# Import
spreadsheet <- gs_title("sample_data")
# Read dataset
sample <- gs_read(spreadsheet, ws = 1)
With the wrong results:
jaren hbo_procent wo_procent
<int> <dbl> <dbl>
1 2006 66.0 9.00
2 2007 67.0 97.0
3 2008 7.00 104
4 2009 73.0 112
5 2010 75.0 119
6 2011 77.0 128
7 2012 78.0 137
8 2013 76.0 142
9 2014 74.0 148
10 2015 75.0 163
11 2016 75.0 181
12 2017 78.0 195
googlesheets has an interface like readr and specifically it offers the locale
argument. You can read more about it here: http://readr.tidyverse.org/articles/locales.html
In this case, we need to specify the decimal mark.
I made my own copy of OP's Sheet. I'm not sure the original is correctly published to the web, which is necessary to make something world readable via the Sheets v3 API. This is (confusingly) different from making a Sheet "public on the web".
library(googlesheets)
## I made a copy of OP's original Sheet
spreadsheet <- gs_title("Copy of sample_data")
#> Sheet successfully identified: "Copy of sample_data"
# Read dataset
sample <- gs_read(
spreadsheet,
ws = 1,
locale = readr::locale(decimal_mark = ",")
)
#> Accessing worksheet titled 'Blad1'.
#> Parsed with column specification:
#> cols(
#> jaren = col_double(),
#> hbo_procent = col_double(),
#> wo_procent = col_double()
#> )
sample
#> # A tibble: 12 x 3
#> jaren hbo_procent wo_procent
#> <dbl> <dbl> <dbl>
#> 1 2006. 6.60 9.00
#> 2 2007. 6.70 9.70
#> 3 2008. 7.00 10.4
#> 4 2009. 7.30 11.2
#> 5 2010. 7.50 11.9
#> 6 2011. 7.70 12.8
#> 7 2012. 7.80 13.7
#> 8 2013. 7.60 14.2
#> 9 2014. 7.40 14.8
#> 10 2015. 7.50 16.3
#> 11 2016. 7.50 18.1
#> 12 2017. 7.80 19.5
Alternatively, to make a Sheet that is easier for more people to "just read", you could do File > Spreadsheet settings and choose, e.g., the United States as the locale for the Sheet itself.