I am trying to read this googledoc via googlesheets4
.
The file contains 3 columns:
id
, as characterstart_date
/ end_date
: this columns contain both dates (in the format 2020-01-31
) and text strings (namely strings being in_stock
and in_use
)I made this googlesheet available to read with anybody with the link.
I read the dataframes as follows:
suppressPackageStartupMessages({
library(tidyverse)
library(lubridate)
library(readxl)
library(googlesheets4)})
df <- read_sheet('1rS_Zw4Qc8S4oV5-u_YZJACNBqu3VDQUMf9N_dGrLUo8',sheet = 'test_import')
While it seems initially fine, when I check how it was imported, here's the result:
> head(df)
# A tibble: 6 x 3
id start_date end_date
<dbl> <list> <list>
1 1 <chr [1]> <chr [1]>
2 2 <chr [1]> <chr [1]>
3 3 <chr [1]> <chr [1]>
4 4 <chr [1]> <chr [1]>
5 5 <chr [1]> <chr [1]>
6 6 <chr [1]> <chr [1]>
While RStudio shows correctly the dataframe in the preview, the columns start_date
and end_date
are loaded as lists (which never happened to me).
When I try to convert the columns into dates by doing:
df %>%
mutate(start_date = as.Date(start_date))
I get the error:
Error in as.Date.default(start_date) :
do not know how to convert 'start_date' to class “Date”
How can I correctly import this while preserving start_date
and end_date
as dates columns in order to perform operations with them?
Thanks for the help.
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] googlesheets4_0.2.0.9000 readxl_1.3.1 lubridate_1.7.4 forcats_0.4.0 stringr_1.4.0 dplyr_0.8.3
[7] purrr_0.3.4 readr_1.3.1 tidyr_1.0.0 tibble_3.0.1 ggplot2_3.2.1 tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 cellranger_1.1.0 pillar_1.4.4 compiler_3.6.1 tools_3.6.1 packrat_0.5.0 googledrive_1.0.1 jsonlite_1.6.1
[9] lifecycle_0.2.0 nlme_3.1-141 gtable_0.3.0 lattice_0.20-38 pkgconfig_2.0.3 rlang_0.4.6 cli_2.0.2 rstudioapi_0.10
[17] haven_2.1.1 withr_2.2.0 xml2_1.2.2 httr_1.4.1 generics_0.0.2 vctrs_0.3.1 hms_0.5.1 grid_3.6.1
[25] tidyselect_0.2.5 glue_1.4.1 R6_2.4.1 fansi_0.4.1 modelr_0.1.5 magrittr_1.5 backports_1.1.4 scales_1.0.0
[33] ellipsis_0.3.1 rvest_0.3.4 assertthat_0.2.1 colorspace_1.4-1 stringi_1.4.6 lazyeval_0.2.2 munsell_0.5.0 broom_0.5.2
[41] crayon_1.3.4
try to do it this way (dplyr 1.0.0).
You have the text in_stock
and in_use
in the date column. Therefore, an error appears.
library(tidyverse)
df <- read_sheet('1rS_Zw4Qc8S4oV5-u_YZJACNBqu3VDQUMf9N_dGrLUo8',sheet = 'test_import', col_types = "c")
df %>%
mutate(across(ends_with("_date"), ymd))
or old dplyr
df %>%
mutate_at(vars(ends_with("_date")), ymd)