For example, I have a tibble like this:
test <- tibble(a = 10, b = "a")
With this input, I want a function that returns "dc", which stands for double and character.
The reason I ask is that I want to read in lots of files, and I don't want to let the read_table function decide the type of each column. I can specify the string manually, but since the actual data I want to import has 50 columns, that is quite hard to do by hand.
Thanks.
While the aforementioned test %>% summarise_all(class) will give you the class names of the columns, it does so in a long form, whereas in this problem you need to convert them to single-character codes that mean something to read_table's col_types argument. To map from class names to single-letter codes you can use a lookup table; here's an (incomplete) example with dput:
structure(list(col_type = c("character", "integer", "numeric",
"double", "logical"), code = c("c", "i", "n", "d", "l")), .Names = c("col_type",
"code"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-5L))
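The same lookup table can also be written out directly instead of pasted from dput output. This is a minimal sketch, assuming the tibble package is available and using the name types that the pipeline relies on. Note that class() reports double columns as "numeric", so with this mapping a plain double column is matched by the "numeric" row and gets "n"; change that code to "d" if col_double is what you want.

```r
library(tibble)

# Same (incomplete) mapping as the dput output, assigned to the name `types`.
types <- tibble(
  col_type = c("character", "integer", "numeric", "double", "logical"),
  code     = c("c", "i", "n", "d", "l")
)
```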
Now, using this table (I'll call it types), we can finally transform the column types into a single string:
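library(dplyr)
library(tidyr)
library(stringr)
test %>%
  # one row holding the class name of every column
  summarise_all(class) %>%
  # reshape to one row per column: col_name, col_type
  gather(col_name, col_type) %>%
  # look up the single-letter code for each class name
  left_join(types, by = "col_type") %>%
  # concatenate the codes into one string
  summarise(col_types = str_c(code, collapse = "")) %>%
  # pull the string out of the tibble
  unlist(use.names = FALSE)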
This gets the class of each column (summarise_all), then gathers them into a tibble matching each column name with its column type (gather). The left_join matches on the col_type column and gives the short one-character code for each column name. We don't do anything with the column names, so it's fine to just concatenate the codes with summarise and str_c. Finally, unlist pulls the string out of the tibble.
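Since the question asks for a function, the same idea can be wrapped up in base R. This is only a sketch: col_type_string is a hypothetical helper name, the mapping is an assumption that covers just four classes, and it maps "numeric" to "d" because class() reports double columns as "numeric".

```r
# Hypothetical helper (not part of readr): build a col_types string
# from the column classes of a data frame or tibble.
col_type_string <- function(df) {
  # Assumed mapping from R classes to readr's single-letter codes.
  # class() reports double columns as "numeric", so "numeric" maps to "d".
  codes <- c(character = "c", integer = "i", numeric = "d", logical = "l")
  # Take only the first class of each column, giving one name per column.
  classes <- vapply(df, function(x) class(x)[1], character(1))
  # Fail loudly on classes the mapping does not cover.
  unknown <- setdiff(classes, names(codes))
  if (length(unknown) > 0) {
    stop("No code defined for class(es): ", paste(unknown, collapse = ", "))
  }
  paste(codes[classes], collapse = "")
}

example <- data.frame(a = 10, b = "a", stringsAsFactors = FALSE)
spec <- col_type_string(example)
print(spec)
```

Called on the test tibble from the question, this should give "dc". The resulting string can then be passed as the col_types argument of read_table for each file.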