I am starting to use readr to import CSV files with read_csv. How do I deal with CSV files containing spaces in the header names? read_csv imports them with the spaces (and special characters), which prevents me from going straight to mutate and other dplyr functions.
How do I handle this?
Thanks!
You could use make.names after you read in the data.
df <- data.frame(x=NA)
colnames(df) <- c("This col name has spaces")
colnames(df) <- make.names(colnames(df), unique=TRUE)
It will return column names with periods rather than spaces as separators.
colnames(df)
[1] "This.col.name.has.spaces"
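The same step can be applied right after import. Here is a minimal sketch, assuming some made-up inline CSV text and using base R's read.csv with check.names = FALSE as a stand-in for read_csv (so it runs without readr installed); the column names and values are hypothetical.

```r
# Hypothetical CSV content with spaces in the header.
csv_text <- "Sepal Length,Sepal Width\n5.1,3.5\n4.9,3.0"

# check.names = FALSE keeps the header untouched, so the spaces survive import.
df <- read.csv(text = csv_text, check.names = FALSE)
colnames(df)

# Sanitize the names once, straight after reading the data.
colnames(df) <- make.names(colnames(df), unique = TRUE)
colnames(df)

# The columns can now be referenced as ordinary names.
df <- transform(df, ratio = Sepal.Length / Sepal.Width)
```

The renaming line should work the same way on a data frame returned by read_csv, since it only touches the column names.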
According to the help page, make.names takes a character vector and returns a character vector of syntactically valid names:
A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number.
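To see those rules in action, here is a small sketch with a few made-up inputs (the strings are only for illustration): a duplicate, a name starting with a digit, a dot followed by a number, and a reserved word.

```r
# Hypothetical inputs chosen to hit each rule from the help page.
raw <- c("a b", "a b", "1x", ".2way", "if")

fixed <- make.names(raw, unique = TRUE)
fixed
# [1] "a.b"    "a.b.1"  "X1x"    "X.2way" "if."
```

Invalid characters become periods, an X is prepended where the name would otherwise start illegally, reserved words get a trailing period, and unique = TRUE adds a numeric suffix to duplicates.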
EDIT: Including an example with special characters.
df <- data.frame(x=NA)
colnames(df) <- c("Higher than 80(°F)")
colnames(df) <- make.names(colnames(df), unique=TRUE)
colnames(df)
[1] "Higher.than.80..F."
As you can see, make.names takes 'illegal' characters and replaces them with periods, to prevent any syntax errors/issues when calling an object name directly.
If you want to collapse repeated periods into a single one, add:
colnames(df) <- gsub('(\\.)\\1+', '\\1', colnames(df))
colnames(df)
[1] "Higher.than.80.F."
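If you do this often, the steps can be bundled into one helper. This is only a sketch: clean_names is a hypothetical name of my own, not a function from readr or base R, and it differs from the code above by replacing invalid characters before calling make.names so that the trailing period can be trimmed as well. Note that it also turns underscores into periods.

```r
# Hypothetical helper: period-separated names with no repeated or trailing periods.
clean_names <- function(x) {
  # Replace every run of non-alphanumeric characters with a single period.
  x <- gsub("[^[:alnum:]]+", ".", x)
  # Drop periods left at the start or end of the name.
  x <- gsub("^\\.+|\\.+$", "", x)
  # Let make.names handle the remaining rules (leading digits,
  # reserved words, empty strings) and de-duplicate the result.
  make.names(x, unique = TRUE)
}

df <- data.frame(x = NA, y = NA)
colnames(df) <- c("This col name has spaces", "Higher than 80 (F)")
colnames(df) <- clean_names(colnames(df))
colnames(df)
# [1] "This.col.name.has.spaces" "Higher.than.80.F"
```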