
Import CSV file with spaces in header using read_csv from readr


I am starting to use readr to import CSV files with read_csv. How do I deal with CSV files that contain spaces in the header names?

read_csv imports them with the spaces (and special characters) intact, which prevents me from going straight to mutate and other dplyr functions.

How do I handle this?

Thanks!


Solution

  • You could use make.names after you read in the data.

    df <- data.frame(x=NA)
    colnames(df) <- c("This col name has spaces")
    colnames(df) <- make.names(colnames(df), unique=TRUE)
    

    It will return column names with periods rather than spaces as separators.

    colnames(df)
    [1] "This.col.name.has.spaces"
    

    According to the help page, make.names takes a character vector and returns a character vector of syntactically valid names, where:

    A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number.
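
    To make those rules concrete, here is a small sketch using made-up header names (they are assumptions for illustration, not from the question). It shows what make.names does with a leading digit, a reserved word, and duplicates when unique=TRUE is set.

    ```r
    # Hypothetical header names, each chosen to trigger a different rule
    raw <- c("This col name has spaces", "2nd value", "if", "total", "total")

    fixed <- make.names(raw, unique = TRUE)
    print(fixed)
    # [1] "This.col.name.has.spaces" "X2nd.value"
    # [3] "if."                      "total"
    # [5] "total.1"
    ```

    A name starting with a digit gets an "X" prefix, a reserved word gets a trailing period, and the second "total" gets a numeric suffix so the names stay unique.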

    EDIT: Including an example with special characters.

    df <- data.frame(x=NA)
    colnames(df) <- c("Higher than 80(°F)")
    colnames(df) <- make.names(colnames(df), unique=TRUE)
    
    colnames(df)
    [1] "Higher.than.80..F."
    

    As you can see, make.names replaces 'illegal' characters with periods, which prevents syntax errors when you refer to a column by name directly.

    If you want to collapse repeated periods into a single one, add:

    colnames(df) <- gsub('(\\.)\\1+', '\\1', colnames(df))
    colnames(df)
    [1] "Higher.than.80.F."
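
    Putting this together with read_csv, a minimal sketch might look like the following. The file contents and column names are made up for illustration, and it assumes the readr package is installed. The name_repair argument in the second call is, as far as I know, only available in readr 2.0 or later, so treat that part as an assumption about your installed version.

    ```r
    library(readr)

    # Write a small hypothetical CSV to a temporary file so the example is self-contained
    path <- tempfile(fileext = ".csv")
    writeLines(c("First Name,Temp (F)", "Ann,81", "Bob,75"), path)

    # Option 1: read the file as-is, then repair the names afterwards
    df <- read_csv(path)
    names(df) <- make.names(names(df), unique = TRUE)
    print(names(df))

    # Option 2 (assumes readr >= 2.0): ask read_csv to produce
    # unique, syntactic names while reading
    df2 <- read_csv(path, name_repair = "universal")
    print(names(df2))
    ```

    With option 1 the columns come out as First.Name and Temp..F., so they can be used in mutate without backticks. Option 2 may name some columns slightly differently from make.names, so check names(df2) before relying on it.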