
Specifying column names in a data.frame changes spaces to "."


Let's say I have a data.frame, like so:

x <- c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10)
df <- data.frame("Label 1"=x,"Label 2"=rnorm(100))

head(df,3)

returns:

  Label.1    Label.2
1       1  1.9825458
2       2 -0.4515584
3       3  0.6397516

How do I get R to stop automagically replacing the space with a period in the column name? I.e., I want "Label 1" instead of "Label.1".


Solution

  • You don't.

    With the space you want, the name would not satisfy the requirements for an identifier that come into play when you write df$column.1; that syntax cannot cope with a space. See the make.names() function for details, or this example:

    > make.names(c("Foo Bar", "tic tac"))
    [1] "Foo.Bar" "tic.tac"  
    >                                              
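
    To see how this applies to the data.frame from the question, here is a small sketch. The extra names "2nd col" and "a-b" are made up for illustration and are not from the question; the point is only that data.frame() with its default check.names=TRUE yields the same names make.names() produces.

    ```r
    # Names that are not syntactically valid R identifiers.
    # "2nd col" and "a-b" are illustrative additions, not from the question.
    raw <- c("Label 1", "Label 2", "2nd col", "a-b")

    # make.names() turns invalid characters into "." and prepends "X"
    # when a name would otherwise start with a digit.
    fixed <- make.names(raw)
    print(fixed)
    # [1] "Label.1"  "Label.2"  "X2nd.col" "a.b"

    # data.frame() applies the same conversion by default (check.names = TRUE).
    df <- data.frame("Label 1" = 1:3, "Label 2" = 4:6)
    print(names(df))
    # [1] "Label.1" "Label.2"
    ```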
    

    Edit eleven years later: The answer still stands in that R prefers column names to be valid variable names. But R is flexible: if you insist, you can use the other form, but you then need to quote the not-otherwise-valid-within-the-language column names explicitly:

    > x <- c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10)
    > df <- data.frame("Label 1"=x,"Label 2"=rnorm(100), check.names=FALSE)
    > summary( df$`Label 2` )
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    -2.2719 -0.7148 -0.0971 -0.0275  0.6559  2.5820 
    > 
    

    So by saying check.names=FALSE we override the default (and sensible) check, and by wrapping the identifier in backticks we can access the column.
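
    For completeness, a short sketch of a few other ways to work with such names, assuming the same two columns as in the question. The seed and the variable names (via_dollar, via_bracket, via_index, df2) are arbitrary choices for this example.

    ```r
    set.seed(1)  # arbitrary seed, only so the example is reproducible

    # Keep the names exactly as given by switching off the check.
    df <- data.frame("Label 1" = rep(1:10, times = 10),
                     "Label 2" = rnorm(100),
                     check.names = FALSE)
    print(names(df))
    # [1] "Label 1" "Label 2"

    # Three equivalent ways to get at a column whose name contains a space:
    via_dollar  <- df$`Label 2`      # backticks quote the non-syntactic name
    via_bracket <- df[["Label 2"]]   # a plain string works inside [[ ]]
    via_index   <- df[, "Label 2"]   # ... and inside [ , ]

    # An existing data frame can also be renamed after the fact;
    # assigning to names() does not run the make.names() check.
    df2 <- data.frame(Label.1 = 1:3, Label.2 = 4:6)
    names(df2) <- c("Label 1", "Label 2")
    print(names(df2))
    # [1] "Label 1" "Label 2"
    ```

    The string-based forms ([[ ]] and [ , ]) tend to be the least awkward when the column name is held in a variable, since no backticks are needed there.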