r, label, hmisc

How to use variable labels in an R data frame


I am trying to assign variable labels, use them, and then hand my work to a novice R programmer who is experienced with SPSS. When she uses the data, she will want to make tables. She might not remember what h1 is, but she will know what "Heights in feet" is.

I have assigned the labels. Now how do I use them?

Clarification: Once I have the labels, I want to use the labels the way I would use column names. So in RStudio if I type "heights$", I want to see "Heights in feet" as an option. But I do not want to lose the column name.

library(Hmisc) # variable labels
heights = data.frame(h1 = c(4,5,6, 4), h2 = c(48, 60, 72, 48))
label(heights$h1) = "Heights in feet"
label(heights$h2) = "Heights in inches"
heights

table(heights[[`Heights in feet`]]) # Not correct
table(heights[`Heights in feet`]) # Not correct
table(heights$`Heights in feet`) # Not correct

Ideas much appreciated.


Solution

  • Unfortunately, labels are not supported by the basic indexing operations. The closest basic subsetting approach to what you have is

    table(heights[, label(heights)=="Heights in feet"])
    

    If this is a common operation, you could overload an operator for data.frame so that it looks columns up by label. For example

    # Method for `%%` on data frames: return the column whose label equals lbl
    `%%.data.frame` <- function(x, lbl) {
      x[, label(x) == lbl]
    }
    
    table(heights%%"Heights in feet")
    

    You could even make an assignment version

    # Generic plus data.frame method so that `x %% lbl <- value` assigns by label
    `%%<-` <- function(x, ...)  UseMethod("%%<-")
    `%%<-.data.frame` <- function(x, lbl, value) {
      x[, label(x) == lbl] <- value
      x
    }
    heights %% "Heights in feet" <- heights %% "Heights in feet" + 1
    

    Of course, this is very non-standard, so I probably wouldn't recommend it; I am just pointing out the possibility.
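
If overloading an operator feels too surprising, an ordinary function can do the same lookup. The following is only a sketch: `by_label` is a hypothetical helper name, not part of Hmisc, and it assumes the labels live in each column's "label" attribute, which is where `label(x) <- ...` from Hmisc stores them. To keep the example self-contained, the attribute is set directly with base R here. It stops with an error when the label is missing or ambiguous, and the label is passed to `table()` through `dnn` so that it shows up in the printed table.

```r
# Labels are stored in each column's "label" attribute (the attribute that
# Hmisc's label<- sets); base R is used here so the sketch runs on its own.
heights <- data.frame(h1 = c(4, 5, 6, 4), h2 = c(48, 60, 72, 48))
attr(heights$h1, "label") <- "Heights in feet"
attr(heights$h2, "label") <- "Heights in inches"

# Hypothetical helper: return the single column whose label equals `lbl`.
by_label <- function(df, lbl) {
  labels <- vapply(df, function(col) {
    l <- attr(col, "label", exact = TRUE)
    if (is.null(l)) NA_character_ else as.character(l)[1]
  }, character(1))
  hit <- which(labels == lbl)  # unlabelled columns (NA) are ignored by which()
  if (length(hit) != 1L) {
    stop("expected exactly one column labelled '", lbl, "', found ", length(hit))
  }
  df[[hit]]
}

# Use the label as the dimension name so it appears above the table.
feet <- table(by_label(heights, "Heights in feet"), dnn = "Heights in feet")
print(feet)
```

The column names h1 and h2 are untouched, so `heights$h1` keeps working alongside the label-based lookup.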