r data.table label hmisc

Getting the labels of the column names (hmisc) and applying them to another dataframe


I have a question similar to this one.

I have a dataframe with labels which I can get out as follows:

library(data.table)
library(haven)
library(Hmisc)
library(foreign)
df <- fread("A B C
              1 2 3")
label(df$A) <- "Variable label for variable A" 
label(df$B) <- "Variable label for variable B" 
label(df$C) <- "Variable label for variable C" 

Lbl <- setNames(stack(lapply(df, label))[2:1], c("Varcode", "Variables"))

# Remove labels
df <- df |>
  zap_labels() |>
  zap_formats()

I can put them back as follows:

# set labels
for (i in seq_len(nrow(Lbl))) {
  Hmisc::label(df[[Lbl$Varcode[i]]]) <- Lbl$Variables[i]
}

This works well for data frames that have the same columns. In my scenario, however, the labels need to be added back to a data frame that no longer has the same number of columns:

df2 <- fread("A B D C E
              1 2 2 3 1")

How do I put the labels back in this case? The original solution simply does the following:

# set labels
for (i in seq_len(nrow(Lbl))) {
  Hmisc::label(df2[[Lbl$Varcode[i]]]) <- Lbl$Variables[i]
}


I assume I will have to rewrite the previous code into a matching function that only sets a label when there is a match.

How should I match the labels to the columns of the new data frame, for those columns that exist in it?


Solution

  • This is not super pretty, but you can use match() to look up which variables have labels and then update only those:

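    # For each column of df2, find its row in Lbl (NA if it has no label)
    matches <- match(names(df2), Lbl$Varcode)
    for (i in seq_along(matches)) {
      # Only label the columns that have an entry in Lbl
      if (!is.na(matches[i]))
        label(df2[[i]]) <- Lbl$Variables[matches[i]]
    }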
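
As a side note, stack() returns its ind column as a factor, and R indexes `[[` with a factor by its integer codes rather than by its printed values. That is probably why the loop in the question labels columns by position instead of by name. Below is a minimal sketch of a name-based alternative, assuming the lookup table has the same two columns as Lbl. The helper name apply_labels() is hypothetical and not part of Hmisc, and the example uses plain data frames instead of fread() to keep it short.

```r
library(Hmisc)

# Lookup table in the same shape as Lbl in the question.
Lbl <- data.frame(
  Varcode   = c("A", "B", "C"),
  Variables = paste("Variable label for variable", c("A", "B", "C")),
  stringsAsFactors = FALSE
)

# Target data with extra columns and a different column order.
df2 <- data.frame(A = 1, B = 2, D = 2, C = 3, E = 1)

# apply_labels() is a hypothetical helper, not an Hmisc function.
apply_labels <- function(data, lookup) {
  # Convert to character so that factor columns are matched by name,
  # not by their integer codes.
  codes  <- as.character(lookup$Varcode)
  labels <- as.character(lookup$Variables)
  # Only touch columns that exist in both the data and the lookup table.
  for (nm in intersect(names(data), codes)) {
    Hmisc::label(data[[nm]]) <- labels[match(nm, codes)]
  }
  data
}

df2 <- apply_labels(df2, Lbl)

# Inspect the labels of all columns.
Hmisc::label(df2)
```

Columns without an entry in the lookup table (D and E here) are left untouched, and entries in the lookup table that have no matching column are skipped.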