How to use a column as row names in an R data frame without the values being modified


I am trying to use the values of one column as the row names of a data frame in R. However, whenever I do this, an X is added in front of names that start with a number, and hyphens (-) are replaced by dots (.). Is there a way to keep the names exactly as they are? These changes cause problems when I merge two data frames.

Here is an example of the code and the data frame:

Before

dput(tail(Probe.annotation.table, 10))
structure(list(ID = c("100306034_TGI_at", "100311622_TGI_at", 
"merck2-rsta-239398_at", "merck2-NM_000839-381.A_at", "merck2-NM_004103-360.B_at", 
"merck2-HCV_at", "merck2-HCV_revcomp_at", "merck2-HIV_at", "merck2-HIV_revcomp_at", 
"merck2-flu_at"), EntrezGeneID = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), GeneSymbol = c("", "", "", "", "", "", "", "", "", ""), GB_ACC = c("", 
"", "", "", "", "", "", "", "", ""), miRNA_ID = c("", "", "", 
"", "", "", "", "", "", ""), ORF = c("", "", "", "", "", "", 
"", "", "", ""), PROBE_DERIVED_FROM_TRANSCRIPT = c("", "", "", 
"", "", "", "", "", "", ""), SPOT_ID = c("RosettaGeneID:BX647964", 
"RosettaGeneID:AK124699", "RosettaGeneID:rsta-239398", "RosettaGeneID:NM_000839-381.A", 
"RosettaGeneID:NM_004103-360.B", "RosettaGeneID:HCV", "RosettaGeneID:HCV_revcomp", 
"RosettaGeneID:HIV", "RosettaGeneID:HIV_revcomp", "RosettaGeneID:flu"
), RosettaGeneModelID = c("", "", "", "", "", "", "", "", "", 
"")), row.names = 52369:52378, class = "data.frame")

Code

rownames(Probe.annotation.table) <- make.names(Probe.annotation.table$ID, unique=TRUE)

After

dput(tail(Probe.annotation.table, 10))
structure(list(ID = c("100306034_TGI_at", "100311622_TGI_at", 
"merck2-rsta-239398_at", "merck2-NM_000839-381.A_at", "merck2-NM_004103-360.B_at", 
"merck2-HCV_at", "merck2-HCV_revcomp_at", "merck2-HIV_at", "merck2-HIV_revcomp_at", 
"merck2-flu_at"), EntrezGeneID = c(NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), GeneSymbol = c("", "", "", "", "", "", "", "", "", ""), GB_ACC = c("", 
"", "", "", "", "", "", "", "", ""), miRNA_ID = c("", "", "", 
"", "", "", "", "", "", ""), ORF = c("", "", "", "", "", "", 
"", "", "", ""), PROBE_DERIVED_FROM_TRANSCRIPT = c("", "", "", 
"", "", "", "", "", "", ""), SPOT_ID = c("RosettaGeneID:BX647964", 
"RosettaGeneID:AK124699", "RosettaGeneID:rsta-239398", "RosettaGeneID:NM_000839-381.A", 
"RosettaGeneID:NM_004103-360.B", "RosettaGeneID:HCV", "RosettaGeneID:HCV_revcomp", 
"RosettaGeneID:HIV", "RosettaGeneID:HIV_revcomp", "RosettaGeneID:flu"
), RosettaGeneModelID = c("", "", "", "", "", "", "", "", "", 
"")), row.names = c("X100306034_TGI_at", "X100311622_TGI_at", 
"merck2.rsta.239398_at", "merck2.NM_000839.381.A_at", "merck2.NM_004103.360.B_at", 
"merck2.HCV_at", "merck2.HCV_revcomp_at", "merck2.HIV_at", "merck2.HIV_revcomp_at", 
"merck2.flu_at"), class = "data.frame")

Solution

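  • The X prefix and the dots come from make.names(), which turns each ID into a syntactically valid R name. Row names do not have to be syntactically valid, so leave make.names() out. column_to_rownames() (from the tibble package, not dplyr) uses the ID values as they are and moves the ID column into the row names:

    library(tibble)  # as_tibble(), column_to_rownames()
    library(dplyr)   # %>% pipe

    newdata <- Probe.annotation.table %>%
      as_tibble() %>%                  # drops the existing row names
      column_to_rownames(var = "ID")   # ID values become row names, unchanged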
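
A base R alternative, sketched on a small stand-in data frame rather than the real Probe.annotation.table (the three rows below reuse IDs from the question; everything else is illustrative). It assumes the ID column has no missing or duplicated values, since row names must be unique. The first part only reproduces what make.names() does to the IDs; the second part assigns the column directly, which leaves the values untouched.

```r
# Stand-in for Probe.annotation.table (assumption: a tiny subset of the question's data).
df <- data.frame(
  ID = c("100306034_TGI_at", "merck2-rsta-239398_at", "merck2-HCV_at"),
  SPOT_ID = c("RosettaGeneID:BX647964", "RosettaGeneID:rsta-239398", "RosettaGeneID:HCV"),
  stringsAsFactors = FALSE
)

# What the code in the question does: make.names() prefixes "X" to names
# that start with a digit and replaces "-" with ".".
mangled <- make.names(df$ID, unique = TRUE)
print(mangled)

# Row names must be unique and non-missing, so check before assigning.
stopifnot(!anyNA(df$ID), anyDuplicated(df$ID) == 0)

# Assign the column directly: no make.names(), so the IDs stay as they are.
rownames(df) <- df$ID
print(rownames(df))
```

If the row names are only needed for merging, merge() with by = "ID" on both data frames works on the column itself and avoids row names altogether.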