r string prefix suffix r-rownames

Add a prefix and a suffix only to some of the rownames


I have this dataframe:

structure(list(Treatnent.state = c("PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)", 
"PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)"
), timepoint = c(-6, 0, 0, 0, 0, -1, 0, -3, -2, 0, 0, -1, 0, 
0), Patient = c(115, 148, 208, 208, 272, 39, 42, 422, 62, 208, 
208, 39, 42, 42)), class = "data.frame", row.names = c("115-031814          ", 
"148-6-5-14_S9       ", "208-3-11-15_S13     ", "208-9-10-14_S11     ", 
"272-121914          ", "39-3-31-14_S15      ", "42-10-17-14_S3      ", 
"422-092815          ", "62-10-2-13_S6       ", "MGH208_031115-1.bam ", 
"MGH208_031115-2.bam ", "MGH39_033114.bam    ", "MGH42_101714.bam    ", 
"MGH42_101714_1.bam  "))

with rownames:

 [1] "115-031814          " "148-6-5-14_S9       " "208-3-11-15_S13     " "208-9-10-14_S11     "
 [5] "272-121914          " "39-3-31-14_S15      " "42-10-17-14_S3      " "422-092815          "
 [9] "62-10-2-13_S6       " "MGH208_031115-1.bam " "MGH208_031115-2.bam " "MGH39_033114.bam    "
[13] "MGH42_101714.bam    " "MGH42_101714_1.bam  "

I want to add a prefix "X" and suffix ".bam", only for the rownames that don't start with MGH.

So for example: The rowname of the first row, 115-031814, would become X115-031814.bam, and the rowname MGH208_031115-1.bam would not change at all.


Solution

  • Use grepl to check whether each row name starts with 'MGH', then ifelse to paste "X" in front and ".bam" behind when it does not. I used trimws because your row names have trailing whitespace.

    ifelse(!grepl("^MGH" , rownames(df)),
           paste0("X", trimws(rownames(df)), ".bam"),
           trimws(rownames(df)))
    

    output

     [1] "X115-031814.bam"      "X148-6-5-14_S9.bam"   "X208-3-11-15_S13.bam"
     [4] "X208-9-10-14_S11.bam" "X272-121914.bam"      "X39-3-31-14_S15.bam" 
     [7] "X42-10-17-14_S3.bam"  "X422-092815.bam"      "X62-10-2-13_S6.bam"  
    [10] "MGH208_031115-1.bam"  "MGH208_031115-2.bam"  "MGH39_033114.bam"    
    [13] "MGH42_101714.bam"     "MGH42_101714_1.bam"
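
The ifelse call above only returns the new names; it does not change the data frame. A minimal sketch of writing them back, assuming the result should replace the existing row names: the three-row df below is a cut-down stand-in for the data in the question, and rn and needs_fix are illustrative names. It uses startsWith (base R 3.3.0 or later) in place of grepl, which performs the same prefix test without a regular expression.

```r
# Cut-down stand-in for the data frame in the question (three rows, one column).
df <- data.frame(
  Patient = c(115, 148, 208),
  row.names = c("115-031814          ",
                "148-6-5-14_S9       ",
                "MGH208_031115-1.bam ")
)

# Strip the padding once, so every later step works on clean names.
rn <- trimws(rownames(df))

# TRUE for the names that do not start with "MGH".
needs_fix <- !startsWith(rn, "MGH")

# Add the prefix and suffix only to those names.
rn[needs_fix] <- paste0("X", rn[needs_fix], ".bam")

# Write the result back; row names must remain unique for this to succeed.
rownames(df) <- rn

rownames(df)
```

With the stand-in data this leaves the MGH row untouched apart from the trimmed whitespace, and the other two rows gain the "X" prefix and ".bam" suffix.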