r, dataframe, r-rownames

How to edit row names in a data frame built from a list of lists?


I'm new to R (I've tried searching; sorry if this has been asked elsewhere!) and I need some help, please. I'm trying to edit the row names of a data.frame. I start with several VCF files, create a list of lists using lapply(), flatten it using unlist(), and combine the extracted indicators into a data frame. I end up with the following row names:

> row.names(mydataframe)
  [1] "1_S1_annotated_filtered.vcf.gz1"   "1_S1_annotated_filtered.vcf.gz2"   "1_S1_annotated_filtered.vcf.gz3"   "1_S1_annotated_filtered.vcf.gz6"  
  [5] "1_S1_annotated_filtered.vcf.gz7"   "1_S1_annotated_filtered.vcf.gz8"   
... 
[457] "6_S6_annotated_filtered.vcf.gz877" "6_S6_annotated_filtered.vcf.gz888" "6_S6_annotated_filtered.vcf.gz907" "7_S7_annotated_filtered.vcf.gz309"
[461] "7_S7_annotated_filtered.vcf.gz354" "7_S7_annotated_filtered.vcf.gz477" "7_S7_annotated_filtered.vcf.gz485" "7_S7_annotated_filtered.vcf.gz537"
[465] "7_S7_annotated_filtered.vcf.gz569" "7_S7_annotated_filtered.vcf.gz575" "7_S7_annotated_filtered.vcf.gz721" "7_S7_annotated_filtered.vcf.gz871"
[469] "7_S7_annotated_filtered.vcf.gz892" "8_S8_annotated_filtered.vcf.gz136" "8_S8_annotated_filtered.vcf.gz191" "8_S8_annotated_filtered.vcf.gz967"

whereas what I need is

> row.names(mydataframe)
[1] "S1"   "S1"   "S1"   "S1"  
[5] "S1"   "S1"   "S1"   "S1"
....
[469] "S7" "S8" "S8" "S8"

Any advice? Thanks in advance!


Solution

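  • I would use str_extract() from stringr to pull the sample ID out of each row name. Note that row names in a data.frame must be unique, so repeated values such as "S1" cannot be assigned as row names; store them in a column instead (the column name "sample" here is just an example):

     library(stringr)
     # "S[0-9]+" also matches multi-digit IDs such as "S10"
     mydataframe$sample <- str_extract(row.names(mydataframe), "S[0-9]+")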
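
If you would rather not load a package, the same extraction can be sketched in base R with sub(). This is only a minimal illustration: the toy data frame, the "value" column, and the "sample" column name are made up for the example, and the pattern assumes every row name looks like "<number>_S<number>_..." as in the question.

```r
# Toy stand-in for the real data; row names are shaped like those in the question.
mydataframe <- data.frame(
  value = c(0.1, 0.2, 0.3, 0.4),
  row.names = c("1_S1_annotated_filtered.vcf.gz1",
                "1_S1_annotated_filtered.vcf.gz2",
                "7_S7_annotated_filtered.vcf.gz309",
                "10_S10_annotated_filtered.vcf.gz5")
)

# Capture the "S<digits>" token between the first two underscores (base R only).
sample_id <- sub("^[0-9]+_(S[0-9]+)_.*$", "\\1", row.names(mydataframe))

# data.frame row names cannot contain duplicates, so the IDs go into a column
# (hypothetical name "sample") rather than back into row.names().
mydataframe$sample <- sample_id

print(mydataframe$sample)
```

From there you can group or subset by mydataframe$sample. If the original row names are no longer needed, row.names(mydataframe) <- NULL resets them to plain sequence numbers.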