
Extracting a string with str_extract in an R script


I have files which have this structure: xxx_xxx_xxx_xxx_class0.png and xxx_xxx_xxx_xxx_class1.png. I want to select only class0 or class1 images.

I did:

## List images in path
images_names <- list.files("/img/train", pattern = "\\.png$", recursive = TRUE)
if (labelsExist) {
  ## Select only class0 or class1 images
  classLb <- str_extract(images_names, "^(class0|class1)")
  # Set class0 == 0 and class1 == 1
  key <- c("class0" = 0, "class1" = 1)
  y <- key[classLb]
}

When I run head(classLb), the output contains only NA. Any suggestions?


Solution

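  • Use sub with the regex pattern .*_([a-z0-9]+)\\.png$. It matches everything up to the last _, captures the word that follows (digits included, so class0 and class1 both match), and then matches the literal .png extension. Replacing the whole match with the captured group leaves only the class label. The original str_extract call returns NA because ^ anchors the match at the start of the file name, where class0 or class1 never appears.

    sub('.*_([a-z0-9]+)\\.png$', '\\1', images_names)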
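To check the idea end to end, here is a minimal sketch in base R. The file names are made up for illustration (they only follow the xxx_xxx_xxx_xxx_classN.png shape from the question), and the character class is widened to [a-z0-9] on the assumption that the label always ends in a digit.

```r
# Hypothetical file names shaped like the ones in the question
images_names <- c("a_b_c_d_class0.png", "sub/e_f_g_h_class1.png")

# Drop everything up to the last underscore and the trailing .png,
# keeping only the captured label
classLb <- sub(".*_([a-z0-9]+)\\.png$", "\\1", images_names)

# Map the labels to 0/1 as in the question; unname() drops the
# "class0"/"class1" names that indexing a named vector carries over
key <- c("class0" = 0, "class1" = 1)
y <- unname(key[classLb])

print(classLb)
print(y)
```

If you prefer to keep stringr, dropping the ^ anchor from the original call, for example str_extract(images_names, "class0|class1"), should give the same labels, since the pattern is then free to match anywhere in the file name.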