r pdf pdftools

I have two sets of PDFs in different folders that I want to join pairwise, matching them by name, and write the output to the folder of the first PDF group.


I have two directories:

directory1<-"C:/Folder1/"
directory2<-"C:/Folder2/"

Folder 1 contains the files

"123456.pdf", "234567.pdf", "345678.pdf", "456789.pdf"

Folder 2 contains the files

"123456_Jon.pdf","234567_Mike.pdf", "345678_Bill.pdf","456789_Ralph.pdf","Random_file.pdf"

If PDFs in folders 1 and 2 share the same first 6 digits, I want to join them and create a new file in directory1 named

"123456_Join.pdf","234567_Join.pdf","345678_Join.pdf","456789_Join.pdf"

Solution

  • Suppose your filenames are stored in files_1 and files_2:

    files_1 <- c("123456.pdf", "234567.pdf", "345678.pdf", "456789.pdf")
    files_2 <- c("123456_Jon.pdf","234567_Mike.pdf", "345678_Bill.pdf","456789_Ralph.pdf","Random_file.pdf")
    
    library(qpdf)
    
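    for (file in files_1) {
      # Leading 6-digit ID of the file in folder 1
      ext_num <- sub("(^\\d{6}).*", "\\1", file)
      # Files in folder 2 whose names start with the same ID
      target  <- grepl(paste0("^", ext_num), files_2)
    
      # Skip files that have no counterpart in folder 2
      if (!any(target)) next
      
      # Use full paths for both inputs and write <ID>_Join.pdf into directory1
      pdf_combine(c(file.path(directory1, file), file.path(directory2, files_2[target])),
                  output = file.path(directory1, paste0(ext_num, "_Join.pdf")))
      
    }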
    

    This should give you the desired output: one joined PDF per matching ID, written to directory1.
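
The matching step can also be checked on its own before any PDF is written. Below is a minimal sketch, assuming every file in folder 1 has at most one counterpart in folder 2. `match_by_prefix` is a hypothetical helper name, not part of qpdf, and it only works on file names, so it can be run without any PDFs on disk. Unlike the loop above, it keeps only the first match in `files_2` for each ID.

```r
# Hypothetical helper (not part of qpdf): pair file names that share
# the same leading run of digits (6 by default).
match_by_prefix <- function(files_1, files_2, n_digits = 6) {
  pattern <- paste0("^[0-9]{", n_digits, "}")

  # Keep only names in files_1 that actually start with an ID.
  files_1 <- files_1[grepl(pattern, files_1)]
  ids_1   <- substr(files_1, 1, n_digits)

  # IDs for files_2; NA where the name does not start with an ID.
  ids_2 <- ifelse(grepl(pattern, files_2),
                  substr(files_2, 1, n_digits),
                  NA_character_)

  # Position of the first matching name in files_2, NA if there is none.
  idx  <- match(ids_1, ids_2)
  keep <- !is.na(idx)

  data.frame(id     = ids_1[keep],
             first  = files_1[keep],
             second = files_2[idx[keep]],
             output = sprintf("%s_Join.pdf", ids_1[keep]),
             stringsAsFactors = FALSE)
}

files_1 <- c("123456.pdf", "234567.pdf", "345678.pdf", "456789.pdf")
files_2 <- c("123456_Jon.pdf", "234567_Mike.pdf", "345678_Bill.pdf",
             "456789_Ralph.pdf", "Random_file.pdf")

pairs <- match_by_prefix(files_1, files_2)
print(pairs)
```

Each row of `pairs` names the two inputs and the output file for one call to `pdf_combine`. In practice the two name vectors would probably come from `list.files(directory1, pattern = "\\.pdf$")` and `list.files(directory2, pattern = "\\.pdf$")` rather than being typed by hand.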