r dataframe pdftools tabulizer

How to remove column labels if the name of the label starts with "G" in R programming


How can I remove the columns whose names start with "G"?

Code:

library(pdftools)
library(data.table)
library(tabulizer)
pdf_file <- "new.pdf"

out2 <- extract_tables(pdf_file, pages = c(89), output = "data.frame")
out2 <- as.data.table(out2)
colnames(out2)

Actual output:

"Group.1" "Day.7"   "Day.8" "Day.9"
"Group.2" "Day.10" "Day.11" "Day.12"

Expected Output:

"Day.7"   "Day.8" "Day.9"
"Day.10" "Day.11" "Day.12"

Also, please suggest any other R packages (besides pdftools and tabulizer) that can extract data tables from a PDF.


Solution

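  • This will drop columns whose names start with "G". Since out2 is a data.table, select the columns to keep by name and pass with = FALSE; a bare logical vector in j would be evaluated and returned as a vector instead of selecting columns:

    keep <- names(out2)[!startsWith(names(out2), "G")]
    result <- out2[, keep, with = FALSE]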
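Since extract_tables() returns a list with one element per table found, the same filter may need to be applied to each table separately. Below is a minimal sketch in base R, assuming each element is a plain data.frame; the `tables` list is made-up stand-in data and `drop_prefixed` is a hypothetical helper name, not part of tabulizer or data.table.

```r
# Made-up stand-in for the list returned by extract_tables(..., output = "data.frame")
tables <- list(
  data.frame(Group.1 = c("A", "B"), Day.7 = 1:2, Day.8 = 3:4, Day.9 = 5:6),
  data.frame(Group.2 = c("A", "B"), Day.10 = 1:2, Day.11 = 3:4, Day.12 = 5:6)
)

# Hypothetical helper: keep only the columns whose names do not start with `prefix`.
# drop = FALSE keeps the result a data.frame even if a single column remains.
drop_prefixed <- function(df, prefix = "G") {
  df[, !startsWith(names(df), prefix), drop = FALSE]
}

# Apply the filter to every table in the list
cleaned <- lapply(tables, drop_prefixed)

# Inspect the remaining column names of each table
lapply(cleaned, names)
```

This form works on data.frames only; for a data.table, select the columns by name with with = FALSE instead of passing a logical vector in j.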