I am quite new to R. I have few text (.txt) files in a folder that have been converted from PDF with page break character (#12). I need to produce a data frame by reading these text files in R with condition that one row in R represents one PDF page. It means that every time there is a page break (\f), it will only then create a new row.
The problem is when the text file gets load into R, every new line became a new row and I do not want this. Please assist me on this. Thanks!
Some methods that I have tried are read.table and readLines.
Does something like this work?
library(tidyverse)
read_file("mytxt.txt") %>%
str_split("␌") %>%
unlist() %>%
as_tibble_col("data")
It just reads the file as raw text then splits afterwards. You may have to replace the splitting character with something else.