I used base R's readLines to read my list of legal cases into R. The code looks like this:
my_list <- list.files(path = "C:\\Users\\Ben Tice\\Documents\\R Stuff\\UW Job\\Cases\\data\\txts", pattern = "\\.txt$", full.names = TRUE)
cases_lines <- lapply(my_list, readLines)
I can convert individual cases successfully like this:
df_convert <- data.frame(line=1:length(cases_lines[[1]]), text=cases_lines[[1]])
But I would like to be able to use data.frame on all 75 cases without having to convert each one separately.
I tried using the lapply function as well as for loops, but I cannot get either of them to work. For example,
df_convert2 <- lapply(cases_lines, data.frame(line=1:length(cases_lines[[i]]), text=cases_lines[[i]]))
fails with the following error message: "Error in cases_lines[[i]] : recursive indexing failed at level 2."
Ultimately, I need a list of the cases as data frames, so I can iterate through them with stringr functions to look for character patterns.
Since you have all of your case files in a list, you can iterate over the list, convert each case file into a data frame, and store the data frames in another list. Maybe something like this can help:
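cases_df <- vector("list", length(cases_lines))  # preallocate one slot per case
for (i in seq_along(cases_lines)) {
  # one data frame per case: line numbers alongside the text of each line
  cases_df[[i]] <- data.frame(line = seq_along(cases_lines[[i]]), text = cases_lines[[i]])
}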
This should produce a list of 75 data frames, one per case file.
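The lapply attempt in the question most likely failed because lapply expects a function as its second argument, whereas the data.frame(...) call there is evaluated once, up front, using whatever i happens to be at that moment. A minimal sketch of the lapply equivalent of the loop follows, assuming each element of cases_lines is a character vector. The toy cases_lines and the search for "court" are made up for illustration, and grepl is used as a base R stand-in where stringr::str_detect would play the same role.

```r
# Stand-in for cases_lines: a list of character vectors, one per case file.
# (Hypothetical toy data; the real list comes from lapply(my_list, readLines).)
cases_lines <- list(
  c("Smith v. Jones", "The court finds for the plaintiff."),
  c("Doe v. Roe", "Appeal dismissed.", "Costs awarded.")
)

# Pass lapply a function; it is called once per case with that case's lines.
cases_df <- lapply(cases_lines, function(case) {
  data.frame(line = seq_along(case), text = case, stringsAsFactors = FALSE)
})

# Example of iterating over the result: keep the rows that mention "court".
court_hits <- lapply(cases_df, function(df) df[grepl("court", df$text), ])
```

Either approach gives the same list of data frames. The lapply form simply avoids managing the index and the result list by hand.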