I'm working on a project that requires a table I found on a website to be read into RStudio and formatted so I can create graphical representations of the data. I have been attempting to do this via the magick package using the image_read and image_display functions. I keep getting the following error when I attempt to display the image:
Error: ImageMagick was built without X11 support
and no real output in my dataframe object.
Here is what I most recently tried to get this to work:
img <- image_read("https://i2.wp.com/www.brookings.edu/wp-content/uploads/2022/01/Table-2.png?w=768&crop=0%2C0px%2C100%2C9999px&ssl=1",
density = "300")
image_display(img)
img_data <- image_data(img)
table_df <- data.frame(img_data)
table_df
This returns the following error:
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ‘c("bitmap", "rgb")’ to a data.frame
I need it to return the data from the image as a dataframe that I can then manipulate for different graphical representations.
I was able to solve this using the tesseract package in R with a PDF version of the table.
library(tesseract)
# Load the PDF file and convert to text
pdf_file <- "C:/Users/Table_Q2.pdf"
text <- tesseract::ocr(pdf_file)
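The OCR step above returns plain text, not a data frame, so one more step is needed before plotting. Here is a minimal sketch of that step, assuming the OCR output is one table row per line with cells separated by whitespace and no spaces inside a cell. The `ocr_text` string and its column names are made-up placeholders, not the contents of the actual table; in practice you would pass the `text` object from above instead.

```r
# Hypothetical OCR output standing in for `text`; the real table will differ.
ocr_text <- "Group Share Count\nAlpha 12.5 40\nBeta 7.25 18\n"

# read.table() can parse a string directly through its `text` argument.
# Assumes the first line is the header and cells are whitespace-separated.
table_df <- read.table(text = ocr_text, header = TRUE, stringsAsFactors = FALSE)

# Inspect the column types before plotting.
str(table_df)
```

If a cell contains spaces (for example a multi-word row label), this simple split will misalign the columns, and the lines would need to be cleaned up or split with a more specific pattern first.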