Is it possible to read a PDF file inside a zip file with pdf-reader? I tried this code, but it does not work.
require 'zip'
Zip::File.open('/path/to/zipfile') do |zip_file|
  zip_file.each do |entry|
    if entry.directory?
      puts "#{entry.name} is a folder!"
    elsif entry.symlink?
      puts "#{entry.name} is a symlink!"
    elsif entry.file?
      puts "#{entry.name} is a regular file!"
      reader = PDF::Reader.new("#{entry.name}")
      page = reader.pages.each do |page|
        puts page.text
      end
    else
      puts "#{entry.name} is something unknown"
    end
  end
end
Thanks
PDF::Reader validates that the input is an "IO-like object or a filename" based on two criteria. An IO-like object is anything that responds to both seek and read. A filename is anything for which File.file? returns true.
Excerpt Source:
def extract_io_from(input)
  if input.respond_to?(:seek) && input.respond_to?(:read)
    input
  elsif File.file?(input.to_s)
    StringIO.new read_as_binary(input)
  else
    raise ArgumentError, "input must be an IO-like object or a filename"
  end
end
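To see which branch a given input would take, here is a minimal sketch that mirrors only the condition order of the excerpt above. The helper name classify_input and the sample path are hypothetical and not part of pdf-reader; the sketch reports the branch instead of reading anything.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper (not part of pdf-reader): mirrors the condition
# order of the excerpt and reports which branch an input would take.
def classify_input(input)
  if input.respond_to?(:seek) && input.respond_to?(:read)
    :io_like
  elsif File.file?(input.to_s)
    :filename
  else
    :rejected # the excerpt raises ArgumentError in this case
  end
end

# A StringIO responds to both seek and read.
puts classify_input(StringIO.new('%PDF-1.4'))

# A path to a file that exists on disk takes the File.file? branch.
Tempfile.create(['sample', '.pdf']) do |file|
  puts classify_input(file.path)
end

# A name that only exists inside an archive is not a file on disk.
puts classify_input('no-such-dir-inside-zip/report.pdf')
```

The last case corresponds to passing entry.name, as in the question: the string names an entry inside the archive, not a file on disk, so neither condition matches.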
Unfortunately, while Zip::InputStream emulates an IO object fairly well, it does not define seek, and therefore it does not pass the validation above. What you can do is create a new StringIO from the contents of the Zip::InputStream via
StringIO.new(entry.get_input_stream.read)
This ensures that PDF::Reader sees the input as an "IO-like object" and processes it appropriately.
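The buffering idea can be tried without either gem. The sketch below uses a hypothetical ForwardOnlyStream class as a stand-in for a stream that can be read but has no seek (the behaviour described above for Zip::InputStream), and a hypothetical buffer_stream helper that wraps its contents in a StringIO.

```ruby
require 'stringio'

# Hypothetical stand-in for a forward-only stream: it can be read,
# but it deliberately has no seek method.
class ForwardOnlyStream
  def initialize(data)
    @data = data
  end

  # Returns everything that has not been read yet, then empties the buffer.
  def read
    remaining = @data
    @data = ''
    remaining
  end
end

# Hypothetical helper: reads the whole stream into memory and returns
# a StringIO, which supports both seek and read.
def buffer_stream(stream)
  StringIO.new(stream.read)
end

stream = ForwardOnlyStream.new('%PDF-1.4 example bytes')
puts stream.respond_to?(:seek)

buffered = buffer_stream(stream)
puts buffered.respond_to?(:seek) && buffered.respond_to?(:read)

# Random access now works on the buffered copy.
buffered.seek(5)
puts buffered.read(3)
```

In the question's loop, the equivalent change is to pass StringIO.new(entry.get_input_stream.read) to PDF::Reader.new in place of entry.name. Note that this holds the whole entry in memory, which is usually acceptable for PDFs of ordinary size.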