ruby-on-rails ruby puma

What in exifr causes this Tempfile to get closed?


This piece of Ruby code processes an UploadedFile using exifr:

f = uploaded_file.tempfile
p "1 #{f.closed?} #{f.instance_variable_get(:'@unlinked')}"
#1 EXIFR::JPEG.new(StringIO.new(f.read))
#2 EXIFR::JPEG.new(f)
p "2 #{f.closed?} #{f.instance_variable_get(:'@unlinked')}"
GC.start
sleep 0.01
p "3 #{f.closed?} #{f.instance_variable_get(:'@unlinked')}"
p "4 #{f.size}"

N.B. GC.start/sleep is there to make the problem replicate reliably.

When #1 is uncommented, all is fine:

"1 false false"
"2 false false"
"3 false false"
"4 3822528"

However, uncommenting #2 instead of #1 yields this:

"1 false false"
"2 false false"
"3 true false"
[c4b7ce6b-5492-43db-8c64-726cafaccce0] [Thread: 24800] Errno::ENOENT (No such file or directory @ rb_file_s_size - /var/folders/vx/v0rn818s0257_3l491_v48bm0000gn/T/RackMultipart20240221-71765-acbi7v.JPG):

Now, all that exifr does with the file is this:

    def initialize(file, load_thumbnails: true)
...
        examine(file.dup, load_thumbnails: load_thumbnails)
...
      end
    end

    class Reader < SimpleDelegator
      def readbyte; readchar; end unless File.method_defined?(:readbyte)
      def readint; (readbyte << 8) + readbyte; end
      def readframe; read(readint - 2); end
      def readsof; [readint, readbyte, readint, readint, readbyte]; end
      def next
        c = readbyte while c != 0xFF
        c = readbyte while c == 0xFF
        c
      end
    end

    def examine(io, load_thumbnails: true)
      io = Reader.new(io)
...

and a bit of reading from io, so I don't understand what would cause the file to get closed.

This happens in a Rails app running on puma.

#2 would be preferable, as it does not require the file to be loaded into memory completely (in my case, we are talking up to 50 MB).


Solution

  • Thanks to @Casper, I understood that I got duped by f.dup. I wouldn't have thought that part of the standard Ruby library would behave this way: the Tempfile gets closed and deleted once the dup'ed copy is garbage collected, even though the original reference is still around.

    The way I chose to fix this differs from Casper's solutions, though, because I already have an open Tempfile and want to use it instead of concurrently re-opening the file. (Who knows what implications that would have on different OSes?)

    So this is how I fixed it:

    EXIFR::JPEG.new(SelfDuper.new(f))
    

    And this is the helper class I wrote for it:

    require 'delegate'
    
    class SelfDuper < SimpleDelegator
      def dup
        self
      end
    end
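
To check the wrapper outside of Rails and exifr, here is a minimal sketch. It assumes a plain Tempfile in place of the upload and uses a hypothetical read_first_bytes method as a stand-in for a library that calls dup on its argument before reading; neither is part of exifr or of the original code.

```ruby
require 'delegate'
require 'tempfile'

# Same helper as above: dup returns the wrapper itself instead of a copy.
class SelfDuper < SimpleDelegator
  def dup
    self
  end
end

# Hypothetical stand-in for a library that dups its argument before reading.
def read_first_bytes(io, count)
  copy = io.dup          # with SelfDuper this is the wrapper itself
  copy.rewind
  copy.read(count)
end

# A plain Tempfile stands in for uploaded_file.tempfile (assumption).
f = Tempfile.new(['demo', '.bin'])
f.binmode
f.write([0xFF, 0xD8].pack('C*'))  # two arbitrary marker bytes, not a real JPEG
f.flush

wrapper = SelfDuper.new(f)
first = read_first_bytes(wrapper, 2)

GC.start  # no dup'ed Tempfile exists, so there is no copy to be collected

puts wrapper.dup.equal?(wrapper)  # the "copy" is the wrapper itself
puts first.bytes.inspect          # bytes read through the delegated handle
puts f.closed?                    # the original Tempfile should still be open
puts File.exist?(f.path)          # and the file should still be on disk
```

Because dup never creates a second Tempfile object, all reads go through the one handle that is already open. The trade-off is that the library now shares the file position with the caller, so a rewind may be needed before reading the file again.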