ruby-on-rails ruby gzip bzip

Open3.popen3 function to open bz, gz, and txt files errors with 'No such file or directory' or 'not opened for reading'?


I'm trying to write a utility function that will open three different types of files: .bz2, .gz, and .txt. I can't just use File.read because it gives me garbage back for the compressed files. I'm trying to use Open3.popen3 so that I can give it a different command, but I'm getting a 'no such file or directory' error with the following code:

def file_info(file)
  cmd = ''
  if file.match("bz2") then
    cmd = "bzcat #{file}"# | head -20"
  elsif file.match("gz") then
    cmd = "gunzip -c #{file}"
  else
    cmd = "cat #{file}"
  end

  puts "opening file #{file}"
  Open3.popen3("#{cmd}", "r+") { |stdin, stdout, stderr|
    puts "stdin #{stdin.inspect}"
    stdin.read {|line|
      puts "line is #{line}"
      if line.match('^#') then
      else
        break
      end
    }
  }
end


> No such file or directory - cat /tmp/test.txt

The file does exist. I've tried using cmd instead of #{cmd} with the same results in the popen3 cmd.

I decided to hardcode it to do the txt file as follows:

def file_info(file)
  puts "opening file #{file}"
  Open3.popen3("cat", file, "r+") { |stdin, stdout, stderr|
    puts "stdin #{stdin.inspect}"
    stdin.read {|line|
      puts "line is #{line}"
      if line.match('^#') then
      else
        break
      end
    }
  }
end

This gives me back:

stdin #<IO:fd 6>
not opened for reading

What am I doing wrong?

When I do:

Open3.popen3("cat",file) { |stdin, stdout, stderr|
  puts "stdout is #{stdout.inspect}"
  stdout.read {|line|
    puts "line is #{line}"
    if line.match('^#') then
      puts "found line #{line}"
    else
      break
    end
  }
}

I get no errors and the STDOUT line is printed, but neither line statement prints out anything.

After trying several different things, the solution I came up with was:

cmd = Array.new
if file.match(/\.bz2\z/) then
  cmd = [ 'bzcat', file ]
elsif file.match(/\.gz\z/) then
  cmd = [ 'gunzip', '-c', file ]
else
  cmd = [ 'cat', file ]
end

Open3.popen3(*cmd) do |stdin, stdout, stderr|
  puts "stdout is #{stdout}"
  stdout.each do |line|
    if line.match('^#') then
      puts "line is #{line}"
    else
      break
    end
  end
end

Solution

  • From the fine manual (which is rather confusingly written):

    popen3(*cmd, &block)
    [...]
    So a commandline string and list of argument strings can be accepted as follows.

    Open3.popen3("echo a") {|i, o, e, t| ... }
    Open3.popen3("echo", "a") {|i, o, e, t| ... }
    Open3.popen3(["echo", "argv0"], "a") {|i, o, e, t| ... }
    

    So when you do this:

    Open3.popen3("cat /tmp/test.txt", "r+")
    

    popen3 thinks that the command name is "cat /tmp/test.txt" (spaces included) and that "r+" is an argument to that command, hence the specific error that you're seeing:

    No such file or directory - cat /tmp/test.txt

    There's no need for the usual mode flags ("r+") with Open3.popen3, since it gives you separate handles for writing to the command, reading its output, and reading its errors. As you've seen, supplying a mode string just causes bugs and confusion.
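To make the calling conventions concrete, here is a minimal sketch using echo instead of cat. The helper name run is hypothetical and exists only for this illustration; it assumes a Unix-like system where echo is on the PATH.

```ruby
require 'open3'

# Hypothetical helper: runs a command through Open3.popen3 and
# returns everything the child wrote to its standard output.
def run(*cmd)
  Open3.popen3(*cmd) { |_stdin, stdout, _stderr, _wait_thr| stdout.read }
end

run('echo a')     # a single string is treated as a whole command line
run('echo', 'a')  # program name followed by its arguments

begin
  # With more than one argument, the first string is taken as the
  # program name verbatim, so Ruby looks for a program called "echo a".
  run('echo a', 'r+')
rescue Errno::ENOENT => e
  puts "failed as expected: #{e.class}"
end
```

The last call fails the same way as the original "cat /tmp/test.txt", "r+" attempt: the first string is used as the program name, spaces and all.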

    The second case:

    Open3.popen3("cat", file, "r+") { |stdin, stdout, stderr|
      stdin.read {|line|
        #...
    

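    This doesn't work because stdin is the command's standard input: it is what you write to, not what you read from. You'd want to read from stdout instead. Note also that IO#read returns the whole output as a single string and ignores any block passed to it, so use stdout.each to iterate line by line.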
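The direction of each handle can be seen in a small sketch. Both helper names below (round_trip and read_from_stdin_error) are made up for this example, and it assumes cat is available.

```ruby
require 'open3'

# Hypothetical helper: sends `text` to cat and returns the lines cat
# prints back, with trailing newlines removed.
def round_trip(text)
  Open3.popen3('cat') do |stdin, stdout, _stderr, _wait_thr|
    stdin.write(text)              # stdin is the end you write to
    stdin.close                    # signal EOF so cat can finish
    stdout.each_line.map(&:chomp)  # stdout is the end you read from
  end
end

# Hypothetical helper: shows what happens when you try to read from
# the child's stdin, as in the second attempt in the question.
def read_from_stdin_error
  Open3.popen3('cat') do |stdin, _stdout, _stderr, _wait_thr|
    stdin.read
  rescue IOError => e
    e.message
  end
end

p round_trip("# first\nsecond\n")
puts read_from_stdin_error
```

This is only suitable for small inputs; with large amounts of data you would need to read and write concurrently to avoid filling a pipe buffer.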

    You should be building your commands as arrays and your match calls should be a little stricter:

    if file.match(/\.bz2\z/) then
      cmd = [ 'bzcat', file ]
    elsif file.match(/\.gz\z/) then
      cmd = [ 'gunzip', '-c', file ]
    else
      cmd = [ 'cat', file ]
    end
    

    and then splat them:

    Open3.popen3(*cmd) do |stdin, stdout, stderr|
      #...
    end
    

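    Not only does this work, but because the arguments are handed to the program directly rather than through a shell, it will also save you from funny filenames (ones containing spaces, quotes, or shell metacharacters).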
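Putting the pieces together, a rough sketch of the whole utility might look like the following. The names command_for and leading_comments are hypothetical, and the sketch assumes bzcat, gunzip, and cat are installed.

```ruby
require 'open3'

# Hypothetical helper: picks the command (as an argument array) from
# the file extension, anchored at the end of the name.
def command_for(path)
  case path
  when /\.bz2\z/ then ['bzcat', path]
  when /\.gz\z/  then ['gunzip', '-c', path]
  else                ['cat', path]
  end
end

# Hypothetical helper: returns the leading lines that start with '#',
# stopping at the first line that does not.
def leading_comments(path)
  Open3.popen3(*command_for(path)) do |_stdin, stdout, _stderr, _wait_thr|
    stdout.each_line.take_while { |line| line.start_with?('#') }
  end
end
```

Because the path travels as its own array element, a name such as "odd name; $(x).txt" reaches cat unchanged and nothing in it is interpreted by a shell.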

    You could also avoid a useless use of cat (which someone will probably complain about) by skipping the Open3.popen3 for the non-compressed cases and using File.open instead. You might also want to consider checking the file's bytes to see what it contains rather than relying on the extension (or use ruby-filemagic to check for you).
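As an illustration of both suggestions, here is a sketch that inspects the first bytes of the file instead of its extension and only shells out for compressed data. It relies on gzip streams starting with the bytes 0x1f 0x8b and bzip2 streams starting with "BZh"; the names sniff_format, stream_command, and each_line_of are hypothetical.

```ruby
require 'open3'

GZIP_MAGIC  = "\x1f\x8b".b  # first two bytes of a gzip stream
BZIP2_MAGIC = 'BZh'.b       # first three bytes of a bzip2 stream

# Hypothetical helper: guesses the format from the leading bytes.
def sniff_format(path)
  head = File.binread(path, 3) || ''.b  # nil for an empty file
  return :gzip  if head.start_with?(GZIP_MAGIC)
  return :bzip2 if head.start_with?(BZIP2_MAGIC)
  :plain
end

# Hypothetical helper: yields each line of the command's output.
def stream_command(cmd, &block)
  Open3.popen3(*cmd) do |_stdin, stdout, _stderr, _wait_thr|
    stdout.each_line(&block)
  end
end

# Hypothetical helper: plain files are read directly, so cat is never
# needed; compressed files go through the external decompressor.
def each_line_of(path, &block)
  case sniff_format(path)
  when :gzip  then stream_command(['gunzip', '-c', path], &block)
  when :bzip2 then stream_command(['bzcat', path], &block)
  else             File.foreach(path, &block)
  end
end
```

A plain text file that happens to begin with "BZh" would be misdetected, so this is a heuristic rather than a guarantee; a library such as ruby-filemagic performs a more thorough check.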