ruby open-uri

OpenURI fails to follow URLs that have %20


I am having an issue with the redirect-following behavior of Ruby's OpenURI.

When I request a URL that contains %20 and the server responds with a 30x redirect, OpenURI raises an error.

Code

require 'open-uri'

base = "http://software-engineering-handbook.com/Handbook"

puts "===> PASS: URI Open +"
# URI.open is used instead of the bare Kernel#open, which no longer
# accepts URLs as of Ruby 3.0.
result = URI.open "#{base}/Video+Series"
p result.status

puts "===> PASS: Curl +"
puts `curl -LIsS "#{base}/Video+Series" | grep HTTP`

puts "===> PASS: Curl %20"
puts `curl -LIsS "#{base}/Video%20Series" | grep HTTP`

puts "===> FAIL: URI Open %20"
begin
  result = URI.open "#{base}/Video%20Series"
  p result.status
rescue => e
  # Print the exception class and message instead of aborting the script.
  puts "#{e.class} #{e.message}"
end

Output

===> PASS: URI Open +
["200", "OK"]
===> PASS: Curl +
HTTP/1.1 200 OK
===> PASS: Curl %20
HTTP/1.1 303 See Other
HTTP/1.1 200 OK
===> FAIL: URI Open %20
OpenURI::HTTPError 302 Found (Invalid Location URI)

I am not sure what is going on here. I also tried HTTParty (although I know it is just a wrapper), hoping to see different behavior, but it fails as well.


Solution

  • The server is responding with a redirect to an invalid URI. curl is being lax about it, but Ruby is being strict.

    If we print e.cause, we get more information:

    #<URI::InvalidURIError: bad URI(is not URI?): "http://software-engineering-handbook.com/Handbook/Video Series/">
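For reference, here is a minimal sketch of how the cause chain of an exception can be printed. It simulates the failure locally instead of contacting the server, so the URL (example.com) and the helper name describe_error are assumptions made for illustration, not part of OpenURI.

```ruby
require 'open-uri'

# Hypothetical helper: list an exception and every nested cause,
# outermost first.
def describe_error(error)
  lines = ["#{error.class}: #{error.message}"]
  cause = error.cause
  while cause
    lines << "caused by #{cause.class}: #{cause.message}"
    cause = cause.cause
  end
  lines
end

begin
  begin
    # A literal space makes the URI invalid, like the Location header above.
    URI.parse("http://example.com/Handbook/Video Series/")
  rescue URI::InvalidURIError
    # Raising inside a rescue block records the original error as #cause.
    raise OpenURI::HTTPError.new("302 Found (Invalid Location URI)", nil)
  end
rescue OpenURI::HTTPError => e
  puts describe_error(e)
end
```

In the real script, the same helper could be called from the existing rescue block with puts describe_error(e).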
    

    The headers from curl -I 'http://software-engineering-handbook.com/Handbook/Video%20Series' show the same thing:

    HTTP/1.1 303 See Other
    Server: Cowboy
    Date: Sat, 28 Dec 2019 21:41:28 GMT
    Connection: keep-alive
    Content-Type: text/html;charset=utf-8
    Location: http://software-engineering-handbook.com/Handbook/Video Series/
    

    Indeed, the server is returning an invalid URI in the Location header. Literal spaces are not allowed in a URI path (they must be percent-encoded as %20), so Ruby's URI class will not parse it.

    > URI("http://software-engineering-handbook.com/Handbook/Video Series/")
    URI::InvalidURIError: bad URI(is not URI?): "http://software-engineering-handbook.com/Handbook/Video Series/"
    from /Users/schwern/.rvm/rubies/ruby-2.6.5/lib/ruby/2.6.0/uri/rfc3986_parser.rb:67:in `split'
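If the server cannot be fixed, one possible workaround is to follow the redirects by hand and repair the Location header before parsing it. The following is only a sketch under that assumption: repair_location and fetch_following_redirects are hypothetical helper names, the redirect limit of 5 is arbitrary, and only literal spaces are repaired.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: percent-encode literal spaces so that URI() accepts
# a sloppy Location header. Other invalid characters are left untouched.
def repair_location(location)
  location.strip.gsub(' ', '%20')
end

# Hypothetical helper: follow redirects manually, repairing each Location
# header along the way. Returns the first non-redirect response.
def fetch_following_redirects(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.zero?

  uri = URI(url)
  response = Net::HTTP.get_response(uri)
  location = response['location']
  return response unless response.is_a?(Net::HTTPRedirection) && location

  # URI.join resolves a relative Location against the current URL.
  next_url = URI.join(uri, repair_location(location)).to_s
  fetch_following_redirects(next_url, limit - 1)
end
```

With these definitions, calling fetch_following_redirects("#{base}/Video%20Series") and printing response.code would take the place of the failing open call. This makes Ruby roughly as lax as curl for this one case, so it is best limited to servers known to send such headers.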