java ruby rfc2396

Is Ruby's URI.regexp "wrong" to use as a validator?


Given a URL like https://example.com/<network_id>/1

Ruby's URI module can validate URIs using a regular expression:

URI.regexp.match("https://example.com/<network_id>/1")
=> #<MatchData "https://example.com/" 1:"https" 2:nil 3:nil 4:"example.com" 5:nil 6:nil 7:"/" 8:nil 9:nil>

But if you hand this URL off to another package, say Java's URI class, it fails:

Error message:
Illegal character in path at index ...

java.net.URI.create(URI.java:852)
org.apache.http.client.methods.HttpPost.<init>(HttpPost.java:73)
...

Is there a better URI validator, something that we can use in a Rails class?


Solution

  • I would use URI.parse to validate a URL. URI.parse raises an exception if the URL is not valid:

    require 'uri'
    
    URI.parse('https://example.com/<network_id>/1')
    #=> bad URI(is not URI?): "https://example.com/<network_id>/1" (URI::InvalidURIError)
    

    This allows writing a simple URL validation method like this:

    # Returns true if URI.parse accepts the string, false otherwise.
    # Rescuing only URI::InvalidURIError avoids hiding unrelated errors.
    def valid_url?(url)
      URI.parse(url)
      true
    rescue URI::InvalidURIError
      false
    end
    
    valid_url?('https://example.com/<network_id>/1')
    #=> false
    
    valid_url?('https://stackoverflow.com/questions/75179882')
    #=> true
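On the original question of why URI.regexp appeared to accept the URL: the MatchData shown in the question covers only "https://example.com/", because the pattern is not anchored and happily matches a valid prefix. The sketch below illustrates this and shows a somewhat stricter check built on URI.parse. It assumes the RFC 2396 pattern from URI::RFC2396_Parser is equivalent to what URI.regexp returns, and the constant names and the helper valid_http_url? are made up for this example. Note also that URI.parse accepts relative references such as 'foo', so a check for scheme and host is probably wanted on top of it.

```ruby
require 'uri'

# Assumption: this is the same RFC 2396 pattern that URI.regexp returns.
UNANCHORED_URI = URI::RFC2396_Parser.new.make_regexp

# Same pattern, but forced to cover the whole string.
ANCHORED_URI = /\A#{UNANCHORED_URI}\z/

# Hypothetical helper: true only for absolute http(s) URLs with a host.
def valid_http_url?(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this covers both schemes.
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

url = 'https://example.com/<network_id>/1'

UNANCHORED_URI.match(url)[0]   #=> "https://example.com/"
ANCHORED_URI.match?(url)       #=> false
valid_http_url?(url)           #=> false
valid_http_url?('https://stackoverflow.com/questions/75179882') #=> true
valid_http_url?('foo')         #=> false
```

So the regexp is not so much "wrong" as easy to misuse: without \A and \z it only tells you that the string contains something URI-like. Even when anchored it follows RFC 2396, which may not agree in every detail with what java.net.URI accepts, so treat either check as a first filter rather than a guarantee.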