I have problems with this code:
require 'rubygems'
require 'rdf'
require 'rdf/raptor'

RDF::Reader.open("http://reegle.info/countries/IN.rdf") do |reader|
  reader.each_statement do |statement|
    puts statement.inspect
  end
end
When trying to open the above-mentioned URL, I get redirected to a URL that URI.parse obviously doesn't like:
http://sparql.reegle.info?query=CONSTRUCT+{+%3Chttp://reegle.info/countries/IN%3E+?p+?o.+%3Chttp://reegle.info/countries/IN.rdf%3E+foaf:primaryTopic+%3Chttp://reegle.info/countries/IN%3E;+cc:license+%3Chttp://www.nationalarchives.gov.uk/doc/open-government-licence%3E;+cc:attributionName+"REEEP";+cc:attributionURL+%3Chttp://reegle.info/countries/IN%3E.+}+WHERE+{+%3Chttp://reegle.info/countries/IN%3E+?p+?o.}&format=application/rdf%2Bxml
So I get the following error:
URI::InvalidURIError: bad URI(is not URI?)
Any ideas on how to get around this issue?
Thanks
P.S. Doing something like URI.parse(URI.encode([url])) has no effect here.
URI doesn't like the double quotes or braces in that URL. You can fix the URI by hand with something like this:
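# This auto-populating cache isn't necessary, but it avoids re-encoding the same character.
# URI.encode was removed in Ruby 3.0; URI.encode_www_form_component
# produces the same percent-encoding for these three characters.
replacements = Hash.new { |h, k| h[k] = URI.encode_www_form_component(k) }
broken_uri.gsub!(/[{}"]/) { replacements[$&] }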
From RFC 1738: Uniform Resource Locators (URL):
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
So I'd say that reegle.info should be percent-encoding more characters than it does. OTOH, Ruby's URI class could be a little more forgiving (Perl's URI class, for example, will accept that URI as input but it converts the double quote and braces to their percent-encoded form on output).
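For a rough end-to-end check, here is a minimal sketch of the same idea that uses only the standard library. The URL is a shortened stand-in for the real redirect target, and `percent_encode` is a hypothetical helper name, not part of Ruby's URI API. The sketch does not rely on the unencoded string being rejected, since how strict URI.parse is can differ between Ruby versions.

```ruby
require 'uri'

# Shortened stand-in for the redirect target (an assumption for illustration,
# not the full reegle.info URL from the question).
broken_uri = 'http://sparql.reegle.info?query=CONSTRUCT+{+?s+?p+"REEEP".+}&format=application/rdf%2Bxml'

# Hypothetical helper: percent-encode every byte of a string as %XX (uppercase hex).
def percent_encode(str)
  str.bytes.map { |b| format('%%%02X', b) }.join
end

# Encode only the characters called out above: braces and double quotes.
# Non-destructive gsub, so the original string is left untouched.
fixed_uri = broken_uri.gsub(/[{}"]/) { |char| percent_encode(char) }

uri = URI.parse(fixed_uri)
puts uri.host
puts uri.query
```

Existing percent escapes such as `%2B` are left alone because the pattern matches only the three offending characters, so nothing gets double-encoded.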