regex sparql graphdb

SPARQL query - extracting base of URLs


I want to extract only the base (the host part) of a URL in my SPARQL query on GraphDB, e.g. www.schema.org instead of http://www.schema.org/archive.

I tried matching the complete URL with BIND and REPLACE and capturing the base as a group. However, this does not seem to work properly. I'm guessing I've done something wrong in the regex?

# What is the hosting website of the article published online in May 2001 with James Hendler as one of its authors?


PREFIX : <http://www.mysemantics.com/> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?hostingWebsite
WHERE {
    ?article a :Article . 
    ?article :datePublished "2001"^^xsd:date .
    ?article :hasAuthor :JamesHendler .
    ?article :url ?url . 
    BIND(REPLACE(?url, ".*://(.*?)/.*", "$1") AS ?hostingWebsite)
}

Solution

  • You were so close! REPLACE expects a string as its first argument, but ?url is bound to a URI, which is not a string, so you need to convert it with STR() first:

        BIND(REPLACE(STR(?url), ".*://(.*?)/.*", "$1") AS ?hostingWebsite)
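
One caveat with the pattern above: it requires a "/" after the host, so a URL with no path (e.g. http://www.schema.org) does not match and REPLACE returns the whole string unchanged. A minimal sketch of a slightly more defensive variant follows. The two URLs in the VALUES block are sample data taken from the question and are only there to make the query self-contained; the anchored pattern is a suggestion, not something GraphDB requires. Note that the captured group is the full authority part, so it would also include a port or user info if the URL has one.

```sparql
SELECT ?url ?hostingWebsite
WHERE {
    # Sample inputs (assumed for illustration): one URL with a path, one without
    VALUES ?url { <http://www.schema.org/archive> <http://www.schema.org> }

    # STR() turns the URI into a string; the pattern then captures
    # everything between "://" and the next "/", "?" or "#" (or the end).
    BIND(REPLACE(STR(?url), "^[^:/]+://([^/?#]+).*$", "$1") AS ?hostingWebsite)
}
```

In your original query you would keep the ?article triple patterns and swap only the BIND line.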