My Wikidata Query Service query sometimes takes 35000 milliseconds or 35 seconds to complete. I'm not very adept at SPARQL. The query below does work (except it does duplicates sometimes). I'm wanting to get "famous people" by providing a birthday date and month, of which I get their name, birthday, an image of them (Wikimedia), and occupations. I'm also filtering by persons with birth places in the USA and UK.
I've added a variable called "sitelinks" that I count how many links are dedicated to them as a popularity metric (I'm open to better ideas on this, if there is a better way to gauge popularity). Is there a way to make this more optimized? Again, the query works, it's just slow.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personLabel ?birthdate ?countryLabel (COUNT(DISTINCT(?sitelink)) as ?sites) (GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") as ?occupations) (SAMPLE(?image) as ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country ; # Citizenship
wdt:P106 ?occupation ; # Occupation
wdt:P18 ?uniqueImage . # Image
?country rdfs:label ?countryLabel .
?occupation rdfs:label ?occupationLabel .
?sitelink schema:about ?person .
FILTER(LANG(?countryLabel) = "en")
FILTER(LANG(?occupationLabel) = "en")
FILTER(MONTH(?birthdate) = 5 && DAY(?birthdate) = 20)
FILTER(?country IN (wd:Q30, wd:Q145))
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?uniqueImage ORDER BY DESC(?sites)
LIMIT 50
If anyone wants to paste the query to the Wikidata Query Service, here is the link https://query.wikidata.org/
I think it's the site count that's killing your query and I think I resolved the duplicate record issue. Would something like this work for your use-case?
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT
?person
?personLabel
?birthdate
?countryLabel
?website
(GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") AS ?occupations)
(MIN(?image) AS ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country . # Citizenship
OPTIONAL { ?person wdt:P856 ?website } # Official website
?country rdfs:label ?countryLabel .
FILTER(?country IN (wd:Q30, wd:Q145))
FILTER(LANG(?countryLabel) = "en")
FILTER(DATATYPE(?birthdate) = xsd:dateTime
&& MONTH(?birthdate) = 5
&& DAY(?birthdate) = 20)
OPTIONAL {
?person wdt:P106 ?occupation . # Occupation
?occupation rdfs:label ?occupationLabel .
FILTER(LANG(?occupationLabel) = "en")
}
OPTIONAL { ?person wdt:P18 ?image } # Image
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?website
ORDER BY DESC(?website)
LIMIT 50
If you do need the site count though you might be able to use something like this
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT
?person
?personLabel
?birthdate
?countryLabel
(COUNT(DISTINCT ?sitelink) AS ?sites)
(GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") AS ?occupations)
(MIN(?image) AS ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country . # Citizenship
OPTIONAL { ?person wdt:P18 ?image } # Image
?sitelink schema:about ?person .
?country rdfs:label ?countryLabel .
FILTER(?country IN (wd:Q30, wd:Q145))
FILTER(LANG(?countryLabel) = "en")
FILTER(DATATYPE(?birthdate) = xsd:dateTime
&& MONTH(?birthdate) = 5
&& DAY(?birthdate) = 20)
OPTIONAL {
?person wdt:P106 ?occupation . # Occupation
?occupation rdfs:label ?occupationLabel .
FILTER(LANG(?occupationLabel) = "en")
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel
ORDER BY DESC(?sites)
LIMIT 50