We recently upgraded our MarkLogic cluster from version 10.0-6.1 to 11.3. Since the upgrade, we have experienced a severe performance regression with some of our previously well-performing SPARQL queries.
Background:
On the previous version (MarkLogic 10.0-6.1), a typical query as shown below used to complete in approximately 2 seconds.
On the new version (MarkLogic 11.3), the same query now takes 58 seconds or more to return results.
We rely heavily on semantic triples and SPARQL for our knowledge graph, so this degradation is causing significant issues for our application.
let $query :=
"PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX cabiontology: <https://id.cabi.org/cabiontology/>
PREFIX compendium: <https://id..org/compendiumOntology/>
PREFIX dbProject: <https://id..org/cabt_dev/>
SELECT DISTINCT
?pestDatasheetId
?pestPrefLabel
?prefCommonName
(IF(?filterText != '', ?matched, '') as ?matchedText)
(IF(BOUND(?compDomain), ?compDomain,
IF(str(?phylumPrefLabel) != '', ?phylumPrefLabel,
IF(str(?domainPrefLabel) != '', ?domainPrefLabel, 'No data'))) as ?pestType)
FROM <http://marklogic.com/semantics#default-graph>
WHERE
{
# Filtering for pests with the given type and labels
BIND('ash dieback' as ?filterText)
?type skos:prefLabel ?typeLabel.
FILTER(?typeLabel IN ('Pest'@en-gb))
?pestConceptUri compendium:datasheetType ?type.
?pestConceptUri skos:prefLabel ?pestPrefLabel.
FILTER(lang(?pestPrefLabel) = 'en-gb')
?pestConceptUri ontology:compendiumDatasheetAt ?pestDatasheetId.
OPTIONAL {
?pestConceptUri ontology:prefCommonName ?prefCommonName.
FILTER(lang(?prefCommonName) = 'en-gb')
}
OPTIONAL {
?pestConceptUri skos:altLabel ?pestAltLabel.
}
FILTER(contains(lcase(?pestPrefLabel), ?filterText)
|| contains(lcase(?prefCommonName), ?filterText)
|| contains(lcase(?pestAltLabel), ?filterText))
OPTIONAL {
?pestConceptUri compendium:taxonRank-Domain ?compDomain
FILTER(?compDomain = 'Diseases of unknown aetiology')
}
OPTIONAL {
?pestConceptUri skos:broader* ?domainNode.
?domainNode ontology:taxonRank dbProject:taxaDomain.
?domainNode skos:prefLabel ?domainPrefLabel.
FILTER(lang(?domainPrefLabel) = 'en-gb')
}
OPTIONAL {
?pestConceptUri skos:broader* ?phylumNode.
?phylumNode ontology:taxonRank dbProject:taxaPhylum.
?phylumNode skos:prefLabel ?phylumPrefLabel.
FILTER(lang(?phylumPrefLabel) = 'en-gb')
}
BIND(IF(contains(lcase(?pestPrefLabel), ?filterText), ?pestPrefLabel,
IF(contains(lcase(?prefCommonName), ?filterText), ?prefCommonName,
IF(contains(lcase(?pestAltLabel), ?filterText), ?pestAltLabel, ''
))) as ?matched)
}
ORDER BY ?pestPrefLabel"
return sem:sparql($query)
What I've checked so far:
Any insights or suggestions are greatly appreciated!
There is a new feature in MarkLogic 11 that overflows to disk in order to protect memory. It is only listed as an Optic feature. However, since Optic is SPARQL under the hood, the feature may be kicking in and using disk for your query.
The link below describes the feature and also various ways to see if it is being used.
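Independently of that feature, it may help to collect comparable numbers on both clusters before changing anything. The following is only a rough sketch, not a definitive diagnostic: the short SPARQL string is a hypothetical stand-in for the full query in the question, and it assumes that counting the results is enough to force full evaluation before the elapsed time is read.

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-in: replace with the full SPARQL string from the question :)
let $query := "SELECT * WHERE { ?s ?p ?o } LIMIT 10"
return (
  (: counting the solutions forces the query to run to completion :)
  fn:concat("rows: ", fn:count(sem:sparql($query))),
  (: time elapsed since the start of this request :)
  fn:concat("elapsed: ", xdmp:elapsed-time()),
  (: the plan the optimizer chose, for side-by-side comparison between versions :)
  sem:sparql-plan($query)
)
```

Running the same snippet on 10.0-6.1 and on 11.3 and comparing the two plans should show whether the optimizer is now ordering the joins or the `skos:broader*` property paths differently, which would point away from the disk-overflow feature and towards a plan change.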