aemaem-6

How to optimize a QueryBuilder query


I have this query:

group.p.or=true
type=cq:Page
p.limit=10
group.1_group.path=/content/path/path1
group.1_group.1_group.p.or=true
group.1_group.1_group.1_property.value=false
group.1_group.1_group.2_property=jcr:content/pageTemplateType
group.1_group.1_group.1_property=jcr:content/pageTemplateType
group.1_group.1_group.2_property.operation=unequals
group.1_group.1_group.1_property.operation=exists
group.1_group.1_group.2_property.value=template
group.1_group.path.self=true

group.2_group.path=/content/path/path-2
group.2_group.1_group.p.or=true
group.2_group.1_group.1_property.value=false
group.2_group.1_group.2_property=jcr:content/pageTemplateType
group.2_group.1_group.1_property=jcr:content/pageTemplateType
group.2_group.1_group.2_property.operation=unequals
group.2_group.1_group.1_property.operation=exists
group.2_group.1_group.2_property.value=template
group.2_group.path.self=true

What I am trying to do is query multiple paths and return the pages whose pageTemplateType property is either not equal to 'template' or does not exist.

This query works, but it takes more than 1 second. If I remove the self predicate (i.e. group.2_group.path.self=true or group.1_group.path.self=true), it takes only about 0.02 seconds. I do not understand how to optimize the query or how to use self efficiently.


Solution

  • Unfortunately, a query with path.self=true will never be fast in AEM. It is better to use a different query that does not need path.self=true.

    Internally, the QueryBuilder uses the PredicateEvaluators to construct an XPath query via PredicateEvaluator.getXPathExpression(...). This translation is done on a best-effort basis. The results of the XPath query are then filtered via PredicateEvaluator.includes(...) by all remaining predicates, i.e. those that could not be fully converted to XPath. See the Predicate API.

    Now the PathPredicateEvaluator has a problem. JCR XPath (at least in Jackrabbit) does not support the descendant-or-self axis, so the XPath query /content/path/path1/descendant-or-self::node() is not supported. As a result, the XPath query searches the entire repository, and the path predicate is only applied afterwards as a filter. It would probably have been better to search from the parent node instead of the repository root, but that is the way it is implemented.

    You can check that in the query debugger. http://localhost:4502/libs/cq/search/content/querydebug.html


    You can test this behaviour with two simplified queries in the query debugger:

    type=cq:Page
    path=/content/path/path1
    

    The above query is translated directly into the following XPath:

    /jcr:root/content/path/path1//element(*, cq:Page)
    

    Now with self=true:

    type=cq:Page
    path=/content/path/path1
    path.self=true
    

    The above query is translated into the following XPath:

    //element(*, cq:Page)
    

    After iterating over all pages, the XPath result set is filtered with the following Java predicate:

    {path=path: path=/content/path/path1, self=true}
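
The effect of that post-filter can be illustrated with a small, self-contained sketch. This is not the code of the PathPredicateEvaluator; the class `SelfPathFilter` and its methods are hypothetical names, and the sketch only assumes that "self" means "the root path itself or any descendant of it". The point is that the check per row is cheap, but it runs over every cq:Page the unrestricted XPath query returned.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the AEM API: it only mimics the kind of
// in-memory check that is applied to each row after the XPath query has run.
final class SelfPathFilter {

    // True if candidate is the root itself (self=true) or one of its descendants.
    static boolean includes(String root, String candidate) {
        return candidate.equals(root) || candidate.startsWith(root + "/");
    }

    // Filters an already-fetched list of page paths, the way a post-filter would.
    static List<String> filter(String root, List<String> allPagePaths) {
        return allPagePaths.stream()
                .filter(p -> includes(root, p))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in for "all pages in the repository" returned by //element(*, cq:Page).
        List<String> allPages = List.of(
                "/content/path/path1",
                "/content/path/path1/child",
                "/content/path/path10",
                "/content/other");
        // Prints [/content/path/path1, /content/path/path1/child]
        System.out.println(filter("/content/path/path1", allPages));
    }
}
```

In a real repository the input list is every page in the instance, which is why the query time grows with the repository size rather than with the size of /content/path/path1.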
    

    Proposals for your issue:

    1. You could use a JCR-SQL2 query or a custom XPath query. For queries this complicated, the QueryBuilder is not the best choice anyway. If you need to run the query from the front end, however, stick with the QueryBuilder, as it is the only option with a REST API; the others require Java code.

    2. You are searching for page attributes. Instead of searching for cq:Page, you can search for cq:PageContent. cq:PageContent nodes are always child nodes of their page (even for the root page of the search path), so you do not need self. You only need to strip /jcr:content from each result path (the page is the parent of the result node).
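
For the last step of approach 2, mapping a result path back to its page, a minimal sketch could look like the following. `PagePaths` and `toPagePath` are hypothetical names, and the sketch assumes every hit is a path string ending in /jcr:content; anything else is returned unchanged.

```java
// Hypothetical helper: turns the path of a cq:PageContent hit into the path
// of the page that owns it.
final class PagePaths {

    private static final String SUFFIX = "/jcr:content";

    // "/content/path/path1/jcr:content" becomes "/content/path/path1".
    // Paths that do not end in /jcr:content are returned as they are.
    static String toPagePath(String resultPath) {
        return resultPath.endsWith(SUFFIX)
                ? resultPath.substring(0, resultPath.length() - SUFFIX.length())
                : resultPath;
    }
}
```

When the query is called from the front end, the same suffix stripping can be done on the client on each returned path.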

    Here is the query for approach 2:

    group.p.or=true
    type=cq:PageContent
    p.limit=10
    group.1_group.path=/content/path/path1
    group.1_group.1_group.p.or=true
      group.1_group.1_group.1_property=pageTemplateType
      group.1_group.1_group.1_property.operation=exists
      group.1_group.1_group.1_property.value=false
    
      group.1_group.1_group.2_property=pageTemplateType
      group.1_group.1_group.2_property.operation=unequals
      group.1_group.1_group.2_property.value=template
    
    
    group.2_group.path=/content/path/path-2
    group.2_group.1_group.p.or=true
      group.2_group.1_group.1_property=pageTemplateType
      group.2_group.1_group.1_property.operation=exists
      group.2_group.1_group.1_property.value=false
    
      group.2_group.1_group.2_property=pageTemplateType
      group.2_group.1_group.2_property.operation=unequals
      group.2_group.1_group.2_property.value=template
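
If the list of paths is not fixed, the same predicates can be generated instead of written by hand. The sketch below only builds the parameter map; `TemplateQueryParams` is a hypothetical name. It assumes the map is later passed to the QueryBuilder (in Java, typically through `PredicateGroup.create(map)`) or serialized as request parameters for the REST endpoint.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that generates the predicates of the query above
// for an arbitrary list of root paths.
final class TemplateQueryParams {

    static Map<String, String> build(List<String> rootPaths, int limit) {
        // LinkedHashMap keeps insertion order, which makes the output easy to read.
        Map<String, String> p = new LinkedHashMap<>();
        p.put("type", "cq:PageContent");
        p.put("p.limit", String.valueOf(limit));
        p.put("group.p.or", "true");

        for (int i = 0; i < rootPaths.size(); i++) {
            String group = "group." + (i + 1) + "_group.";
            String cond = group + "1_group.";

            // One path restriction per root, without path.self.
            p.put(group + "path", rootPaths.get(i));

            // pageTemplateType is missing OR is not 'template'.
            p.put(cond + "p.or", "true");
            p.put(cond + "1_property", "pageTemplateType");
            p.put(cond + "1_property.operation", "exists");
            p.put(cond + "1_property.value", "false");
            p.put(cond + "2_property", "pageTemplateType");
            p.put(cond + "2_property.operation", "unequals");
            p.put(cond + "2_property.value", "template");
        }
        return p;
    }
}
```

Calling `TemplateQueryParams.build(List.of("/content/path/path1", "/content/path/path-2"), 10)` yields the same key/value pairs as the query listed above.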
    

    A starter XPath for approach 1, which searches /content/path/path1 and /content/path/path-2, would be the following (first an XPath union, then the property condition):

    (
        /jcr:root/content/path/element(path1, cq:Page)
      | /jcr:root/content/path/path1//element(*, cq:Page)
      | /jcr:root/content/path/element(path-2, cq:Page)
      | /jcr:root/content/path/path-2//element(*, cq:Page)
    )
    [
      jcr:content/@pageTemplateType != 'template'
      or not(jcr:content/@pageTemplateType)
    ]
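
The union pattern above can also be generated for any number of root paths. The following is only a sketch of the string building: `UnionXPath` is a hypothetical name, the paths are assumed to be absolute, and node names are assumed to need no escaping (names that are not valid XML names, for example ones starting with a digit, would have to be ISO 9075 encoded first). Whether the resulting statement performs well still depends on the indexes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that assembles the union XPath for several root paths.
final class UnionXPath {

    static String build(List<String> rootPaths) {
        List<String> branches = new ArrayList<>();
        for (String path : rootPaths) {
            // Split "/content/path/path1" into parent "/content/path" and name "path1".
            int cut = path.lastIndexOf('/');
            String parent = path.substring(0, cut);
            String name = path.substring(cut + 1);

            // Branch 1: the root page itself, addressed from its parent.
            branches.add("/jcr:root" + parent + "/element(" + name + ", cq:Page)");
            // Branch 2: all pages below the root page.
            branches.add("/jcr:root" + path + "//element(*, cq:Page)");
        }
        // Property condition applied to the whole union.
        return "(" + String.join(" | ", branches) + ")"
                + "[jcr:content/@pageTemplateType != 'template'"
                + " or not(jcr:content/@pageTemplateType)]";
    }
}
```

The two branches per path replace path.self=true: the first selects the root page through its parent, the second selects its descendants.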
    

    PS: Whether the above XPath works depends on your indexes and your content. It probably needs some improvement.