marklogicmarklogic-10marklogic-optic-api

Optic API vs CTS query performance


I am using below two queries(Optic and CTS) to get the values for path-range index /tXML/Item/PutawayCategory.

Query 1: - It took approx. 4 milliseconds to execute and return 17 distinct values. I tried executing this same query multiple times.

xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic"  at "/MarkLogic/optic.xqy";

op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
=> op:where-distinct()
=> op:result()

Query 2: - It took approx. 0.30 milliseconds to get the same result as Query 1

xquery version "1.0-ml";

cts:values(cts:path-reference("/tXML/Item/PutawayCategory"))

I am not getting why Optic Query is taking more time to execute than cts query.

Please help me to understand this.


Solution

  • Change your optic query to use op:group-by("PutawayCategory") instead of op:where-distinct() and it should perform much better.

    xquery version "1.0-ml";
    import module namespace op="http://marklogic.com/optic"  at "/MarkLogic/optic.xqy";
    
    op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
    => op:group-by("PutawayCategory")
    => op:result()
    

    With op:from-lexicon

    Optic emits rows based on co-occurrence of lexicon values within the same document similar to cts:value-tuples.

    This means that op:from-lexicons() is returning every instance of the values that are present in multiple documents, and can be returned multiple times, not a distinct list, and then op:where-distinct() is filtering and de-duplicating, which consumes CPU and time. The larger the set of values, the more work (and time) that op:where-distinct() has to do.

    cts:values() is pulling a distinct list of values directly from the path-range-index lexicon, so there is less work to do.

    There may be a way for MarkLogic to optimize the Optic query with op:where-distinct(). If you have access to MarkLogic Support, it would be helpful if you created a Support case inquiring about it.