I am using the two queries below (Optic and CTS) to get the values for the path range index /tXML/Item/PutawayCategory.
Query 1: It took approximately 4 milliseconds to execute and returned 17 distinct values. I ran this same query multiple times.
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
=> op:where-distinct()
=> op:result()
Query 2: It took approximately 0.30 milliseconds to return the same result as Query 1.
xquery version "1.0-ml";
cts:values(cts:path-reference("/tXML/Item/PutawayCategory"))
I don't understand why the Optic query takes longer to execute than the CTS query.
Please help me understand this.
Change your Optic query to use op:group-by("PutawayCategory") instead of op:where-distinct(), and it should perform much better.
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
=> op:group-by("PutawayCategory")
=> op:result()
With op:from-lexicons, Optic emits rows based on co-occurrence of lexicon values within the same document, similar to cts:value-tuples.
This means that op:from-lexicons() returns a row for every occurrence of a value, so a value that is present in multiple documents can be returned multiple times rather than as part of a distinct list. op:where-distinct() then has to filter and de-duplicate those rows, which consumes CPU and time. The larger the set of values, the more work (and time) op:where-distinct() requires.
cts:values() pulls a distinct list of values directly from the path range index lexicon, so there is less work to do.
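To get a rough feel for how many duplicate rows are involved, you can list each distinct value alongside its frequency. This is only a sketch: it assumes the default fragment-frequency behaviour of cts:values(), and the idea that the sum of these counts approximates the number of rows Optic has to de-duplicate is an inference from the explanation above, not something confirmed by MarkLogic.

```xquery
xquery version "1.0-ml";

(: Sketch only: list each distinct PutawayCategory value with its frequency. :)
(: With the default fragment-frequency option, cts:frequency() reports how   :)
(: many fragments contain the value.                                         :)
let $ref := cts:path-reference("/tXML/Item/PutawayCategory")
for $value in cts:values($ref)
return fn:concat($value, ": ", cts:frequency($value))
```

If the 17 values are spread over a large number of documents, the counts will be correspondingly high, which would be consistent with op:where-distinct() having far more to process than cts:values().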
There may be a way for MarkLogic to optimize the Optic query with op:where-distinct(). If you have access to MarkLogic Support, it would be helpful if you created a Support case inquiring about it.