I am using below two queries(Optic and CTS)
to get the values for path-range index /tXML/Item/PutawayCategory
.
Query 1: - It took approx. 4 milliseconds
to execute and return 17 distinct values. I tried executing this same query multiple times.
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
=> op:where-distinct()
=> op:result()
Query 2: - It took approx. 0.30 milliseconds
to get the same result as Query 1
xquery version "1.0-ml";
cts:values(cts:path-reference("/tXML/Item/PutawayCategory"))
I am not getting why Optic Query
is taking more time to execute than cts
query.
Please help me to understand this.
Change your optic query to use op:group-by("PutawayCategory")
instead of op:where-distinct()
and it should perform much better.
xquery version "1.0-ml";
import module namespace op="http://marklogic.com/optic" at "/MarkLogic/optic.xqy";
op:from-lexicons(map:entry("PutawayCategory", cts:path-reference("/tXML/Item/PutawayCategory")))
=> op:group-by("PutawayCategory")
=> op:result()
With op:from-lexicon
Optic emits rows based on co-occurrence of lexicon values within the same document similar to
cts:value-tuples
.
This means that op:from-lexicons()
is returning every instance of the values that are present in multiple documents, and can be returned multiple times, not a distinct list, and then op:where-distinct()
is filtering and de-duplicating, which consumes CPU and time. The larger the set of values, the more work (and time) that op:where-distinct()
has to do.
cts:values()
is pulling a distinct list of values directly from the path-range-index lexicon, so there is less work to do.
There may be a way for MarkLogic to optimize the Optic query with op:where-distinct()
. If you have access to MarkLogic Support, it would be helpful if you created a Support case inquiring about it.