Question on using cts:search over FLOWR. I have a xqy that runs over all docs in the database and checks an element that has a timestamp. We created that timestamp on insertion of the doc. Goal is to delete documents older then x days.
Now I need to know how many docuements I have that are older then x days so I can try a CORB job to delete them.
My query so far works:
xquery version "1.0-ml";
declare namespace j = "http://marklogic.com/xdmp/json/basic";
declare namespace dikw = 'http://www.example.com/dikw_functions.xqy';
(:let $foo := cts:uris((),(), cts:not-query(cts:element-query(xs:QName("j:dikwmetadata"), cts:element-query(xs:QName("j:data"), cts:and-query(()))))):)
let $uris := cts:uri-match("/twitter/*")[1 to 10]
let $today := fn:current-date()
let $days := xs:dayTimeDuration("P30D")
let $today_minus_x := xs:dateTime($today - $days)
for $uri in $uris (:cts:search(doc(), $random-query):)
let $doc_dikw_date := xdmp:parse-dateTime("[Y0001]-[M01]-[D01] [h01]",xs:string(fn:doc($uri)//j:dikwmetadata//j:timestamp))
let $to_old := if ($today_minus_x >= $doc_dikw_date)
then
true() (: deleted document:)
else
false()
return ($uri,$to_old)
This works ok but I need to know how many there are to see if I can run it from the query console or that I need to set up a sheduled CORB job running every day.
I was looking into cts:search something like:
(:
let $uris2 := cts:search($uris,cts:query(xdmp:parse-dateTime("[Y0001]-[M01]-[D01] [h01]",xs:string(fn:doc($uris)//j:dikwmetadata//j:timestamp))) < $today_minus_x)
:)
But this seems to need elements ... no I am stuck.
Questions: is there a more straightforward way to find and count all documents older then x days?
One of the problem with your current code is that you are parsing dates at run-time. That is always going to be slow, because it needs access to the XML itself.
This would work best if your j:timestamp element would contain a string matching xs:date or xs:dateTime. Then you can declare a (path) range index on that element of type date/dateTime (whatever suits you best).
Alternative is to create something like iso-date(Time) attribute on that element containing a preparsed date of type xs:date(Time), so you can index that one.
Once you have a range index, you can do a (path-)range-query on your element. You could then also use cts:uris to get the docs that need to be deleted..
HTH!