
How to save a list of URIs matching a pattern in MarkLogic with CoRB?


I need some help with MarkLogic, XQuery, and CoRB.

I have millions of documents in the database, and I'm trying to write XQuery that saves the matching URIs.

urisVersions.xqy

xquery version "1.0-ml";
let $uris := cts:uri-match("*versions/*version-*")

return (fn:count($uris), $uris)

urisSave.xqy

xquery version "1.0-ml";
declare variable $URI as xs:string external;

let $uri := $URI 
return xdmp:save("/tmp/test",$uri)

save-job.properties

XCC-CONNECTION-URI= xcc://user:admin@localhost:8000/
URIS-MODULE=urisVersions.xqy|ADHOC
XQUERY-MODULE=urisSave.xqy|ADHOC
THREAD-COUNT=10

I'm getting the error below:

SEVERE: fatal error com.marklogic.developer.corb.CorbException: Invalid argument type at URI: /12312/versions/item/papkov.xml.version-24

Can anyone please help me to solve this issue?


Solution

  • Configure the job with the PROCESS-TASK option set to the com.marklogic.developer.corb.ExportBatchToFileTask class, which writes the results of each process module invocation to an output file. You can control the output directory and filename with the EXPORT-FILE-DIR and EXPORT-FILE-NAME options. If you set only EXPORT-FILE-NAME and leave EXPORT-FILE-DIR unset, the file is written relative to the directory from which CoRB is launched.

    PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
    EXPORT-FILE-NAME=versionsURIs.txt
    

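    The error comes from xdmp:save, which expects a node as its second argument but receives the $URI string. Change your process module to simply return the $URI value: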

    xquery version "1.0-ml";
    declare variable $URI as xs:string external;
    $URI
    

    If you just want to write the URIs to a file and aren't transforming or doing any processing, then you could use the ModuleExecutor class and have it write the results of the cts:uri-match directly to the output file.
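
    As a rough sketch of that ModuleExecutor variant, the options file might look like the one below. It assumes that urisVersions.xqy is changed to return only $uris (the leading count is something the CoRB URIS-MODULE needs, and here it would just end up in the file), that ModuleExecutor accepts the PROCESS-MODULE and EXPORT-FILE-NAME options in your CoRB version, and that you launch com.marklogic.developer.corb.ModuleExecutor as the main class instead of the regular CoRB job. Check the option names against the documentation for the version you run.

    # Hypothetical ModuleExecutor options file; connection string reused from the question
    XCC-CONNECTION-URI=xcc://user:admin@localhost:8000/
    # Run the cts:uri-match module once; it should return only the URIs, not the count
    PROCESS-MODULE=urisVersions.xqy|ADHOC
    # Results of the module are written to this file
    EXPORT-FILE-NAME=versionsURIs.txt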