marklogic mlcp

Can we change the XML structure of a file during file ingestion in MarkLogic using mlcp?


I have an XML file to ingest into a MarkLogic database, and a new XML element has to be added to it. The requirement is to add that element only during the mlcp import. Is this possible in MarkLogic using XQuery?

XML file now -

<name>rashmita</name>
<employeeType>regular</employeeType>

XML after the change -

<name>rashmita</name>
<employeeType>regular</employeeType>
<role>developer</role>

Solution

  • Yes, it is possible to modify the payload on ingest with MLCP.

    Transforming Content During Ingestion

    You can create an XQuery or Server-Side JavaScript function and install it on MarkLogic Server to transform and enrich content before inserting it into the database. Your function runs on MarkLogic Server.

    Function Signature

    A custom transformation is an XQuery function module that conforms to the following interface. Your function receives a single input document, described by $content, and can generate zero, one, or many output documents.

    declare function yourNamespace:transform(
      $content as map:map,
      $context as map:map)
    as map:map*
    

    So, for your example (assuming that the actual docs are well-formed XML), the transform could look something like:

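    xquery version "1.0-ml";
    module namespace example = "http://marklogic.com/example";

    (: Appends <role>developer</role> to the root element of each XML document.
     : Documents without a root element (e.g. binary or text) pass through unchanged. :)
    declare function example:transform(
      $content as map:map,
      $context as map:map
    ) as map:map*
    {
      let $doc := map:get($content, "value")
      let $root := $doc/*
      let $_ :=
        if ($root)
        then
          map:put($content, "value",
            document {
              (: fn:node-name keeps the root's namespace; node() keeps text and comments :)
              element {fn:node-name($root)} {$root/@*, $root/node(), <role>developer</role>}
            })
        else ()
      return $content
    };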
    

    Using a Custom Transformation

    Once you install a custom transformation function on MarkLogic Server, you can apply it to your mlcp import or copy job using the following options:

    • transform_module - The path to the module containing your transformation. Required.
    • transform_namespace - The namespace of your transformation function. If omitted, no namespace is assumed.

    An example invocation setting those parameters:

    mlcp.sh import -mode local -host mlhost -port 8000 \
      -username user -password password \
      -input_file_path /space/mlcp-test/data \
      -transform_module /example/mlcp-transform.xqy \
      -transform_namespace "http://marklogic.com/example"
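
If the import is going to be run more than once, it may help to keep the invocation in a small wrapper script so the transform settings sit next to a comment explaining what they must match. The following is only a sketch under assumptions: `run_import` and the `DRY_RUN` switch are hypothetical names made up for this example, the connection values are the same placeholders as in the command above, and `mlcp.sh` is assumed to be on the PATH when the script runs for real. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch of a wrapper around the mlcp import shown above.
# All values are the placeholders from the example command.

ML_HOST="mlhost"
ML_PORT="8000"
ML_USER="user"
ML_PASSWORD="password"
INPUT_PATH="/space/mlcp-test/data"

# URI of the transform module as installed on MarkLogic Server.
TRANSFORM_MODULE="/example/mlcp-transform.xqy"
# Must match the "module namespace" declaration inside the module.
TRANSFORM_NAMESPACE="http://marklogic.com/example"

# run_import is a hypothetical helper name. With DRY_RUN=1 (the default
# here) it only prints the command; with DRY_RUN=0 it calls mlcp.sh.
run_import() {
  set -- import -mode local \
    -host "$ML_HOST" -port "$ML_PORT" \
    -username "$ML_USER" -password "$ML_PASSWORD" \
    -input_file_path "$INPUT_PATH" \
    -transform_module "$TRANSFORM_MODULE" \
    -transform_namespace "$TRANSFORM_NAMESPACE"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "mlcp.sh $*"
  else
    mlcp.sh "$@"
  fi
}

run_import
```

Running the script with `DRY_RUN=0` in the environment performs the actual import. Keeping the namespace in one variable makes it easier to spot a mismatch with the module's `module namespace` declaration, which is a likely cause of mlcp failing to find the transform function.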