Tags: marklogic, apache-nifi, marklogic-dhf

Set Headers when Ingesting from NiFi into MarkLogic Data Hub


When I ingest a document into the MarkLogic Data Hub, some headers are created automatically in the JSON document. For example:

"headers": {
  "sources": [
    {
      "name": "customer-db-a"
    }
  ],
  "createdOn": "2020-03-11T13:31:28.6069705+01:00",
  "createdBy": "admin"
}

Is it possible to set the source header dynamically when I ingest from Apache NiFi (with the mlRunIngest transformation)? I would like to reuse the same ingestion step for multiple sources.


Solution

  • In NiFi, the MarkLogic processor accepts a custom property named "trans:options", whose value is a JSON object that carries the headers. The sources can be defined in those headers.

    Additionally, the placeholders currentDateTime and currentUser can be used in the headers to set the current timestamp and the current user:

    {
        "headers" : {
          "sources" : [
            {
                "name": "my-nifi-source"
            }
          ],
          "createdOn" : "currentDateTime",
          "createdBy" : "currentUser"
        }
    }
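
To reuse the same ingestion step for several sources, the value of "trans:options" has to differ per source while the rest of the headers stays the same. The following is only a minimal sketch of how that JSON value could be generated, not part of NiFi or the Data Hub: the helper name build_trans_options and the second source name are hypothetical, and how the resulting string is fed into the processor property depends on your flow.

```python
import json


def build_trans_options(source_name: str) -> str:
    """Return a JSON string suitable as the value of "trans:options".

    Hypothetical helper for illustration; only the JSON layout comes
    from the answer above.
    """
    options = {
        "headers": {
            # One entry per source; here a single, per-flow source name.
            "sources": [{"name": source_name}],
            # Placeholders mentioned above, passed through unchanged.
            "createdOn": "currentDateTime",
            "createdBy": "currentUser",
        }
    }
    return json.dumps(options)


if __name__ == "__main__":
    # "customer-db-b" is an assumed second source, used only as an example.
    for source in ("customer-db-a", "customer-db-b"):
        print(build_trans_options(source))
```

Each printed line is one complete property value, so only the source name changes between flows.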