MarkLogic version: 9.0-6.2, mlcp version: 9.0.6
I am trying to import an XML file into MarkLogic using mlcp with the script below.
#!/bin/bash
mlcp.sh import -ssl \
-host localhost \
-port 8010 \
-username uname \
-password pword \
-mode local \
-input_file_path /data/testsource/*.XML \
-input_file_type documents \
-aggregate_record_namespace "http://new.webservice.namespace" \
-output_collections testcol \
-output_uri_prefix /testuri/ \
-transform_module /ext/ingesttransform.sjs
The script runs successfully with a small file but fails with a 'Java heap space' error when run with a large file (450 MB).
ERROR contentpump.MultithreadedMapper: Error closing writer: Java heap space
How can we resolve this error?
The mlcp job is designed to send the whole input file as one single document (-input_file_type documents), about 500 MB in size, into the transform module. The transform module contains logic to emit a URI and a value (content.uri and content.value) for each aggregate element. This results in the Java heap space error even though the heap space available on the server is around 3.4 GB.
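Since the error is logged by contentpump.MultithreadedMapper, the heap in question may be that of the mlcp client JVM rather than the memory free on the server. A minimal sketch of how the client heap ceiling could be raised and inspected follows. It assumes that mlcp.sh passes the JVM_OPTS environment variable through to the java command, and the 4g value is purely illustrative, not a tested recommendation.

```shell
#!/bin/bash
# Assumption: mlcp.sh forwards JVM_OPTS to the JVM it starts.
# -Xmx4g is an illustrative ceiling; choose a value that fits the machine.
export JVM_OPTS="-Xmx4g"

# Optional check: print the maximum heap a JVM would use with these options.
# Skipped quietly when java is not on the PATH.
if command -v java >/dev/null 2>&1; then
  java $JVM_OPTS -XX:+PrintFlagsFinal -version 2>/dev/null | grep -i 'MaxHeapSize' || true
fi

# The same mlcp.sh import command shown above would then be rerun from this shell.
```

If the assumption about JVM_OPTS holds, the import would run with the larger ceiling. Holding a single document of several hundred MB in memory may still need a multiple of its file size, so this might not be sufficient on its own.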
I tried two different designs.
Both approaches work, but the second may create documents as large as 500 MB (I believe the size limit is 512 MB). So I opted for the first approach (I also need a better URI than the default one created by mlcp).