[SOLVED] Difference between CORB and MLCP MarkLogic

Are there any differences between CORB and MLCP MarkLogic?

I see they do the same kind of job. In which scenarios would you use one rather than the other?

Solution

CoRB and MLCP are both Java-based tools that communicate with MarkLogic via the XCC protocol.

There is a lot of overlap in functionality. Both can be used to load data into the database, perform bulk transformations of documents, export data, and generate reports.

MLCP is a supported product offering produced by MarkLogic.
CoRB is an open source community effort.

MLCP knows how to produce and consume MarkLogic archives, which makes it easy to copy data between clusters.
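
As a rough illustration of the archive round trip, the sketch below assembles the argument lists for an `mlcp.sh export` to an archive and a matching `import` on another cluster. The host names, port, credentials, and directory are placeholders (assumptions, not taken from the question), and the program only prints the commands rather than running them. Check the option names against the MLCP guide for your version before relying on them.

```java
import java.util.ArrayList;
import java.util.List;

public class MlcpArchiveCommands {

    /** Connection options shared by both commands; all values are placeholders. */
    static List<String> connection(String host) {
        return List.of("-host", host, "-port", "8000",
                       "-username", "admin", "-password", "admin");
    }

    /** Arguments for exporting a database to an archive in a local directory. */
    static List<String> exportArchive(String host, String dir) {
        List<String> cmd = new ArrayList<>(List.of("mlcp.sh", "export"));
        cmd.addAll(connection(host));
        cmd.addAll(List.of("-output_file_path", dir, "-output_type", "archive"));
        return cmd;
    }

    /** Arguments for importing that archive into another cluster. */
    static List<String> importArchive(String host, String dir) {
        List<String> cmd = new ArrayList<>(List.of("mlcp.sh", "import"));
        cmd.addAll(connection(host));
        cmd.addAll(List.of("-input_file_path", dir, "-input_file_type", "archive"));
        return cmd;
    }

    public static void main(String[] args) {
        // Print the two commands; nothing is executed here.
        System.out.println(String.join(" ", exportArchive("source-host", "/tmp/archive")));
        System.out.println(String.join(" ", importArchive("target-host", "/tmp/archive")));
    }
}
```

MLCP also has a `copy` command that moves documents directly from one database to another, which avoids the intermediate archive when both clusters are reachable from the same machine.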

CoRB provides a lot of pre-built functionality, but it is also possible to customize behaviors by "plugging in" your own Java tasks or XQuery/JavaScript modules instead of using the pre-built ones that are provided.

Both provide an engine for executing bulk tasks against MarkLogic, customizable through properties, command-line switches, and custom JavaScript or XQuery modules.
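
To make "properties and command-line switches" a little more concrete for CoRB, here is a minimal sketch that holds one set of job options and renders it both ways: as lines for a properties file and as `-D` system-property switches. The option names (`XCC-CONNECTION-URI`, `URIS-MODULE`, `PROCESS-MODULE`, `THREAD-COUNT`) are CoRB options as far as I know; the connection URI and the module file names `get-uris.xqy` and `transform.xqy` are hypothetical placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CorbOptionsSketch {

    /** Options for a hypothetical job; every value here is a placeholder. */
    static Map<String, String> jobOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("XCC-CONNECTION-URI", "xcc://user:password@localhost:8000/Documents");
        options.put("URIS-MODULE", "get-uris.xqy|ADHOC");      // selects the URIs to process
        options.put("PROCESS-MODULE", "transform.xqy|ADHOC");  // runs once per selected URI
        options.put("THREAD-COUNT", "8");
        return options;
    }

    /** Renders the options as key=value lines, one per line. */
    static String asPropertiesFile(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    /** Renders the same options as -Dkey=value switches for the java command. */
    static String asSystemProperties(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> "-D" + e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        Map<String, String> options = jobOptions();
        System.out.println(asPropertiesFile(options));
        System.out.println();
        System.out.println(asSystemProperties(options));
    }
}
```

If I remember the CoRB documentation correctly, a properties file like the first rendering can be handed to the job through the `OPTIONS-FILE` option, and the `-D` form is passed to `java` when launching `com.marklogic.developer.corb.Manager`. Verify both against the CoRB README for the release you use.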

In many cases, either tool can accomplish the work, and the choice comes down to personal preference or expertise.

A high-level overview of features, showing some similarities and differences:

Feature	CoRB	MLCP
Uses XCC protocol	✅	✅
Java-based	✅	✅
Command-line utility	✅	✅
Execute XQuery modules	✅	✅
Execute JavaScript modules	✅	✅
Execute custom Java tasks	✅	❌
Multiple customizable stages of processing	✅	❌
Import from CSV	✅	✅
Import files from directory	✅	✅
Import files from zip	✅	✅
Import XML file (splitting into multiple documents)	✅	✅
Import MarkLogic Archive	❌	✅
Export MarkLogic Archive	❌	✅
Bulk reprocess database records	✅	✅
Produce CSV	✅	✅
Dedup and sort exported text file	✅	❌
Export documents	✅	✅
Export as zip	✅	✅
Bulk schema validation	✅	❌
Web UI and endpoints to display status and dynamically adjust threads or pause/resume jobs	✅	❌
Manually adjust threads or pause/resume jobs	✅	❌
Auto-scaling to adjust threads	❌	✅
MarkLogic supported product	❌	✅
Apache 2 open source license	✅	✅