I'm trying to use a regular expression with -output_uri_replace, and it is failing.
Here is my options file:
import
-host
localhost
-port
8877
-username
xxxx
-password
xxxx
-input_file_path
to-import
-output_uri_replace
^.*(/[^/]+/),$1
I consistently get this error:
$ import
java.lang.IllegalArgumentException: Invalid option argument for output_uri_replace :^.*(/[^/]+/),$1
    at com.marklogic.contentpump.Command.applyCommonOutputConfigOptions(Command.java:2422)
    at com.marklogic.contentpump.Command$1.applyConfigOptions(Command.java:470)
    at com.marklogic.contentpump.Command$1.createJob(Command.java:370)
    at com.marklogic.contentpump.ContentPump.runCommand(ContentPump.java:238)
    at com.marklogic.contentpump.ContentPump.main(ContentPump.java:74)
which is obviously not very helpful. I don't know why it doesn't report the error from the regex engine, which is capable of giving useful messages.
I tried with and without quotes. No difference.
Just to prove it works, here is what happens in Groovy (which uses the same regex engine as Java)
This is exactly the result I want. And by the way, wouldn't this be a common case? You have a bunch of stuff in the directory immediately below and you want to lop off everything but that?
Couldn't there be a working example in the documentation?
And why the @#$% doesn't this work in MLCP when it's clearly a valid regex?
There isn't anything wrong with your regex.
It is just that the -options_file options have an odd requirement/expectation (maybe a bug) that the entire option value be wrapped in double quotes, and also that the replacement value be wrapped in single quotes.
https://docs.marklogic.com/guide/mlcp/import#id_42798
The -output_uri_replace option accepts a comma delimited list of regular expression and replacement string pairs. The string portion must be enclosed in single quotes:
I don't think it is very clear that the double quotes wrapping the option value are required, and the error doesn't help you understand that.
I noticed that in the source code where that exception is thrown, it is looking for single quotes wrapping the replacement value:
    // Replacement string is expected to be in ''
    for (int i = 0; i < replace.length - 1; i++) {
        String replacement = replace[++i].trim();
        if (!replacement.startsWith("'") ||
            !replacement.endsWith("'")) {
            throw new IllegalArgumentException(
                "Invalid option argument for "
                + OUTPUT_URI_REPLACE + " :" + uriReplace);
        }
    }
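To make the effect of that check concrete, here is a minimal standalone sketch that mirrors it. It is not MLCP's code: the class name ReplaceArgCheck and the method isValid are made up for illustration, and splitting the value on commas is my assumption, since the excerpt above doesn't show how the replace array is built.

```java
// Hypothetical stand-in for the check quoted above; not MLCP's actual code.
class ReplaceArgCheck {

    // Returns true when every replacement (second item of each
    // pattern,replacement pair) is wrapped in single quotes.
    static boolean isValid(String uriReplace) {
        // Assumption: the value is split on commas into pattern/replacement pairs.
        String[] replace = uriReplace.split(",");
        if (replace.length % 2 != 0) {
            return false; // a pattern without a replacement
        }
        for (int i = 1; i < replace.length; i += 2) {
            String replacement = replace[i].trim();
            if (replacement.length() < 2
                    || !replacement.startsWith("'")
                    || !replacement.endsWith("'")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] candidates = {
            "^.*(/[^/]+/),$1",      // unquoted replacement
            "^.*(/[^/]+/),'$1'",    // replacement in single quotes
            "^.*(/[^/]+/)','$1",    // outer quotes stripped from a fully quoted value
            "foo,bar",
            "foo,'bar'"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isValid(candidate));
        }
    }
}
```

Under those assumptions, only the values whose replacement both starts and ends with a single quote pass, which lines up with the errors described in this answer.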
So, I tried wrapping the replacement value in single quotes: ^.*(/[^/]+/),'$1'
But then it complained about a malformed option, and I see it was making calls to OptionsFileUtil.removeQuoteCharactersIfNecessary()
So, then I tried wrapping both values in single quotes: '^.*(/[^/]+/)','$1'
But then it complained about invalid argument and the quotes were goofy:
IllegalArgumentException: Invalid option argument for output_uri_replace :^.*(/[^/]+/)','$1
I also verified this with simple strings (foo,bar) and saw the same results.
Once I wrapped the entire value in double quotes and the replacement value in single quotes:
-output_uri_replace
"^.*(/[^/]+/),'$1'"
I was able to get it to execute without error.
I was also able to pass the -output_uri_replace option via the command line, and it works (as long as you wrap the replacement value in single quotes):
mlcp.bat -options_file options.txt -output_uri_replace "^.*(/[^/]+/),'$1'"
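As a sanity check that the pattern itself does what the question wants, here is a small sketch using plain java.util.regex via String.replaceAll. The class name UriReplaceDemo and the sample path are made up for illustration, and this only exercises Java's regex engine; it doesn't claim to show how MLCP applies the replacement internally.

```java
// Hypothetical demo of the pattern from the question; not MLCP code.
class UriReplaceDemo {

    // Greedy ".*" swallows everything up to the last "/dir/" segment,
    // and "$1" puts that last directory segment back.
    static String rewrite(String uri) {
        return uri.replaceAll("^.*(/[^/]+/)", "$1");
    }

    public static void main(String[] args) {
        // Sample path is an assumption, chosen only to illustrate the rewrite.
        System.out.println(rewrite("/home/user/to-import/docs/a.xml"));
        // prints "/docs/a.xml"
    }
}
```

So everything above the immediate parent directory is dropped and the file name is kept. Note that the single quotes around $1 are only part of MLCP's option syntax; they are not part of the Java replacement string.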