I am trying to use this example of Crate with Common Crawl: https://github.com/crate/crate-commoncrawl
I have set up Crate and created the table schema following the instructions in the example.
I am accessing the Crate admin UI at http://localhost:4200/_plugin/crate-admin,
as I am running it on my own machine.
The only issue I am facing is with the COPY statement.
Here is the line:
COPY commoncrawl FROM 'ccrawl://cr8.is/1WSiodP';
It fails with an exception. Here is the error and its trace:
COPY ERROR (0.000 sec)
Error!
SQLActionException[MalformedURLException: unknown protocol: ccrawl]
Error Trace:
SQLActionException: INTERNAL_SERVER_ERROR 5000 MalformedURLException: unknown protocol: ccrawl
at java.net.URL.<init>(URL.java:600)
at java.net.URL.<init>(URL.java:490)
at java.net.URL.<init>(URL.java:439)
at java.net.URI.toURL(URI.java:1089)
at io.crate.operation.collect.files.URLFileInput.getStream(URLFileInput.java:52)
at io.crate.operation.collect.files.FileReadingCollector.readLines(FileReadingCollector.java:228)
at io.crate.operation.collect.files.FileReadingCollector.doCollect(FileReadingCollector.java:205)
at io.crate.operation.collect.MapSideDataCollectOperation$1$1.run(MapSideDataCollectOperation.java:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am using Ubuntu 16.04. I do not understand what the problem is. Any help would be appreciated.
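It looks like the crate-commoncrawl plugin was not installed correctly: the error shows that Crate does not recognize the ccrawl scheme, which the plugin is meant to provide. See https://github.com/crate/crate-commoncrawl#build--install for the build and install steps.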
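
To see where the message comes from, here is a minimal sketch in plain Java, independent of Crate. It assumes a stock JDK with no extra URL handlers registered; the class name ProtocolCheck and the method describe are made up for this illustration. Converting a URI with an unregistered scheme to a java.net.URL fails in the same way as the URI.toURL frame in the trace, which suggests Crate fell back to its generic URL input (URLFileInput) because nothing on the node claimed the ccrawl scheme. How the plugin hooks the scheme into Crate is not shown here.

```java
import java.net.MalformedURLException;
import java.net.URI;

public class ProtocolCheck {

    // Hypothetical helper: returns "ok" if the JDK can turn the location
    // into a java.net.URL, otherwise the exception message.
    static String describe(String location) {
        try {
            // Same call as the URI.toURL frame in the stack trace above.
            URI.create(location).toURL();
            return "ok";
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A scheme the plain JDK knows nothing about.
        System.out.println(describe("ccrawl://cr8.is/1WSiodP"));
        // A built-in scheme, for comparison.
        System.out.println(describe("http://localhost:4200/_plugin/crate-admin"));
    }
}
```

If the first call reports an unknown protocol while the second succeeds, the COPY statement itself is fine and the missing piece is the plugin on the Crate node. After rebuilding and installing it as described in the README, restart Crate before retrying the COPY.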