Tags: java, sql, crate, common-crawl, nosql

Crate Common Crawl Example not working


I am trying to run this example of Crate with Common Crawl: https://github.com/crate/crate-commoncrawl
I have set up Crate and created the table schema by following the instructions from the example. Since I am working on my own system, I access Crate at http://localhost:4200/_plugin/crate-admin.

The only issue I am facing is with the COPY statement. This is the line:

COPY commoncrawl FROM 'ccrawl://cr8.is/1WSiodP';

It fails with an "unknown protocol" exception. Here are the error and its stack trace:

COPY ERROR (0.000 sec)
Error!

SQLActionException[MalformedURLException: unknown protocol: ccrawl] 

Error Trace:

SQLActionException: INTERNAL_SERVER_ERROR 5000 MalformedURLException: unknown protocol: ccrawl
    at java.net.URL.<init>(URL.java:600)
    at java.net.URL.<init>(URL.java:490)
    at java.net.URL.<init>(URL.java:439)
    at java.net.URI.toURL(URI.java:1089)
    at io.crate.operation.collect.files.URLFileInput.getStream(URLFileInput.java:52)
    at io.crate.operation.collect.files.FileReadingCollector.readLines(FileReadingCollector.java:228)
    at io.crate.operation.collect.files.FileReadingCollector.doCollect(FileReadingCollector.java:205)
    at io.crate.operation.collect.MapSideDataCollectOperation$1$1.run(MapSideDataCollectOperation.java:135)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I am using Ubuntu 16.04. I do not understand what is causing this problem. Any help would be appreciated.


Solution

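  • It looks like the crate-commoncrawl plugin was not installed correctly, which would explain why the ccrawl:// scheme is not recognized. See the build and install instructions: https://github.com/crate/crate-commoncrawl#build--install.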
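
The "unknown protocol" message in the trace comes from the JDK itself, not from Crate: URI.toURL() fails whenever no stream handler is available for the URI's scheme. The following is a minimal standalone sketch of that behavior, assuming a plain JVM with no Crate plugin on the classpath. The class name UnknownProtocolDemo and the helper tryToUrl are hypothetical and are not part of Crate or of the crate-commoncrawl plugin.

```java
import java.net.MalformedURLException;
import java.net.URI;

public class UnknownProtocolDemo {

    // Hypothetical helper: returns the exception message if the URI cannot be
    // converted to a java.net.URL, or null if the conversion succeeds.
    static String tryToUrl(String uri) {
        try {
            // Same call that appears in the stack trace (URI.toURL).
            URI.create(uri).toURL();
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // No handler is known for "ccrawl" in a plain JVM, so this fails.
        System.out.println("ccrawl: " + tryToUrl("ccrawl://cr8.is/1WSiodP"));
        // "http" has a built-in handler, so this succeeds and prints null.
        System.out.println("http:   " + tryToUrl("http://localhost:4200/"));
    }
}
```

If the first call reports a failure here as well, the error is consistent with the ccrawl scheme simply being unknown to the JVM that runs Crate. The plugin presumably makes that scheme available once it is installed correctly, so reinstalling it as described in the linked instructions is the natural next step.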