javaweb-crawlergoogle-crawlersstormcrawler

Unable to install Stormcrawler error with connection refusal port 7071


I am installing Stormcrawler on Ubuntu everything worked but unable to inject the seeds.txt file. when i Run the injector using this command "java -cp target/crawler-1.0-SNAPSHOT.jar crawlercommons.urlfrontier.client.Client PutURLs -f seeds.txt"

Got this error:

$ java -cp  target/crawler-1.0-SNAPSHOT.jar crawlercommons.urlfrontier.client.Client PutURLs -f seeds.txt
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:539)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:487)
at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:471)
at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:435)
at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:468)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:563)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:744)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:7071
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.newConnectException0(Errors.java:155)
at io.grpc.netty.shaded.io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:128)
at io.grpc.netty.shaded.io.netty.channel.unix.Socket.finishConnect(Socket.java:321)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:710)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:687)
at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:477)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:385)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:1583)

I have checked all the services like Kibana and Elasticsearch etc everything is working fine.


Solution

  • You are following the steps to inject into URLFrontier but want to use Elasticsearch.

    Either you go for URLFrontier, in which case you use the 'basic' archetype and follow the steps from its README, or, you use the ElasticSearch archetype and the README it generates.

    Sounds like you ran

    mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-archetype -DarchetypeVersion=2.10
    

    but wanted to do

    mvn archetype:generate -DarchetypeGroupId=com.digitalpebble.stormcrawler -DarchetypeArtifactId=storm-crawler-elasticsearch-archetype -DarchetypeVersion=2.10