First off, I'm relatively new to Hive and Druid. I have already set up a Hive external table that is connected to a Druid datasource, and simple SELECT queries work fine. Example:
SELECT id FROM druidtable;
result:
+------------+
| id         |
+------------+
| 10001      |
| 10000      |
+------------+
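For reference, a Hive external table over an existing Druid datasource is typically defined with the Druid storage handler, roughly like this (a sketch; the datasource name is just a placeholder, not my actual one):
CREATE EXTERNAL TABLE druidtable
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "mydatasource");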
Now I wanted to add an ORDER BY to the query:
SELECT id FROM druidtable ORDER BY id;
However, this fails with what looks like a connection error.
Stacktrace:
INFO : Compiling command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d): SELECT id FROM druidtable order by id
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:id, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d); Time taken: 0.364 seconds
INFO : Executing command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d): SELECT id FROM druidtable order by id
INFO : Query ID = hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d
INFO : Session is already open
INFO : Dag name: SELECT id FROM druidtable...id (Stage-1)
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1564380041255_0060_17_00, diagnostics=[Vertex vertex_1564380041255_0060_17_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: druidtable initializer failed, vertex=vertex_1564380041255_0060_17_00 [Map 1], java.io.IOException: java.io.IOException: org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Faulty channel in resource pool
at org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.submitRequest(DruidStorageHandlerUtils.java:326)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.fetchLocatedSegmentDescriptors(DruidQueryBasedInputFormat.java:262)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.distributeScanQuery(DruidQueryBasedInputFormat.java:225)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:166)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:100)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Faulty channel in resource pool
at org.apache.hive.druid.com.metamx.http.client.NettyHttpClient.go(NettyHttpClient.java:143)
at org.apache.hive.druid.com.metamx.http.client.AbstractHttpClient.go(AbstractHttpClient.java:14)
at org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.submitRequest(DruidStorageHandlerUtils.java:324)
... 20 more
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:8082
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
at org.apache.hive.druid.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.apache.hive.druid.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
... 3 more
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.fetchLocatedSegmentDescriptors(DruidQueryBasedInputFormat.java:264)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.distributeScanQuery(DruidQueryBasedInputFormat.java:225)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:166)
at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:100)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]
The connection error is printed about three times.
OK, I figured it out. A session configuration was wrong: hive.druid.broker.address.default was set to localhost, whereas it should have been the actual IP of the Druid broker. That is why the split generation tried to reach localhost/127.0.0.1:8082 and got "Connection refused".
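A minimal sketch of the fix from the Hive/Beeline session (the broker host and port below are placeholders, use your own; depending on your setup you may need to put this in hive-site.xml instead of setting it per session):
-- Show the current value (it was localhost:8082, matching the error above)
SET hive.druid.broker.address.default;
-- Point Hive at the real Druid broker (hypothetical address shown)
SET hive.druid.broker.address.default=10.0.0.5:8082;
-- The ORDER BY query now reaches the broker instead of localhost
SELECT id FROM druidtable ORDER BY id;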