hive, hiveql, druid

How can I order the query result from a Hive external Druid table?


First off, I am relatively new to Hive and Druid. I have already set up a Hive external table that is connected to a Druid datasource. Simple SELECT queries work fine. Example:

SELECT id FROM druidtable;
result:
+------------+
| id         |
+------------+
| 10001      |
| 10000      |
+------------+

Now I want to add an ORDER BY id clause (SELECT id FROM druidtable order by id). However, this fails with what looks like a connection error.

Stack trace:

INFO  : Compiling command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d): SELECT id FROM druidtable order by id
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:id, type:string, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d); Time taken: 0.364 seconds
INFO  : Executing command(queryId=hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d): SELECT id FROM druidtable order by id
INFO  : Query ID = hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Subscribed to counters: [] for queryId: hive_20190730090350_28947166-ba7e-418a-bcaa-e548c3bd333d
INFO  : Session is already open
INFO  : Dag name: SELECT id FROM druidtable...id (Stage-1)
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1564380041255_0060_17_00, diagnostics=[Vertex vertex_1564380041255_0060_17_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: druidtable initializer failed, vertex=vertex_1564380041255_0060_17_00 [Map 1], java.io.IOException: java.io.IOException: org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Faulty channel in resource pool
        at org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.submitRequest(DruidStorageHandlerUtils.java:326)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.fetchLocatedSegmentDescriptors(DruidQueryBasedInputFormat.java:262)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.distributeScanQuery(DruidQueryBasedInputFormat.java:225)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:166)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:100)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.druid.org.jboss.netty.channel.ChannelException: Faulty channel in resource pool
        at org.apache.hive.druid.com.metamx.http.client.NettyHttpClient.go(NettyHttpClient.java:143)
        at org.apache.hive.druid.com.metamx.http.client.AbstractHttpClient.go(AbstractHttpClient.java:14)
        at org.apache.hadoop.hive.druid.DruidStorageHandlerUtils.submitRequest(DruidStorageHandlerUtils.java:324)
        ... 20 more
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:8082
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
        at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
        at org.apache.hive.druid.org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at org.apache.hive.druid.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.apache.hive.druid.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        ... 3 more

        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.fetchLocatedSegmentDescriptors(DruidQueryBasedInputFormat.java:264)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.distributeScanQuery(DruidQueryBasedInputFormat.java:225)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:166)
        at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:100)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
]

The connection error is printed about three times.


Solution

  • OK, I got it now. The session configuration was wrong: hive.druid.broker.address.default was set to localhost, whereas it should have been the actual IP address of the broker. This matches the "Connection refused: localhost/127.0.0.1:8082" at the bottom of the stack trace.
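
As a minimal sketch of the fix described above, the property can be inspected and overridden for the current session before rerunning the query. The address 10.0.0.5:8082 is a placeholder, not a value from the original post; substitute the host and port of your own Druid broker. This assumes the property may be changed at session level in your deployment. If it may not, the same value would need to go into the Hive configuration instead.

```sql
-- Print the value currently in effect for this session.
SET hive.druid.broker.address.default;

-- Point Hive at the real Druid broker.
-- 10.0.0.5:8082 is a hypothetical address used for illustration only.
SET hive.druid.broker.address.default=10.0.0.5:8082;

-- Rerun the query that previously failed.
SELECT id FROM druidtable ORDER BY id;
```

If the first statement still shows a localhost address after the override, the setting is probably being applied somewhere other than the session running the query.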