hive hortonworks-sandbox apache-tez

How do I fix this Kryo exception when using a UDF in Hive?


I have a Hive query that worked in the Hortonworks 2.6 sandbox, but it fails on sandbox version 3.0 with this exception:

Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 95
Serialization trace:
parentOperators (org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkLongOperator)
childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
        at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:137)
        at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
        at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:185)

How do I fix it?

I have seen some answers that suggest running set hive.exec.parallel=false; but that does not help: I still get the same error.

I checked the versions of the libraries I use and made sure that the versions reported by hadoop version and hive --version match the library versions in my jar.

I also tried this: https://community.hortonworks.com/content/supportkb/150199/orgapachehivecomesotericsoftwarekryokryoexception-1.html but it did not work either.


Solution

  • I was finally able to run my queries after I reduced the size of my udf.jar from 150 MB to 50 KB. It seems to be a Kryo bug; I got this information from here: https://github.com/EsotericSoftware/kryo/issues/307

    I reduced the size of my udf.jar by marking all dependencies as provided, so that they are no longer bundled into the jar. I went from this:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.1.1</version>
    </dependency>
    

    to this:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>3.1.1</version>
        <scope>provided</scope> <!--Notice this line-->
    </dependency>
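
    The same scope applies to every other dependency that the cluster already ships. As a minimal sketch, assuming the UDF also compiles against hive-exec (this artifact and its version are assumptions, not taken from my actual pom.xml; match the version to what hive --version reports on your sandbox):

    ```xml
    <!-- Assumed example: hive-exec is already on the Hive classpath at runtime,
         so it only needs to be available at compile time. -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>3.1.0</version> <!-- assumption: use your cluster's Hive version -->
        <scope>provided</scope>
    </dependency>
    ```

    After rebuilding, the jar should contain only the UDF classes and any libraries that the cluster does not already provide.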
    

    I believe this is a Kryo bug, because I was able to run the same query with that large udf.jar on Hortonworks 2.6.

    I hope someone finds this information valuable.