Our cluster runs CDH 6.1.1, but because the Impala shipped with this version cannot automatically refresh the Hive catalog, I upgraded Hive to the Hive 2.1.1 build from CDH 6.3.2. Queries run fine on the MR engine, but as soon as I use the Spark engine they fail with the following error:
23/11/20 11:39:28 ERROR rpc.RpcDispatcher: [Remote Spark Driver to HiveServer2 Connection] Received error message: io.netty.handler.codec.DecoderException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: Index: 109, Size: 6
Serialization trace:
shuffleWriteMetrics (org.apache.hive.spark.client.metrics.Metrics)
metrics (org.apache.hive.spark.client.BaseProtocol$JobMetrics)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:459)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: Index: 109, Size: 6
Serialization trace:
shuffleWriteMetrics (org.apache.hive.spark.client.metrics.Metrics)
metrics (org.apache.hive.spark.client.BaseProtocol$JobMetrics)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:103)
at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
... 16 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 109, Size: 6
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:60)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:857)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:729)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
... 25 more
.
23/11/20 11:39:28 WARN client.RemoteDriver: Shutting down driver because Remote Spark Driver to HiveServer2 connection was closed.
23/11/20 11:39:28 INFO client.RemoteDriver: Shutting down Spark Remote Driver.
23/11/20 11:39:28 INFO scheduler.DAGScheduler: Asked to cancel job 0
23/11/20 11:39:28 ERROR client.RemoteDriver: Failed to run client job 0b078ec8-394e-4c6d-b3ba-15b76a1a0457
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:206)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:222)
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:157)
at org.apache.spark.SimpleFutureAction.ready(FutureAction.scala:173)
at org.apache.spark.SimpleFutureAction.ready(FutureAction.scala:162)
at org.apache.spark.util.ThreadUtils$.awaitReady(ThreadUtils.scala:243)
at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:329)
at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:342)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:404)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:365)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
23/11/20 11:39:28 INFO cluster.YarnClusterScheduler: Cancelling stage 1
23/11/20 11:39:28 INFO cluster.YarnClusterScheduler: Killing all running tasks in stage 1: Stage cancelled
Symptoms:
1. select * from test runs without any problem.
2. A simple count on the same table hits the error above.
Both observations fit the stack trace: a plain select * is typically served by a direct fetch task and never launches a Spark job, while the count does launch one, and it is the job-metrics message (BaseProtocol$JobMetrics) coming back from the Remote Spark Driver that fails to deserialize.
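For reference, a hypothetical way to reproduce both symptoms from the command line via beeline (the JDBC URL is a placeholder; the table name test is the one from this post):

# Adjust the JDBC URL to your HiveServer2.
beeline -u jdbc:hive2://localhost:10000 -e "set hive.execution.engine=spark; select * from test;"        # returns rows normally
beeline -u jdbc:hive2://localhost:10000 -e "set hive.execution.engine=spark; select count(*) from test;" # fails with the Kryo DecoderException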
Analysis: after the Hive upgrade, the two sides of the Remote Spark Driver to HiveServer2 RPC no longer run the same hive-exec build. HiveServer2 uses the CDH 6.3.2 classes, while Spark still loads the old CDH 6.1.1 hive-exec jar, so the shaded Kryo on each side serializes the Metrics/JobMetrics classes with incompatible layouts, which is consistent with the IndexOutOfBoundsException thrown during decoding.
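A quick way to confirm the mismatch before changing anything is to compare the two jars. This is a minimal sketch assuming the standard CDH parcel layout; the lib/hive/lib path is the usual default and is not shown in the post itself:

# Compare the hive-exec build HiveServer2 loads with the one bundled for Spark.
ls /opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec-*.jar
ls /opt/cloudera/parcels/CDH/lib/spark/hive/hive-exec-*.jar
# If the version suffixes differ (cdh6.3.2 vs cdh6.1.1), the two RPC ends disagree.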
Solution:
Perform the following steps on every node (a scripted version covering all nodes is sketched after the mv step below).
Go to Spark's hive directory:
[root@sztcyl-inte-hadoop-node01 hive]# cd /opt/cloudera/parcels/CDH/lib/spark/hive
Check the hive-exec version currently in place:
[root@sztcyl-inte-hadoop-node01 hive]# ll
total 34972
-rw-r--r-- 1 root root 35807551 Nov 20 13:54 hive-exec-2.1.1-cdh6.1.1.jar
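Optionally, you can verify that this jar actually contains the class named in the serialization trace. A hypothetical check, assuming hive-exec is the uber jar that shades the spark-client classes (as the org.apache.hive.-prefixed Kryo packages in the trace suggest):

# The failing class from the trace should live inside hive-exec.
unzip -l hive-exec-2.1.1-cdh6.1.1.jar | grep 'org/apache/hive/spark/client/metrics/Metrics.class'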
Copy in the hive-exec jar from CDH 6.3.2 (here staged under a lib632 directory inside the parcel, left over from the earlier Hive upgrade):
[root@sztcyl-inte-hadoop-node01 hive]# cp /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib632/hive-exec-2.1.1-cdh6.3.2.jar /opt/cloudera/parcels/CDH/lib/spark/hive
Move the CDH 6.1.1 hive-exec jar out of the way into a backup directory:
[root@sztcyl-inte-hadoop-node01 hive]# mv hive-exec-2.1.1-cdh6.1.1.jar /opt/hivebak/
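Since the swap has to happen on every node, here is a minimal sketch that scripts the same three operations over all hosts. It assumes passwordless ssh as root and a hosts.txt file listing the node hostnames; both are assumptions, not part of the original steps. The paths are exactly those used above.

#!/bin/bash
# Apply the hive-exec jar swap on each node listed in hosts.txt.
while read -r host; do
  ssh "root@${host}" '
    mkdir -p /opt/hivebak &&
    cp /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hive/lib632/hive-exec-2.1.1-cdh6.3.2.jar \
       /opt/cloudera/parcels/CDH/lib/spark/hive/ &&
    mv /opt/cloudera/parcels/CDH/lib/spark/hive/hive-exec-2.1.1-cdh6.1.1.jar /opt/hivebak/
  '
done < hosts.txt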
Restart the Spark service on the nodes.
Run the query again; the problem is resolved.
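As a final sanity check after the restart, confirm that only the 6.3.2 jar remains under Spark's hive directory and rerun the query that used to fail (same hypothetical beeline invocation as above):

ls /opt/cloudera/parcels/CDH/lib/spark/hive/hive-exec-*.jar
beeline -u jdbc:hive2://localhost:10000 -e "set hive.execution.engine=spark; select count(*) from test;"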