Hive Issues
Last updated on August 8, 2025
🧙 Questions
- java.net.SocketTimeoutException: Read timed out
- Permission denied: user=ispong, access=WRITE
- Garbled Hive column comments
- Inaccurate row count statistics
☄️ Ideas
java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
Solution
This happens because the JDBC login timeout is set too short.
DriverManager.setLoginTimeout(100);
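For context, a minimal sketch of where the timeout call fits when opening a HiveServer2 JDBC connection; the URL, user, and password below are placeholder assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class HiveConnect {
    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Raise the login timeout (in seconds) before connecting; the default
        // can be too short for a slow HiveServer2 handshake and surfaces as
        // java.net.SocketTimeoutException: Read timed out
        DriverManager.setLoginTimeout(100);

        // Placeholder URL and credentials -- adjust to your cluster
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://cdh-master:10000/default", "ispong", "")) {
            System.out.println("connected: " + !conn.isClosed());
        }
    }
}
```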
Permission denied: user=ispong, access=WRITE
2021-04-21 11:22:04.478 INFO 2743 --- [io-30104-exec-8] c.d.d.a.m.h.c.HdfsFIieController : uploadHdfsFile==> Permission denied: user=ispong, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:256)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1839)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1798)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.resolvePathForStartFile(FSDirWriteFileOp.java:323)
at org.apache.ha
2022-03-22 22:36:48 ERROR:Cannot create staging directory 'hdfs://cdh-master:8020/user/hive/warehouse/ispong_test02.db/ods_s_xy_app_count/.hive-staging_hive_2022-03-22_22-30-01_832_8923653986493560028-6': Permission denied: user=root, access=WRITE, inode="/user/hive/warehouse/ispong_test02.db/ods_s_xy_app_count":ispong:hive:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:256)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1855)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1839)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1798)
at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:60)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3101)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1123)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:696)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
2022-03-22 22:36:48 ERROR Current task status: ERROR
Solution
Hive users are backed by HDFS users, and HDFS permissions map onto Linux users and groups.
supergroup is the default HDFS superuser group, but the corresponding Linux group is not created automatically.
This problem can be fixed by creating the supergroup group and adding the user to it.
# Create the supergroup group
sudo groupadd supergroup
# Add the ispong user to the supergroup group
sudo usermod -a -G supergroup ispong
# Switch the current session into the new group
newgrp supergroup
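Alternatively, client code can open the FileSystem as a user that already has WRITE on the target path. A minimal sketch, assuming the hdfs superuser and the NameNode address from the logs above; the target path is hypothetical:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteAsUser {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Open the filesystem as a user with WRITE on the target inode;
        // "hdfs" is an assumption -- use the owner shown in the error message
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://cdh-master:8020"), conf, "hdfs");

        // Hypothetical target path for illustration
        try (FSDataOutputStream out = fs.create(new Path("/tmp/upload-test.txt"))) {
            out.writeUTF("hello hdfs");
        }
        fs.close();
    }
}
```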
Inaccurate row count statistics
The row count reported by desc formatted does not match the table's actual size; the count Hive returns from stale statistics is inaccurate.
Solution
-- Recompute the table statistics with ANALYZE (this form does not support partitioned tables)
analyze table users compute statistics;
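The same refresh can be scripted over JDBC and cross-checked against a full scan; a sketch under the same placeholder connection settings as above:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RefreshTableStats {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://cdh-master:10000/default", "ispong", "");
             Statement stmt = conn.createStatement()) {
            // Recompute the statistics that desc formatted reports
            stmt.execute("ANALYZE TABLE users COMPUTE STATISTICS");

            // A full count(*) scan is the ground truth to compare against
            try (ResultSet rs = stmt.executeQuery("SELECT count(*) FROM users")) {
                if (rs.next()) {
                    System.out.println("actual rows: " + rs.getLong(1));
                }
            }
        }
    }
}
```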
Dirty data
Locating the bad data
- Check the logs; most cases are misaligned timestamps or numbers
- Binary-search the data file to narrow down the bad rows
^@ NU 0x00  0 NULL (NUL)                 \u0000
^A SH 0x01  1 START OF HEADING (SOH)     \u0001
^B SX 0x02  2 START OF TEXT (STX)        \u0002
^C EX 0x03  3 END OF TEXT (ETX)          \u0003
^D ET 0x04  4 END OF TRANSMISSION (EOT)  \u0004
^E EQ 0x05  5 ENQUIRY (ENQ)              \u0005
^F AK 0x06  6 ACKNOWLEDGE (ACK)          \u0006
^G BL 0x07  7 BELL (BEL)                 \u0007
^H BS 0x08  8 BACKSPACE (BS)             \u0008
^I HT 0x09  9 CHARACTER TABULATION (HT)  \u0009
^J LF 0x0a 10 LINE FEED (LF)             \u000A
^K VT 0x0b 11 LINE TABULATION (VT)       \u000B
^L FF 0x0c 12 FORM FEED (FF)             \u000C
^M CR 0x0d 13 CARRIAGE RETURN (CR)       \u000D
<!-- dirty data sample -->
162554448^B18757471839^BLJ12FKT36M4012088^B21322337L22230TB2021301SEG^B1^B2022-03-29 23:59:54^B^B\N
162554062^B15168374597^BLJ11KBBD6L1330035^B20A1379^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^B1^B2022-03-29 23:52:27^B^B\N
Create a table to hold the bad rows
drop table if exists ads_t_user_record_define;
CREATE TABLE IF NOT EXISTS ads_t_user_record_define(
  id INT comment 'primary key ID',
  user_name STRING comment 'user name',
  vin STRING comment 'VIN of the vehicle logged into',
  sn STRING comment 'SN of the MP5 unit in the vehicle logged into',
  user_type INT comment 'user type',
  login_time TIMESTAMP comment 'login time',
  content STRING comment 'content',
  solution STRING comment 'solution'
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002' LINES TERMINATED BY '\n' NULL DEFINED AS '\\N' STORED AS TEXTFILE;
-- Load the bad rows into the new table
insert overwrite table ads_t_user_record_define
select id, user_name, vin, sn, user_type, login_time, content, solution from ads_t_user_record t
where t.id in ('162554448', '162554062');
# Download the data file
hadoop fs -get /user/hive/warehouse/ispong_test.db/ads_t_user_record_define/part-00000-4878ea6d-62e3-4bb8-98e6-a0b2757f3957-c000 ~/
-- Strip NUL characters; the backslash is doubled so the regex engine receives \u0000
select regexp_replace(sn, '\\u0000', '') from ads_t_user_record_define limit 10;
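To pinpoint the offending rows in the downloaded file directly, a small scanner can flag every line that contains an unexpected control character; a minimal sketch (the file path argument is the part file fetched above):

```java
import java.io.BufferedReader;
import java.io.FileReader;

public class FindDirtyLines {
    public static void main(String[] args) throws Exception {
        // Path of the part file downloaded with hadoop fs -get
        String path = args.length > 0 ? args[0] : "part-00000";
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                for (char c : line.toCharArray()) {
                    // \u0002 is the legitimate field delimiter in this table;
                    // any other character below 0x20 (except tab) is suspect
                    if (c < 0x20 && c != '\t' && c != '\u0002') {
                        System.out.printf("line %d contains control char 0x%02x%n",
                                lineNo, (int) c);
                        break;
                    }
                }
            }
        }
    }
}
```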
The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:232)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82)
... 44 more
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.connectionpool.AbstractConnectionPoolFactory.loadDriver(AbstractConnectionPoolFactory.java:58) ~[datanucleus-rdbms-4.1.7.jar:?]
at org.datanucleus.store.rdbms.connectionpool.BoneCPConnectionPoolFactory.createConnectionPool(BoneCPConnectionPoolFactory.java:54) ~[datanucleus-rdbms-4.1.7.jar:?]
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213) ~[datanucleus-rdbms-4.1.7.jar:?]
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117) ~[datanucleus-rdbms-4.1.7.jar:?]
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.<init>(ConnectionFactoryImpl.java:82) ~[datanucleus-rdbms-4.1.7.jar:?]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_181]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_181]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_181]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:330) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:203) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.store.AbstractStoreManager.<init>(AbstractStoreManager.java:162) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:284) ~[datanucleus-rdbms-4.1.7.jar:?]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_181]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_181]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_181]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:420) ~[datanucleus-core-4.1.6.jar:?]
at org.datanucleus.api.jdo.JD
Solution
# Check the logs to find the affected server, then copy the MySQL driver over
scp -P 62222 dcloud@dehoop-cdh1:/usr/share/java/mysql-connector-java.jar /tmp
sudo cp /tmp/mysql-connector-java.jar /usr/share/java/mysql-connector-java.jar
sudo chmod 777 /usr/share/java/mysql-connector-java.jar
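To confirm the driver is now reachable, a quick classpath check; a minimal sketch:

```java
public class CheckMysqlDriver {
    public static void main(String[] args) {
        try {
            // The exact class name DataNucleus complained about
            Class.forName("com.mysql.jdbc.Driver");
            System.out.println("driver found on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("driver still missing: " + e.getMessage());
        }
    }
}
```

Run it with the copied jar on the classpath, e.g. `java -cp /usr/share/java/mysql-connector-java.jar:. CheckMysqlDriver`.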
HiveMetaStore fails to start
25/08/05 09:51:37 ERROR metastore.HiveMetaStore: [main]: MetaException(message:Version information not found in metastore. )
at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:8066)
at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:8043)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy26.verifySchema(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:654)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:647)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:716)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:419)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7028)
at org.apache.hadoop.hive.metast
Solution

-- The metastore schema is missing its version row; check the VERSION table
USE hive_metastore;
select * from VERSION;

-- If the row is missing, insert one matching the installed Hive version
INSERT INTO VERSION (VER_ID, SCHEMA_VERSION, VERSION_COMMENT, SCHEMA_LEVEL)
VALUES (1, '4.0.0', 'Hive release version 4.0.0', 0);
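To verify the fix from code, the VERSION row can be read back over JDBC; a sketch assuming the metastore database is MySQL on cdh-master, with placeholder credentials:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckMetastoreVersion {
    public static void main(String[] args) throws Exception {
        // Placeholder host and credentials -- point these at the metastore DB
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://cdh-master:3306/hive_metastore", "root", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT SCHEMA_VERSION FROM VERSION")) {
            if (rs.next()) {
                System.out.println("metastore schema version: " + rs.getString(1));
            } else {
                System.out.println("VERSION table is empty -- insert the row above");
            }
        }
    }
}
```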