Hive small file merging
Last updated on November 22, 2024 pm
🧙 Questions
Test merging the small files of a Hive table.
☄️ Ideas
hadoop fs -ls /user/hive/warehouse/ispong_db.db/users
-rwxrwxrwt 3 isxcode hive 26 2024-08-31 22:34 /user/hive/warehouse/ispong_db.db/users/part-00000-119d68c8-a1eb-45b3-9b98-bec1c4ad7358-c000
-rwxrwxrwt 3 isxcode hive 10 2024-08-31 22:11 /user/hive/warehouse/ispong_db.db/users/part-00000-19c574c2-e9b5-44aa-8b88-f6ea96067da4-c000
-rwxrwxrwt 3 isxcode hive 10 2024-08-31 22:15 /user/hive/warehouse/ispong_db.db/users/part-00000-92298f3e-61c2-4774-a8ed-bdeb4ae7c9f2-c000
-rwxrwxrwt 3 isxcode hive 20 2024-09-01 13:28 /user/hive/warehouse/ispong_db.db/users/part-00000-961c0d9f-ad37-4d6d-8d4d-9681b86bb949-c000
-rwxrwxrwt 3 isxcode hive 20 2024-08-21 11:31 /user/hive/warehouse/ispong_db.db/users/part-00000-a0c98e57-a1a7-4227-b2dc-51e8966f6fe1-c000
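To make the small-file problem visible at a glance, the listing can be piped through a short summary. This is a minimal sketch, assuming the usual 8-column output of hadoop fs -ls; summarize_listing is a hypothetical helper name, not a Hadoop command.

```shell
# Summarize a `hadoop fs -ls` listing read from stdin.
# Assumed column layout: perms, replication, owner, group, size, date, time, path.
# Directory rows (perms starting with "d") and the "Found N items" header are skipped.
summarize_listing() {
  awk '
    NF >= 8 && $1 ~ /^-/ { count++; total += $5 }
    END {
      avg = (count > 0) ? int(total / count) : 0
      printf "files=%d total_bytes=%d avg_bytes=%d\n", count, total, avg
    }
  '
}

# Usage (needs a Hadoop client on PATH):
#   hadoop fs -ls /user/hive/warehouse/ispong_db.db/users | summarize_listing
```

For the five files listed above this prints files=5 total_bytes=86 avg_bytes=17.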
insert into users (username,age) values('ispong',14);
select * from users;
-rwxrwxrwt 3 isxcode hive 10 2024-10-12 18:32 /user/hive/warehouse/ispong_db.db/users/000000_0
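CONCATENATE merges a table's files in place, but only for RCFile and ORC tables:
ALTER TABLE users CONCATENATE;
On a table stored in any other format, Hive rejects the statement with this error (message quoted verbatim):
Only RCFile and ORCFile Formats are supportted right now.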
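Before trying CONCATENATE it can help to check the table's storage format. The sketch below reads the output of DESCRIBE FORMATTED from stdin and looks at the InputFormat row. It assumes the row is labelled "InputFormat:" followed by the class name, as in the Hive CLI (a beeline-style "|" column separator is tolerated). can_concatenate is a hypothetical helper name, not a Hive command.

```shell
# Report whether a table's InputFormat is one that CONCATENATE accepts.
# Input: `DESCRIBE FORMATTED <table>` output on stdin.
# Assumption: the row looks like "InputFormat:   <class name>".
can_concatenate() {
  input_format=$(awk '{
    for (i = 1; i <= NF; i++) {
      if ($i == "InputFormat:") {
        # Take the next field that is not a table border.
        for (j = i + 1; j <= NF; j++) if ($j != "|") { print $j; exit }
      }
    }
  }')
  case "$input_format" in
    *OrcInputFormat|*RCFileInputFormat) echo "yes $input_format" ;;
    "") echo "unknown" ;;
    *) echo "no $input_format" ;;
  esac
}

# Usage (needs the hive CLI on PATH):
#   hive -e "DESCRIBE FORMATTED ispong_db.users" | can_concatenate
```

A "no" answer means the rewrite approach with INSERT OVERWRITE is the one to use.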
Create a temporary (staging) table with the same schema:
create table users_result like users;
Re-inserting the data into it compacts the files:
-- SET hive.merge.mapfiles=true;
-- SET hive.merge.mapredfiles=true;
-- SET hive.merge.size.per.task=256000000;
-- SET hive.merge.smallfiles.avgsize=16000000;
INSERT OVERWRITE TABLE users_result SELECT * FROM users;
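The hive.merge.smallfiles.avgsize setting is a threshold: when the average size of a job's output files is below it, Hive runs an extra merge step. The sketch below applies the same comparison to a directory listing to estimate whether a table's files would count as small. It is only an approximation, since Hive evaluates job output rather than an existing directory, and below_merge_threshold is a hypothetical helper name.

```shell
# Compare the average file size in a `hadoop fs -ls` listing (stdin)
# against a threshold in bytes ($1, default 16000000, the value used for
# hive.merge.smallfiles.avgsize above).
# Not part of Hive or Hadoop; an illustrative check only.
below_merge_threshold() {
  awk -v threshold="${1:-16000000}" '
    NF >= 8 && $1 ~ /^-/ { count++; total += $5 }
    END {
      if (count == 0) { print "no files"; exit }
      avg = total / count
      if (avg < threshold) printf "merge: avg %d < %d\n", avg, threshold
      else                 printf "skip: avg %d >= %d\n", avg, threshold
    }
  '
}

# Usage (needs a Hadoop client on PATH):
#   hadoop fs -ls /user/hive/warehouse/ispong_db.db/users | below_merge_threshold
```

Running the same check on the users_result directory after the INSERT OVERWRITE shows whether the rewrite actually reduced the number of files.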
🔗 Links
Hive small file merging
https://ispong.isxcode.com/hadoop/hive/hive 小文件合并/