Hive small file merging

Last updated on October 22, 2024

🧙 Questions

Test merging small files

☄️ Ideas

List the files in the table directory:
hadoop fs -ls /user/hive/warehouse/ispong_db.db/users
-rwxrwxrwt   3 dehoop hive         26 2024-08-31 22:34 /user/hive/warehouse/ispong_db.db/users/part-00000-119d68c8-a1eb-45b3-9b98-bec1c4ad7358-c000
-rwxrwxrwt   3 dehoop hive         10 2024-08-31 22:11 /user/hive/warehouse/ispong_db.db/users/part-00000-19c574c2-e9b5-44aa-8b88-f6ea96067da4-c000
-rwxrwxrwt   3 dehoop hive         10 2024-08-31 22:15 /user/hive/warehouse/ispong_db.db/users/part-00000-92298f3e-61c2-4774-a8ed-bdeb4ae7c9f2-c000
-rwxrwxrwt   3 dehoop hive         20 2024-09-01 13:28 /user/hive/warehouse/ispong_db.db/users/part-00000-961c0d9f-ad37-4d6d-8d4d-9681b86bb949-c000
-rwxrwxrwt   3 dehoop hive         20 2024-08-21 11:31 /user/hive/warehouse/ispong_db.db/users/part-00000-a0c98e57-a1a7-4227-b2dc-51e8966f6fe1-c000
Insert one more row through Hive, then list the directory again; a new small file has appeared:
insert into users (username,age) values('ispong',14);
select * from users;
-rwxrwxrwt   3 dehoop hive         10 2024-10-12 18:32 /user/hive/warehouse/ispong_db.db/users/000000_0
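
To quantify the problem, a listing like the one above can be summarized with a little awk. This is a minimal sketch that assumes the standard `hadoop fs -ls` column layout shown above (file lines start with `-`, size in the fifth column); the function name `summarize_listing` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize `hadoop fs -ls` output read from stdin.
# Assumes regular-file lines start with "-" and carry the size in column 5.
summarize_listing() {
  awk '
    /^-/ { files++; bytes += $5 }   # count regular files and sum their sizes
    END  { printf "files=%d bytes=%d\n", files, bytes }
  '
}

# Usage on a live cluster (path taken from the listing above):
#   hadoop fs -ls /user/hive/warehouse/ispong_db.db/users | summarize_listing
```

For the six files listed above this would report files=6 bytes=96, far below a single HDFS block.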

Note that CONCATENATE currently supports only the RCFile and ORCFile formats:

ALTER TABLE users CONCATENATE;
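
Because CONCATENATE is limited to those two formats, it can help to check the table's storage format before choosing an approach. The sketch below assumes the InputFormat class name has been copied by hand from the output of `DESCRIBE FORMATTED users;`. The function `merge_strategy` and its two return values are hypothetical names, not part of Hive.

```shell
#!/bin/sh
# Hypothetical helper: choose a merge approach from a table's InputFormat class.
# The two matched class names are the ones Hive uses for ORC and RCFile tables.
merge_strategy() {
  case "$1" in
    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat|org.apache.hadoop.hive.ql.io.RCFileInputFormat)
      echo "concatenate" ;;   # ALTER TABLE ... CONCATENATE is an option
    *)
      echo "rewrite" ;;       # fall back to rewriting the data with INSERT OVERWRITE
  esac
}

# A plain text table falls through to the rewrite path.
merge_strategy org.apache.hadoop.mapred.TextInputFormat   # prints "rewrite"
```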

Create a temporary (staging) table

create table users_result like users;

Re-inserting the data into it compacts the files

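-- Optional merge settings, left commented out in this test:
-- SET hive.merge.mapfiles=true;               -- merge small files at the end of a map-only job
-- SET hive.merge.mapredfiles=true;            -- merge small files at the end of a map-reduce job
-- SET hive.merge.size.per.task=256000000;     -- target size of the merged files, in bytes
-- SET hive.merge.smallfiles.avgsize=16000000; -- merge when the average output file size is below this, in bytes
INSERT OVERWRITE TABLE users_result SELECT * FROM users;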
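
The staging-table steps above can be wrapped in a small generator so the statements can be reviewed before they are run. This is a sketch only: `gen_merge_sql` is a hypothetical helper, the `_result` suffix is simply the naming used in the example, and enabling the two merge flags explicitly is an assumption (the original run left them commented out).

```shell
#!/bin/sh
# Hypothetical helper: print the HiveQL that rewrites a table into a staging copy.
# $1 = source table; the staging table is named <table>_result, as in the example.
gen_merge_sql() {
  table="$1"
  cat <<EOF
CREATE TABLE IF NOT EXISTS ${table}_result LIKE ${table};
-- assumption: enable merging explicitly instead of relying on cluster defaults
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
INSERT OVERWRITE TABLE ${table}_result SELECT * FROM ${table};
EOF
}

# Print the statements for the users table.
gen_merge_sql users
```

Saved to a file, for example with `gen_merge_sql users > merge_users.sql`, the statements could then be run with `hive -f merge_users.sql`.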

https://ispong.isxcode.com/hadoop/hive/hive 小文件合并/
Author: ispong
Posted on: October 12, 2024