2017-10-25 7 views

Duplicated Hive table is much larger than the original

I have a table table1, and I ran "create table table2 as select * from table1 where partition_key is not null;" to duplicate it. table1 is only 463.2 GB, but table2 turns out to be 2.8 TB. Why did this happen?

PS: I just listed the partitions, and it seems that table1 and table2 are partitioned differently. So I am adding to my question: how do you copy a table and keep its original partitioning information?

table1: hdfs dfs -du -s -h /user/hive/warehouse/map_services.db/userhistory1/*

7.9 G 23.7 G /user/hive/warehouse/map_services.db/userhistory/datestr=1970-01-01 
25.7 G 77.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-01 
18.8 G 56.3 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-02 
16.8 G 50.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-03 
17.5 G 52.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-04 
18.0 G 53.9 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-05 
22.4 G 67.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-06 
27.3 G 81.8 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-07 

table2: hdfs dfs -du -s -h /user/hive/warehouse/map_services.db/userhistory2/*

929.2 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000000_0 
651.1 M 1.9 G /user/hive/warehouse/map_services.db/userhistory2/000001_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000002_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000003_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000004_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000005_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000006_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000007_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000008_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000009_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000010_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000011_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000012_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000013_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000014_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000015_0 
1.2 G 3.6 G /user/hive/warehouse/map_services.db/userhistory2/000016_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000017_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000018_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000019_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000020_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000021_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000022_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000023_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000024_0 
908.7 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000025_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000026_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000027_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000028_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000029_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000030_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000031_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000032_0 
1.6 G 4.8 G /user/hive/warehouse/map_services.db/userhistory2/000033_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000034_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000035_0 
940.0 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000036_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000037_0 
1.2 G 3.6 G /user/hive/warehouse/map_services.db/userhistory2/000038_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000039_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000040_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000041_0 
1.1 G 3.4 G /user/hive/warehouse/map_services.db/userhistory2/000042_0 
1.0 G 3.1 G /user/hive/warehouse/map_services.db/userhistory2/000043_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000044_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000045_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000046_0 
1.5 G 4.5 G /user/hive/warehouse/map_services.db/userhistory2/000047_0 
1.1 G 3.3 G /user/hive/warehouse/map_services.db/userhistory2/000048_0 
706.3 M 2.1 G /user/hive/warehouse/map_services.db/userhistory2/000049_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000050_0 
1.5 G 4.6 G /user/hive/warehouse/map_services.db/userhistory2/000051_0 
872.2 M 2.6 G /user/hive/warehouse/map_services.db/userhistory2/000052_0 
1.2 G 3.5 G /user/hive/warehouse/map_services.db/userhistory2/000053_0 
1.2 G 3.7 G /user/hive/warehouse/map_services.db/userhistory2/000054_0 
943.9 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000055_0 
1.6 G 4.7 G /user/hive/warehouse/map_services.db/userhistory2/000056_0 
1.5 G 4.4 G /user/hive/warehouse/map_services.db/userhistory2/000057_0 
1.3 G 4.0 G /user/hive/warehouse/map_services.db/userhistory2/000058_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000059_0 
961.5 M 2.8 G /user/hive/warehouse/map_services.db/userhistory2/000060_0 
1.3 G 3.8 G /user/hive/warehouse/map_services.db/userhistory2/000061_0 
1.4 G 4.3 G /user/hive/warehouse/map_services.db/userhistory2/000062_0 
1.4 G 4.2 G /user/hive/warehouse/map_services.db/userhistory2/000063_0 
1.4 G 4.1 G /user/hive/warehouse/map_services.db/userhistory2/000064_0 
924.4 M 2.7 G /user/hive/warehouse/map_services.db/userhistory2/000065_0 
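
When comparing the two listings, it helps to know which column a quoted total refers to. In `hdfs dfs -du` output of this shape, the first value is the file size and the second is the disk space consumed including all replicas. The sketch below sums both columns for the table1 rows shown above; it assumes every row is reported in G, which holds for those rows but not for the table2 listing, where some rows are in M.

```shell
#!/bin/sh
# Rows copied from the table1 listing above: size, unit, replicated size, unit, path.
listing='7.9 G 23.7 G /user/hive/warehouse/map_services.db/userhistory/datestr=1970-01-01
25.7 G 77.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-01
18.8 G 56.3 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-02
16.8 G 50.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-03
17.5 G 52.5 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-04
18.0 G 53.9 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-05
22.4 G 67.1 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-06
27.3 G 81.8 G /user/hive/warehouse/map_services.db/userhistory/datestr=2017-10-07'

# Field 1 is the file size, field 3 the space consumed including replicas.
# Assumption: both are in G on every row, so no unit conversion is done.
totals=$(printf '%s\n' "$listing" |
  awk '{ size += $1; disk += $3 } END { printf "%.1f %.1f", size, disk }')

echo "size_G replicated_G: $totals"
# prints: size_G replicated_G: 154.4 462.9
```

The partitions add up to about 154.4 G of data and about 462.9 G on disk, so the 463.2 GB quoted for table1 appears to be the replicated figure (154.4 × 3 = 463.2). Any comparison with table2 should use the same column on both sides.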

Answer


Your target table is not compressed and not partitioned.

To create a table with the same partitioning, use this command:

create table table2 like table1; 

Turn on compression before inserting:

SET hive.exec.compress.output=true; 

Then overwrite the table using dynamic partitions:

set hive.exec.dynamic.partition=true; 
set hive.exec.dynamic.partition.mode=nonstrict; 

insert overwrite table table2 partition(partition_key) 
select * from table1; 
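
The steps above can be collected into a single script. This is a minimal sketch, not a tested procedure for any particular cluster: `copy_table.hql` is a hypothetical file name, `table1`, `table2` and `partition_key` are the placeholder names from the question, and the final `SHOW PARTITIONS` check is an addition that is not part of the original answer. The script only writes the file when the `hive` CLI is not on the path.

```shell
#!/bin/sh
# Write the copy steps into one HiveQL file (hypothetical name: copy_table.hql).
cat > copy_table.hql <<'EOF'
-- Empty table with the same columns and partition keys as table1.
CREATE TABLE table2 LIKE table1;

-- Compress the files written by the insert.
SET hive.exec.compress.output=true;

-- Let Hive pick the target partition from each row's partition_key value.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- SELECT * on a partitioned table returns the partition column last,
-- which is where a dynamic-partition insert expects it.
INSERT OVERWRITE TABLE table2 PARTITION (partition_key)
SELECT * FROM table1;

-- Added check: the copy should list the same partitions as the original.
SHOW PARTITIONS table2;
EOF

# Run the file only where the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f copy_table.hql
else
  echo "hive CLI not found; statements written to copy_table.hql"
fi
```

Running `hdfs dfs -du -s -h` on the new table directory afterwards should show one `partition_key=...` subdirectory per partition rather than a flat list of `0000NN_0` files.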