2017-02-10

Nested JSON in Hive

I have a column day_data that holds data in JSON format. How can I get the expected output below with Hive?

Input:

{"_id":"1","name":"abc","attribs":[{"minutes":0,"name":"sedentary"},{"minutes":0,"name":"lightly"},{"minutes":0,"name":"fairly"},{"minutes":28,"name":"very"}],"validated":true}

Output:

id name attrib_minutes attrib_name validated
1 abc 0 sedentary true
1 abc 0 lightly true
1 abc 0 fairly true
1 abc 28 very true

I am able to extract the id, name and validated fields with get_json_object:

select get_json_object(day_data,'$._id')       as id
      ,get_json_object(day_data,'$.name')      as name
      ,get_json_object(day_data,'$.validated') as validated
from   temp_table
;

How do I extract the nested JSON attributes (attrib_minutes and attrib_name)?

Answer

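select j.id
      ,j.name
      -- e.i is the 0-based position of each attribs element; it is used to index back into the array
      ,get_json_object (day_data,concat('$.attribs[',e.i,'].minutes')) as attrib_minutes
      ,get_json_object (day_data,concat('$.attribs[',e.i,'].name'))    as attrib_name
      ,j.validated

from     temp_table t
     -- json_tuple reads the top-level fields in one call
     lateral view json_tuple (day_data,'_id','name','validated') j as id,name,validated
     -- splitting the list of names yields one row per attribs element; only the position i is used, x is ignored
     lateral view posexplode (split(get_json_object (day_data,'$.attribs[*].name'),'","')) e as i,x
;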

j.id j.name attrib_minutes attrib_name j.validated 
1 abc 0 sedentary true 
1 abc 0 lightly true 
1 abc 0 fairly true 
1 abc 28 very true
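
To see why the query above produces one row per attribs element, it may help to look at the intermediate values it is built from. The following is only an illustrative sketch, assuming the same temp_table and day_data column as in the question; the aliases names_json, name_parts and first_minutes are made up for this example.

```sql
-- Inspect the building blocks used by the posexplode approach.
select
    -- all names of the attribs array, returned as one JSON array string
    get_json_object(day_data, '$.attribs[*].name')               as names_json
    -- the same string cut at every "," boundary: one piece per attribs element
   ,split(get_json_object(day_data, '$.attribs[*].name'), '","') as name_parts
    -- a single element addressed by its 0-based position
   ,get_json_object(day_data, '$.attribs[0].minutes')            as first_minutes
from temp_table
;
```

The pieces in name_parts would still carry the leading and trailing bracket and quote characters, so they are not used as values. Only their positions matter: posexplode turns them into the index i, and the actual minutes and name are then read back with get_json_object at that index.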