2017-07-25

Hive logic to get min time, max time and other columns

I have data in the following format:

+---------------------+-------------------------+-------------------------+-----------+------+
| id                  | start time              | end time                | direction | name |
+---------------------+-------------------------+-------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 15:10:28.677 | 2015-06-02 15:32:22.677 | 3         | xyz  |
| 9202340753368000000 | 2015-06-02 14:55:37.353 | 2015-06-02 15:12:18.84  | 1         | xyz  |
+---------------------+-------------------------+-------------------------+-----------+------+

and I need output with the minimum start time, the maximum end time, the direction value that belongs to the minimum start time, and the name:

+---------------------+-------------------------+-------------------------+-----------+------+
| id                  | start time              | end time                | direction | name |
+---------------------+-------------------------+-------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 14:55:37.353 | 2015-06-02 15:32:22.677 | 1         | xyz  |
+---------------------+-------------------------+-------------------------+-----------+------+

I tried:

select x.id, min(x.start_time) as mintime, max(x.end_time) as maxtime, y.direction, y.name
from dir_samp x
inner join (
    select id, start_time, end_time, name, direction,
           rank() over (partition by id order by start_time asc) as r
    from dir_samp
) y on x.id = y.id
where y.r = 1
group by x.id, y.direction, y.name

Is there a more efficient way to do this? Please advise.

Thanks

Answers

1
select  id
       ,min_vals.start_time
       ,end_time
       ,min_vals.direction
       ,min_vals.name

from   (select  id
               ,min(named_struct('start_time',start_time,'direction',direction,'name',name)) as min_vals
               ,max(end_time) as end_time

        from    dir_samp

        group by id
       ) t
;

+---------------------+----------------------------+----------------------------+-----------+------+
| id                  | start_time                 | end_time                   | direction | name |
+---------------------+----------------------------+----------------------------+-----------+------+
| 9202340753368000000 | 2015-06-02 14:55:37.353000 | 2015-06-02 15:32:22.677000 | 1         | xyz  |
+---------------------+----------------------------+----------------------------+-----------+------+
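
The query above appears to work because Hive compares structs field by field in declaration order, so `min` over a struct whose first field is `start_time` returns the fields of the row with the earliest start. As a rough illustration of that idea outside Hive, the sketch below mimics it with Python tuples, which also compare element by element. The helper name `summarize`, the `parse` function and the in-memory rows are assumptions made for this example and are not part of the answer.

```python
from collections import defaultdict
from datetime import datetime

def parse(ts):
    # Hypothetical helper: turn the sample timestamps into comparable datetime values.
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")

# Sample rows copied from the question: (id, start_time, end_time, direction, name).
ROWS = [
    (9202340753368000000, parse("2015-06-02 15:10:28.677"), parse("2015-06-02 15:32:22.677"), 3, "xyz"),
    (9202340753368000000, parse("2015-06-02 14:55:37.353"), parse("2015-06-02 15:12:18.84"), 1, "xyz"),
]

def summarize(rows):
    """Return one (id, min start, max end, direction, name) tuple per id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)

    result = []
    for key, members in groups.items():
        # Tuples compare element by element, so the tuple with the earliest
        # start_time is the minimum; direction and name come along with it.
        start, direction, name = min((r[1], r[3], r[4]) for r in members)
        # The latest end_time is computed independently over the whole group.
        end = max(r[2] for r in members)
        result.append((key, start, end, direction, name))
    return result
```

Calling `summarize(ROWS)` yields a single row for the sample id, with the start time, direction and name of the 14:55:37.353 row and the end time 15:32:22.677.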
1

You don't need the inner join:

select y.id, min(y.start_time) as mintime, 
     max(y.end_time) maxtime , 
     max(case when y.r=1 then y.direction end) as direction, 
     max(case when y.r=1 then y.name end) as name 
from 
( 
select id, start_time, end_time, name, direction , 
    rank() over (partition by id order by start_time asc) as r 
    from dir_samp 
) y 
group by y.id;
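
One caveat with this version: `rank()` assigns the same rank to rows that tie on `start_time`, so several rows per `id` can have `r = 1`, and each `max(case ...)` then picks its largest value among those rows independently. The Python sketch below models that behaviour on made-up rows; the tie data, the integer timestamps and the helper name `rank_then_aggregate` are assumptions for illustration only. If a single consistent row is required, using `row_number()` instead of `rank()` would be one option.

```python
from collections import defaultdict

def rank_then_aggregate(rows):
    """Model of the query: rows are (id, start_time, end_time, direction, name)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)

    out = {}
    for key, members in groups.items():
        earliest = min(r[1] for r in members)
        # rank() = 1 for every row that shares the earliest start_time.
        first = [r for r in members if r[1] == earliest]
        out[key] = (
            earliest,                     # min(start_time)
            max(r[2] for r in members),   # max(end_time)
            max(r[3] for r in first),     # max(case when r=1 then direction end)
            max(r[4] for r in first),     # max(case when r=1 then name end)
        )
    return out

# Made-up rows: the first two tie on start_time (integers stand in for timestamps).
TIED = [
    (1, 10, 20, 3, "abc"),
    (1, 10, 25, 1, "xyz"),
    (1, 15, 30, 2, "def"),
]
```

For `TIED`, the result combines direction 3 from the first row with name "xyz" from the second, which shows how tied rows can be mixed in the output.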