2016-04-22
0

How do I get a list of data types of view columns?

I have 32,000 columns, and some of the views contain millions of rows, possibly more. @ulrich from the Teradata forum provided an almost perfect solution. The main idea is to create a volatile table and then use dynamic SQL to insert all the required information into it. I use a fully modified version of that solution.
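The dynamic SQL step could be sketched roughly as follows: given a list of (database, view, column) names, emit one INSERT per column that records the column's type. This is only a sketch. The helper names `sql_literal` and `build_type_inserts` are made up for illustration, and the target table `view_column_data_type` is assumed to exist already with four columns.

```python
def sql_literal(value):
    """Quote a value as a SQL string literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_type_inserts(columns, target="view_column_data_type"):
    """Return one INSERT statement per (database, view, column) triple.

    Hypothetical helper: it only builds the statement text, it does not
    talk to the database.
    """
    statements = []
    for db, view, col in columns:
        statements.append(
            "INSERT INTO {t} SELECT {d}, {v}, {c}, TYPE({db}.{view}.{col});".format(
                t=target,
                d=sql_literal(db),
                v=sql_literal(view),
                c=sql_literal(col),
                db=db,
                view=view,
                col=col,
            )
        )
    return statements
```

Calling `build_type_inserts([("db1", "tb1", "col1")])` yields a single statement of the same shape as the one explained under UPD1, without the DISTINCT.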

However, I cannot use this solution, because I run into a spool problem. The problem is that the query select type(databasename.tableName.columName) returns the type of the column n times, where n is the number of rows. The same happens with DISTINCT or GROUP BY 1 (the two are equivalent, because TD14 can choose between them on its own).

Has anything changed in the 4 years since then, as of TD v. 14.1?

UPD1

explain insert into view_column_data_type Select distinct'db1','tb1','col1',type(db1.tb1.col1); 

    1) First, we lock db1.o in view tb1 for access, 
    we lock db1.a in view tb1 for access, we 
    lock db1.o in view tb1 for access, and we 
    lock db1.a in view tb1 for access. 
    2) Next, we execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from db1.o in view 
      tb1 by way of an all-rows scan with no residual 
      conditions into Spool 11 (all_amps), which is redistributed 
      by the hash code of (db1.o.GUID) to all AMPs. The 
      size of Spool 11 is estimated with low confidence to be 
      74,480 rows (66,659,600 bytes). The estimated time for this 
      step is 0.13 seconds. 
     2) We do an all-AMPs RETRIEVE step from db1.a in view 
      tb1 by way of an all-rows scan with no residual 
      conditions into Spool 12 (all_amps), which is redistributed 
      by the hash code of (db1.a.GUID) to all AMPs. The 
      size of Spool 12 is estimated with low confidence to be 280 
      rows (256,200 bytes). The estimated time for this step is 
      0.13 seconds. 
    3) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 12 (Last Use) by way of an 
    all-rows scan. Spool 11 and Spool 12 are full outer joined using 
    a single partition hash join, with condition(s) used for 
    non-matching on right table ("NOT (GUID IS NULL)"), with a join 
    condition of ("GUID = GUID"). The result goes into Spool 10 
    (all_amps), which is built locally on the AMPs. The size of Spool 
    10 is estimated with low confidence to be 74,759 rows (
    134,491,441 bytes). The estimated time for this step is 0.84 
    seconds. 
    4) We do an all-AMPs STAT FUNCTION step from Spool 10 (Last Use) by 
    way of an all-rows scan into Spool 17 (Last Use), which is assumed 
    to be redistributed by value to all AMPs. The result rows are put 
    into Spool 15 (all_amps), which is built locally on the AMPs. The 
    size is estimated with low confidence to be 74,759 rows (
    72,890,025 bytes). 
    5) We do an all-AMPs STAT FUNCTION step from Spool 15 (Last Use) by 
    way of an all-rows scan into Spool 20 (Last Use), which is 
    redistributed by hash code to all AMPs. The result rows are put 
    into Spool 19 (all_amps), which is built locally on the AMPs. The 
    size is estimated with low confidence to be 74,759 rows (
    71,693,881 bytes). 
    6) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from Spool 19 (Last Use) by 
      way of an all-rows scan with a condition of ("(Field_20 <> 
      'D') OR (Field_21 = 1)") into Spool 9 (used to materialize 
      view, derived table, table function or table operator t3) 
      (all_amps), which is built locally on the AMPs. The size of 
      Spool 9 is estimated with low confidence to be 74,759 rows (
      69,600,629 bytes). The estimated time for this step is 4.66 
      seconds. 
     2) We do an all-AMPs RETRIEVE step from db1.o in view 
      tb1 by way of an all-rows scan with no residual 
      conditions into Spool 24 (all_amps), which is redistributed 
      by the hash code of (db1.o.MK) to all AMPs. Then 
      we do a SORT to order Spool 24 by row hash. The size of 
      Spool 24 is estimated with low confidence to be 280 rows (
      116,200 bytes). 
    7) We do an all-AMPs RETRIEVE step from Spool 24 by way of an 
    all-rows scan into Spool 25 (all_amps), which is duplicated on all 
    AMPs. The size of Spool 25 is estimated with low confidence to be 
    78,400 rows (32,536,000 bytes). The estimated time for this step 
    is 0.02 seconds. 
    8) We do an all-AMPs JOIN step from db1.a in view 
    tb1 by way of an all-rows scan with no residual 
    conditions, which is joined to Spool 25 (Last Use) by way of an 
    all-rows scan. db1.a and Spool 25 are left outer 
    joined using a product join, with condition(s) used for 
    non-matching on left table ("NOT (db1.a.GUID IS NULL)"), 
    with a join condition of ("GUID = db1.a.GUID"). The 
    result goes into Spool 26 (all_amps), which is redistributed by 
    the hash code of (db1.o.MK) to all AMPs. Then we do a 
    SORT to order Spool 26 by row hash. The size of Spool 26 is 
    estimated with low confidence to be 559 rows (245,401 bytes). 
    9) We do an all-AMPs JOIN step from Spool 26 (Last Use) by way of a 
    RowHash match scan, which is joined to Spool 24 (Last Use) by way 
    of a RowHash match scan. Spool 26 and Spool 24 are full outer 
    joined using a merge join, with a join condition of ("Field_1 = 
    Field_1"). The result goes into Spool 23 (all_amps), which is 
    built locally on the AMPs. The size of Spool 23 is estimated with 
    low confidence to be 559 rows (463,411 bytes). The estimated time 
    for this step is 0.03 seconds. 
10) We do an all-AMPs STAT FUNCTION step from Spool 23 (Last Use) by 
    way of an all-rows scan into Spool 31 (Last Use), which is 
    redistributed by hash code to all AMPs. The result rows are put 
    into Spool 29 (all_amps), which is built locally on the AMPs. The 
    size is estimated with low confidence to be 559 rows (273,910 
    bytes). 
11) We do an all-AMPs STAT FUNCTION step from Spool 29 (Last Use) by 
    way of an all-rows scan into Spool 34 (Last Use), which is 
    redistributed by hash code to all AMPs. The result rows are put 
    into Spool 33 (all_amps), which is built locally on the AMPs. The 
    size is estimated with low confidence to be 559 rows (264,966 
    bytes). 
12) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from Spool 33 (Last Use) by 
     way of an all-rows scan with a condition of ("(Field_12 <> 
     'D') OR (Field_13 = 1)") into Spool 8 (used to materialize 
     view, derived table, table function or table operator t2) 
     (all_amps), which is built locally on the AMPs. The size of 
     Spool 8 is estimated with low confidence to be 559 rows (
     249,314 bytes). The estimated time for this step is 0.01 
     seconds. 
     2) We do an all-AMPs RETRIEVE step from db1.o in view 
     tb1 by way of an all-rows scan with no residual 
     conditions locking for access into Spool 51 (all_amps), which 
     is redistributed by the hash code of (db1.o.GUID) 
     to all AMPs. Then we do a SORT to order Spool 51 by row hash. 
     The size of Spool 51 is estimated with low confidence to be 
     74,480 rows (1,564,080 bytes). The estimated time for this 
     step is 0.06 seconds. 
     3) We do an all-AMPs RETRIEVE step from db1.a in view 
     tb1 by way of an all-rows scan with no residual 
     conditions locking for access into Spool 52 (all_amps), which 
     is redistributed by the hash code of (db1.a.GUID) 
     to all AMPs. Then we do a SORT to order Spool 52 by row hash. 
     The size of Spool 52 is estimated with low confidence to be 
     280 rows (9,240 bytes). The estimated time for this step is 
     0.06 seconds. 
13) We do an all-AMPs JOIN step from Spool 51 (Last Use) by way of a 
    RowHash match scan, which is joined to Spool 52 (Last Use) by way 
    of a RowHash match scan. Spool 51 and Spool 52 are full outer 
    joined using a merge join, with condition(s) used for non-matching 
    on right table ("NOT (GUID IS NULL)"), with a join condition of (
    "GUID = GUID"). The result goes into Spool 50 (all_amps), which 
    is built locally on the AMPs. The size of Spool 50 is estimated 
    with low confidence to be 74,759 rows (3,214,637 bytes). The 
    estimated time for this step is 0.07 seconds. 
14) We do an all-AMPs STAT FUNCTION step from Spool 50 (Last Use) by 
    way of an all-rows scan into Spool 57 (Last Use), which is assumed 
    to be redistributed by value to all AMPs. The result rows are put 
    into Spool 55 (all_amps), which is built locally on the AMPs. The 
    size is estimated with low confidence to be 74,759 rows (
    6,952,587 bytes). 
15) We do an all-AMPs STAT FUNCTION step from Spool 55 (Last Use) by 
    way of an all-rows scan into Spool 60 (Last Use), which is 
    redistributed by hash code to all AMPs. The result rows are put 
    into Spool 5 (all_amps), which is redistributed by hash code to 
    all AMPs. The size is estimated with low confidence to be 74,759 
    rows (5,457,407 bytes). 
16) We do an all-AMPs RETRIEVE step from Spool 8 by way of an all-rows 
    scan with a condition of ("(t2.RDM$END_DATE <= TIMESTAMP 
    '9999-12-31 00:00:00.000000') AND ((t2.col1 > TIMESTAMP 
    '1900-01-01 00:00:00.000000') AND (NOT (t2.MK IS NULL)))") into 
    Spool 90 (all_amps), which is duplicated on all AMPs. The size of 
    Spool 90 is estimated with low confidence to be 156,520 rows (
    5,791,240 bytes). The estimated time for this step is 0.02 
    seconds. 
17) We do an all-AMPs JOIN step from Spool 90 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 9 by way of an all-rows 
    scan. Spool 90 and Spool 9 are joined using a dynamic hash join, 
    with a join condition of ("(LVL_TYPE_MK = MK) AND ((col1 
    ,RDM$END_DATE) OVERLAPS (col1 ,RDM$END_DATE))"). The 
    result goes into Spool 5 (all_amps), which is redistributed by the 
    hash code of ((CASE WHEN ((RDM$OPC = 'D') OR 
    (db1.a.RDM$VALIDFROM IS NULL)) THEN (TIMESTAMP 
    '1900-01-01 00:00:00.000000') ELSE (db1.a.RDM$VALIDFROM) 
    END), TIMESTAMP '9999-12-31 00:00:00.000000', (CASE WHEN 
    (db1.a.GUID IS NULL) THEN (db1.o.GUID) ELSE 
    (db1.a.GUID) END)) to all AMPs. The size of Spool 5 is 
    estimated with no confidence to be 227,602 rows (16,614,946 bytes). 
    The estimated time for this step is 0.19 seconds. 
18) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from Spool 9 by way of an 
     all-rows scan with a condition of ("NOT (t1.MK_SUCCESSOR IS 
     NULL)") into Spool 117 (all_amps) fanned out into 7 hash join 
     partitions, which is built locally on the AMPs. The size of 
     Spool 117 is estimated with low confidence to be 74,759 rows (
     3,364,155 bytes). The estimated time for this step is 0.30 
     seconds. 
     2) We do an all-AMPs RETRIEVE step from Spool 9 by way of an 
     all-rows scan with a condition of ("(t3.RDM$END_DATE <= 
     TIMESTAMP '9999-12-31 00:00:00.000000') AND 
     ((t3.col1 > TIMESTAMP '1900-01-01 00:00:00.000000') 
     AND (NOT (t3.MK IS NULL)))") into Spool 118 (all_amps) fanned 
     out into 7 hash join partitions, which is duplicated on all 
     AMPs. The result spool file will not be cached in memory. 
     The size of Spool 118 is estimated with low confidence to be 
     20,932,520 rows (774,503,240 bytes). The estimated time for 
     this step is 0.42 seconds. 
19) We do an all-AMPs JOIN step from Spool 117 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 118 (Last Use) by way of 
    an all-rows scan. Spool 117 and Spool 118 are joined using a hash 
    join of 7 partitions, with a join condition of ("(MK_SUCCESSOR = 
    MK) AND ((col1 ,RDM$END_DATE) OVERLAPS (col1 
    ,RDM$END_DATE))"). The result goes into Spool 5 (all_amps), which 
    is redistributed by the hash code of ((CASE WHEN ((RDM$OPC = 'D') 
    OR (db1.a.RDM$VALIDFROM IS NULL)) THEN (TIMESTAMP 
    '1900-01-01 00:00:00.000000') ELSE (db1.a.RDM$VALIDFROM) 
    END), TIMESTAMP '9999-12-31 00:00:00.000000', (CASE WHEN 
    (db1.a.GUID IS NULL) THEN (db1.o.GUID) ELSE 
    (db1.a.GUID) END)) to all AMPs. Then we do a SORT to 
    order Spool 5 by the sort key in spool field1 eliminating 
    duplicate rows. The size of Spool 5 is estimated with no 
    confidence to be 98,165 rows (7,166,045 bytes). The estimated 
    time for this step is 2.83 seconds. 
20) We do an all-AMPs STAT FUNCTION step from Spool 5 (Last Use) by 
    way of an all-rows scan into Spool 122 (Last Use), which is 
    assumed to be redistributed by value to all AMPs. The result rows 
    are put into Spool 120 (all_amps), which is built locally on the 
    AMPs. The size is estimated with no confidence to be 98,165 rows 
    (6,577,055 bytes). The estimated time for this step is 0.01 
    seconds. 
21) We execute the following steps in parallel. 
     1) We do an all-AMPs RETRIEVE step from Spool 120 (Last Use) by 
     way of an all-rows scan into Spool 6 (used to materialize 
     view, derived table, table function or table operator vv) 
     (all_amps), which is built locally on the AMPs. The size of 
     Spool 6 is estimated with no confidence to be 98,165 rows (
     4,024,765 bytes). The estimated time for this step is 0.01 
     seconds. 
     2) We do an all-AMPs RETRIEVE step from Spool 9 (Last Use) by way 
     of an all-rows scan into Spool 126 (all_amps), which is 
     duplicated on all AMPs. The result spool file will not be 
     cached in memory. The size of Spool 126 is estimated with low 
     confidence to be 20,932,520 rows. The estimated time for this 
     step is 0.21 seconds. 
22) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 126 by way of an all-rows 
    scan. Spool 6 and Spool 126 are joined using a product join, with 
    a join condition of ("(1=1)"). The result goes into Spool 128 
    (all_amps), which is built locally on the AMPs. The result spool 
    file will not be cached in memory. The size of Spool 128 is 
    estimated with no confidence to be 7,338,717,235 rows. The 
    estimated time for this step is 42.98 seconds. 
23) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 126 (Last Use) by way of 
    an all-rows scan. Spool 8 and Spool 126 are joined using a 
    product join, with a join condition of ("(1=1)"). The result goes 
    into Spool 129 (all_amps), which is duplicated on all AMPs. The 
    result spool file will not be cached in memory. The size of Spool 
    129 is estimated with low confidence to be 11,701,278,680 rows. 
    The estimated time for this step is 57.75 seconds. 
24) We do an all-AMPs JOIN step from Spool 128 (Last Use) by way of an 
    all-rows scan, which is joined to Spool 129 (Last Use) by way of 
    an all-rows scan. Spool 128 and Spool 129 are joined using a 
    product join, with a join condition of ("(1=1)"). The result goes 
    into Spool 125 (one-amp), which is redistributed by the hash code 
    of ('db1', 'tb1') to all AMPs. The result 
    spool file will not be cached in memory. The size of Spool 125 is 
    estimated with no confidence to be *** rows (*** bytes). The 
    estimated time for this step is 1,820,312 hours and 14 minutes. 
25) We do a single-AMP SORT to order Spool 125 (one-amp) by eliminate 
    duplicate rows. 
26) We do a single-AMP MERGE into 
    "admin".view_column_data_type from Spool 125 (Last Use). 
    The size is estimated with no confidence to be *** rows. The 
    estimated time for this step is 881,263,274 hours and 32 minutes. 
27) We spoil the parser's dictionary cache for the table. 
28) Finally, we send out an END TRANSACTION step to all AMPs involved 
    in processing the request. 
    -> No rows are returned to the user as the result of statement 1. 
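A plan this long is easier to check mechanically. The following is a minimal sketch, assuming the EXPLAIN text has been saved as a string, that counts the unconstrained product joins (join condition `(1=1)`). The function name and the shortened sample plan are made up for illustration.

```python
import re


def count_cross_product_joins(plan_text):
    """Count product joins whose join condition is the constant (1=1)."""
    # EXPLAIN output wraps lines mid-sentence, so collapse all whitespace first.
    flat = " ".join(plan_text.split())
    pattern = r'product join, with a join condition of \("\(1=1\)"\)'
    return len(re.findall(pattern, flat))


# Shortened excerpt in the style of the plan above (illustrative only).
SAMPLE_PLAN = '''
22) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
    all-rows scan, which is joined to Spool 126 by way of an all-rows
    scan. Spool 6 and Spool 126 are joined using a product join, with
    a join condition of ("(1=1)").
 8) db1.a and Spool 25 are left outer
    joined using a product join, with condition(s) used for
    non-matching on left table ("NOT (db1.a.GUID IS NULL)").
'''

print(count_cross_product_joins(SAMPLE_PLAN))
```

Product joins that carry a real join condition, such as the one in step 8, are deliberately not counted.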
+0

Sorry for such a long EXPLAIN output :( – Rocketq

+0

It seems this approach fails when there is an OLAP function, and the optimizer falls back to actually executing the query. But this EXPLAIN shows three product joins; do they also show up when you EXPLAIN a simple 'SELECT * FROM view'? – dnoeth

+0

@dnoeth yes, you are right – Rocketq

Answers

0

I am not sure there is a solution that uses only SQL. My final solution uses BTEQ (nice guide on how to use it) to get the list of columns and tables, by first writing a dynamically generated SQL query to a file:

select distinct  -- one generated line per view, not one per column
    'select ' !! Trim(databasename) !! '.' !! Trim(tablename) !! '; ' !!
    'help column ' !! Trim(databasename) !! '.' !! Trim(tablename) !! '.* ;'
from dbc.columnsV
where (databasename, tablename) in (
    select databasename, tablename
    from dbc.tablesV as tb
    where tb.tableKind = 'V'            -- views only
      and TRIM(tb.DatabaseName) IN ('db1', 'db2'))
;

The query above generates, for each view, the table name plus the HELP COLUMN result.
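To make the shape of the generated script concrete, the string concatenation in the query can be mirrored in Python. This is only an illustration of what one generated line looks like; `generated_line` is a made-up helper name and `db1`/`tb1` are the placeholder names used in this thread.

```python
def generated_line(databasename, tablename):
    """Mirror the SQL concatenation: a name marker followed by HELP COLUMN."""
    db = databasename.strip()   # same role as Trim(databasename)
    tb = tablename.strip()      # same role as Trim(tablename)
    return ("select " + db + "." + tb + "; "
            + "help column " + db + "." + tb + ".* ;")


print(generated_line("db1", "tb1"))
```

Each view therefore contributes one line with two statements, and the first one serves to put the view name into the output ahead of its HELP COLUMN rows.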

The generated CSV file can then be parsed in any language, for example in Python 2.7:

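import pandas as pd

# Output of the BTEQ run; ';' is the field separator.
df = pd.read_csv('out.csv', sep=';')
col = df['Column'].astype(str)

rows = []
for i in range(len(df)):
    if i % 1000 == 0:
        print(i)  # progress indicator
    # A line starting with 'sit50' carries the full view name.
    if col.iloc[i].startswith('sit50'):
        full_name = col.iloc[i]
        j = 3  # offset of the first HELP COLUMN row, as in the original
        # Collect rows until the next marker line or the end of the file.
        # (The original compared a 5-character slice with a longer string,
        # which never matches; startswith is the presumed intent.)
        while i + j < len(df) and not col.iloc[i + j].startswith("'db_template"):
            rows.append([full_name + ' ' + col.iloc[i + j], df['Name'].iloc[i + j]])
            j += 1

pd.DataFrame(rows).to_csv('db_logs', sep='\t')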

I hope this solution helps someone.

1

There should be no spool problem, because this uses some old Tequel (= pre-SQL) syntax that the optimizer resolves without actually resolving the view source code down to the base tables and accessing them.

If you EXPLAIN

insert into view_column_data_type 
SELECT TYPE(DBC.TablesV.DatabaseName); -- no FROM! 

it should look like this:

1) First, we do an INSERT into Spool 2. 
    2) Next, we do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by 
    way of an all-rows scan into Spool 1 (one-amp), which is 
    redistributed by the hash code of ('DBC', 'ColumnsVX') to few AMPs. 
    Then we do a SORT to order Spool 1 by row hash. The size of Spool 
    1 is estimated with high confidence to be 1 row (61 bytes). The 
    estimated time for this step is 0.01 seconds. 
    3) We do a single-AMP MERGE into 
    xxx.view_column_data_type from Spool 1 (Last Use). 
    The size is estimated with high confidence to be 1 row. The 
    estimated time for this step is 1 second. 

Of course step 2) is a bit silly, but there is no access to dbc.tvfields, dbc.dbase, etc.

I can't imagine that this has changed in newer releases ...

+0

Can you explain your insert statement: how can we access a column type from TablesV? In addition, I have added the EXPLAIN of my insert statement, which seems to be too heavy for TD – Rocketq
