Ich habe versucht, diese Abfrage effizient in den letzten zwei Tagen zu erhalten. Ich habe mehr über das Oracle-Index-Verhalten erfahren, von dem ich denke, dass ich an dieser Stelle verwirrt bin, was funktionieren soll und was nicht.Oracle: Optimieren zweimal Self-Join-Abfrage
Grundsätzlich besteht die Abfrage darin, Werte zu summieren und mit Werten von gestern und letzter Woche zu vergleichen.
Ich habe herum gespielt mit Breakdown, ich habe in meinem Kopf analytische Abfragen und wechselnde Reihenfolge der Indizes gespielt, aber nichts scheint zu funktionieren. All mein Test war auf einem Tisch mit 500K Reihen, sobald ich ihn auf einem Tisch mit 20 Millionen Reihen laufen lasse dauert es einfach ewig.
Jede Hilfe wird sehr geschätzt.
Ich habe den ursprünglichen Beitrag geändert, damit Sie mir helfen können. :)
CREATE TABLE TABLE_1
(ORDER_LINE_ID NUMBER, OFFSET NUMBER, BREAK_ID NUMBER, ZONE NUMBER, NETWORK NUMBER, HOUR_OF_DAY NUMBER, START_TIME DATE, END_TIME DATE, SUCCESS NUMBER
CONSTRAINT "TABLE_1_PK" PRIMARY KEY (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, HOUR_OF_DAY))
-- SUCCESS is already aggregated during the insert
-- These are last week's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (1,0,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (1,30,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 2);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (2,0,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (2,30,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
-- These are yesterday's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (3,0,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (3,30,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 2);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (4,0,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (4,30,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
-- This is today's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (5,0,1, 1, 1, 2016042701,'04/27/2016 00:00:00', '04/27/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (5,30,1, 1, 1, 2016042701,'04/27/2016 00:00:00', '04/27/2016 02:00:00', 1);
-- Original twice join query
SELECT BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME, SUM(SUCCESS), SUM(YESTERDAY_SUCCESS), SUM(LAST_WEEK_SUCCESS)
FROM TABLE_1 CURRENT_DAY
LEFT OUTER JOIN (
SELECT SUM(SUCCESS) YESTERDAY_SUCCESS, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME FROM TABLE_1
GROUP BY ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME
) YESTERDAY
ON YESTERDAY.START_TIME + 1 = CURRENT_DAY.START_TIME
AND YESTERDAY.END_TIME + 1 = CURRENT_DAY.END_TIME
AND YESTERDAY.HOUR_OF_DAY = CURRENT_DAY.HOUR_OF_DAY
AND YESTERDAY.NETWORK = CURRENT_DAY.NETWORK
AND YESTERDAY.ZONE = CURRENT_DAY.ZONE
LEFT OUTER JOIN (
SELECT SUM(SUCCESS) LAST_WEEK_SUCCESS, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME FROM TABLE_1
GROUP BY ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME
) LAST_WEEK
ON YESTERDAY.START_TIME + 7 = CURRENT_DAY.START_TIME
AND YESTERDAY.END_TIME + 7 = CURRENT_DAY.END_TIME
AND YESTERDAY.HOUR_OF_DAY = CURRENT_DAY.HOUR_OF_DAY
AND YESTERDAY.NETWORK = CURRENT_DAY.NETWORK
AND YESTERDAY.ZONE = CURRENT_DAY.ZONE
GROUP BY BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME;
-- Using Analytic Query (thank you to MT0)
SELECT BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME, SUM(SUCCESS), SUM(YESTERDAY_SUCCESS), SUM(LAST_WEEK_SUCCESS)
FROM (
SUM(SUCCESS)
OVER (PARTITION BY ZONE, NETWORK, HOUR_OF_DAY, TO_CHAR(START_TIME, 'HH24:MI:SS'), TO_CHAR(END_TIME, 'HH24:MI:SS')
ORDER BY START_TIME
RANGE BETWEEN INTERVAL '1' DAY PRECDEDING AND INTERVAL '1' DAY PRECEDING
) AS YESTERDAY_SUCCESS,
SUM (SUCCESS)
OVER (PARTITION BY ZONE, NETWORK, HOUR_OF_DAY, TO_CHAR(START_TIME, 'HH24:MI:SS'), TO_CHAR(END_TIME, 'HH24:MI:SS')
ORDER BY START_TIME
RANGE BETWEEN INTERVAL '7' DAY PRECDEDING AND INTERVAL '7' DAY PRECEDING
) AS LAST_WEEK_SUCCESS
FROM TABLE_1
) T1
WHERE SYSDATE - INTERVAL '12' HOUR <= START_TIME
AND START_TIME < SYSDATE - INTERVAL '1' HOUR
GROUP BY BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME;
Ich muss sagen, danke für die Hilfe, um diese Frage auf etwas zu bringen, das ich hoffe, ist verständlicher. Alles funktioniert wie erwartet, aber die Performance könnte etwas tunen.
1,8 Sekunden auf einem Tisch mit 500K Reihen
400 Sekunden auf einem Tisch mit 20 Millionen Zeilen
ich mag auch einige Ausführungspläne von Oracle zur Verfügung gestellt hinzuzufügen. Ich habe Probleme beim Tuning der Leistung.
-- using twice self join
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 50 |00:00:00.84 | 99875 | 217 | 1705 | | | |
| 1 | HASH GROUP BY | | 1 | 6711 | 50 |00:00:00.84 | 99875 | 217 | 1705 | 1616K| 995K| |
|* 2 | FILTER | | 1 | | 119K|00:00:00.65 | 99875 | 0 | 0 | | | |
| 3 | NESTED LOOPS OUTER | | 1 | 54M| 119K|00:00:00.64 | 99875 | 0 | 0 | | | |
|* 4 | HASH JOIN OUTER | | 1 | 109 | 119K|00:00:00.52 | 99875 | 0 | 0 | 13M| 2093K| 1/0/0|
| 5 | TABLE ACCESS BY INDEX ROWID| TABLE_1_IDX | 1 | 109 | 119K|00:00:00.14 | 85908 | 0 | 0 | | | |
|* 6 | INDEX RANGE SCAN | START_TIME_IDX | 1 | 109 | 119K|00:00:00.02 | 320 | 0 | 0 | | | |
| 7 | VIEW | | 1 | 1250 | 29311 |00:00:00.23 | 13967 | 0 | 0 | | | |
| 8 | HASH GROUP BY | | 1 | 1250 | 29311 |00:00:00.22 | 13967 | 0 | 0 | 3008K| 1094K| 1/0/0|
|* 9 | FILTER | | 1 | | 88627 |00:00:00.20 | 13967 | 0 | 0 | | | |
|* 10 | TABLE ACCESS FULL | TABLE_1 | 1 | 1250 | 88627 |00:00:00.19 | 13967 | 0 | 0 | | | |
| 11 | VIEW | | 119K| 499K| 0 |00:00:00.10 | 0 | 0 | 0 | | | |
| 12 | SORT GROUP BY | | 119K| 499K| 0 |00:00:00.08 | 0 | 0 | 0 | 1024 | 1024 | 1/0/0|
|* 13 | FILTER | | 119K| | 0 |00:00:00.02 | 0 | 0 | 0 | | | |
| 14 | TABLE ACCESS FULL | TABLE_1 | 0 | 499K| 0 |00:00:00.01 | 0 | 0 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter([email protected]!-17<[email protected]!-16)
4 - access("YESTERDAY"."ZONE"="T1"."ZONE" AND "YESTERDAY"."NETWORK"="T1"."NETWORK" AND "YESTERDAY"."HOUR_OF_DAY"="T1"."HOUR_OF_DAY"
AND "T1"."END_TIME"=INTERNAL_FUNCTION("YESTERDAY"."END_TIME")+1 AND
"T1"."START_TIME"=INTERNAL_FUNCTION("YESTERDAY"."START_TIME")+1)
6 - access("T1"."START_TIME">[email protected]!-17 AND "T1"."START_TIME"<[email protected]!-16)
9 - filter([email protected]!-17<[email protected]!-16)
10 - filter((INTERNAL_FUNCTION("START_TIME")+1>[email protected]!-17 AND INTERNAL_FUNCTION("START_TIME")+1<[email protected]!-16))
13 - filter(("YESTERDAY"."ZONE"="T1"."ZONE" AND "YESTERDAY"."NETWORK"="T1"."NETWORK" AND "YESTERDAY"."HOUR_OF_DAY"="T1"."HOUR_OF_DAY"
AND "T1"."END_TIME"=INTERNAL_FUNCTION("YESTERDAY"."END_TIME")+7 AND
"T1"."START_TIME"=INTERNAL_FUNCTION("YESTERDAY"."START_TIME")+7))
Eine weitere Ausführungsplan
-- using analytic query
-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 50 |00:00:01.51 | 13967 | | | |
| 1 | HASH GROUP BY | | 1 | 499K| 50 |00:00:01.51 | 13967 | 98M| 7788K| |
|* 2 | VIEW | | 1 | 499K| 119K|00:00:01.15 | 13967 | | | |
| 3 | WINDOW SORT | | 1 | 499K| 499K|00:00:01.43 | 13967 | 66M| 2823K| 1/0/0|
|* 4 | FILTER | | 1 | | 499K|00:00:00.16 | 13967 | | | |
| 5 | TABLE ACCESS FULL| TABLE_1 | 1 | 499K| 499K|00:00:00.12 | 13967 | | | |
-------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("T1"."START_TIME">[email protected]!-INTERVAL'+17 00:00:00' DAY(2) TO SECOND(0) AND
"T1"."START_TIME"<[email protected]!-INTERVAL'+16 00:00:00' DAY(2) TO SECOND(0)))
4 - filter([email protected]!-INTERVAL'+17 00:00:00' DAY(2) TO SECOND(0)<[email protected]!-INTERVAL'+16 00:00:00' DAY(2) TO
SECOND(0))
Wie Sie sehen können mit Analytic Query (danken wieder MT0), habe ich einen Index für START_TIME hinzugefügt, die von der Abfrage Vorteile Selbstmitmach aber schätzt vs Wirklichkeiten sind ausgeschaltet . Wo Analytic Query gerade entscheidet, will es nichts mit dem Index zu tun haben. Irgendwelche Ideen, Bezugspunkte oder Hilfe werden sehr geschätzt. Vielen Dank im Voraus, alle.
Können Sie einige Beispieldaten posten, um zu erläutern, was Sie zu tun versuchen? Sind 'field_6' und' field_7' Daten ohne Zeiten oder haben sie Zeitkomponenten und gibt es mehrere Werte innerhalb derselben Gruppe? – MT0
Sie sind äußere Verbindung auf 'field_6' und' field_7' - woher wissen Sie, dass die Verbindungen gestern und letzte Woche die gleichen Werte für die Felder 1, 2 und 3 haben (auf denen Sie gruppieren)? – MT0
MT0, danke für Ihre Antwort. Ich änderte die Abfrage, um realistischere Feldnamen anzuzeigen. Ich habe auch die Typen angegeben. Danke für deine Hilfe. – OneClutteredMind