Thursday 22 March 2018






Tables Used by Concurrent Request Concurrent Program




FND_CONCURRENT_REQUESTS
Contains a complete history of all concurrent requests (both past history and those scheduled to run in the future).
FND_RUN_REQUESTS
Stores information about the reports in a report set that a user submits including the report set’s parameter values.
FND_CONC_REQUEST_ARGUMENTS
Records all arguments passed by Concurrent Managers to concurrent requests as those requests are running.
FND_DUAL
Records when a request does not update any database tables.
FND_CONCURRENT_PROCESSES
Records information about Oracle Applications processes and OS processes.
FND_CONC_STAT_LIST
Collects runtime performance statistics for concurrent requests.
FND_CONC_STAT_SUMMARY
Contains Concurrent Program performance statistics generated by the Purge Concurrent Request program or the manager data program. These programs use the data inFND_CONC_STAT_LIST to compute these statistics।



Oracle Apps Concurrent Programs Basics

Lets first discuss the below scenario
Being a developer, you have just developed a SQL script or a PL/SQL package procedure. The end user wants to be able to run this script ad-hoc or they wish to schedule this to run every night.

Question : How can an end-user be given control to run a script developed by a developer, given that an end user will never have access to apps password(and rightly so)?
Answer: This script can be attached to a Concurrent Program via a concurrent program executable. The user will then be given access to this Concurrent Program.

Question : But how will the end user or Oracle Apps make this script run every 10hours daily?
Answer : A concurrent program can be scheduled to run at desired intervals. The schedule is defined at the time of submission.

Question: What are the different types of scripts/programs that can be attached to concurrent programs?
Answer : 
A sql script
A Sql*Loader program
Java program
Oracle Report
Pl/SQL program ( or call it a stored procedure/package procedure)
Host script, like a unix shell script
Pro*C/Spawned
Perl

Question: What are the basic steps when defining a concurrent program.
Answer: Broadly speaking there are three steps when developing a concurrent program in Oracle Apps
Step 1. Make Oracle Apps identify the executable
Step 2. Provide a handle to the executable by means of defining a concurrent program
Step 3. Make this concurrent program accesible to selected users via their responsibility.

Question: Please can you describe the above steps in detail
Answer: 
Step 1. Make Oracle Apps identify the executable
In Oracle Apps we have something called as concurrent program executable. Concurrent program executable is defined to register a script or a procedure for its usage within oracle apps.

Step 2. Provide a handle to the executable by means of defining a Concurrent Program.
We need to define a concurrent program and attach that to the executable defined in above step.

Step 3. Make this concurrent program accesible to selected users via their responsibility.
We do this by adding the concurrent program to something called as request group. The request group is either associated with a responsibility or is passed in as a parameter to the function request form function. Don't worry if you do not understand this step, I will be writing a dedicated article to explain this step.


Question : Please explain the steps for defining a pl/sql concurrent program, with screenshots and with almost real life example?
Answer: 
1. Create a table and a stored procedure in pl/sql
create table xx_hello_world_tab
(
 message_text VARCHAR2(
100)
,creation_date DATE
) ;

CREATE OR REPLACE PROCEDURE
xx_register_user_prc(errbuf      OUT VARCHAR2,retcode     OUT VARCHAR2IS
BEGIN
  INSERT INTO xx_hello_world_tab VALUES (
'Hello World' ,SYSDATE);
END xx_register_user_prc;

SELECT * FROM xx_hello_world_tab ;

--Zero records will be returned here

2. Note the two parameters, namely errbuff and retcode
These parameters for used for the interaction between the concurrent program and the pl/sql Program. I suggest to the beginners not to spend much time on these parameters.

3. Define an executable attached to this procedure.
This can be done by navigating to the responsibility “Application Developer”, and selecting the menu /Concurrent/Executable
Image

4. Define the concurrent program
This can be done by navigating to responsibility “Application Developer” and then selecting menu /Concurrent/Program

Image
5. Attach this program to a report group.Go to System Administrator responsibility, and select
/Security/Responsibility/Request
Image
Now, we can navigate to the Receivables Manager and submit the concurrent request.
After the completion of the request, we check the tableImage

Monday 12 March 2018

How to effectively tune a query that does not even finish (SQL Monitoring and ASH in action)


Have you ever been called to tune a query that runs forever?

Even worse, what if the query is very complex, you don't know anything about the "logic" 
behind the SQL - since it was written by someone else - and when you see the execution
 plan it is several pages long?
Moreover, what if the query is for retrieving a list of customers for a business critical 
campaign that has to run asap, and therefore the beloved management is over your shoulder,
 waiting for you to solve the problem!

Problem Definition


Never ending queries, or at least queries that cannot finish within an acceptable time 
frame are quite common in large Data Warehouses. Unfortunately, in this situation one
 cannot use methods where execution statistics are gathered first and then analysi
s of the execution steps can take place. Simply because the query does not finish!


For example such  a great tuning method would be the following:

  1. execute the query with the hint /*+ gather_plan_statistics */ 
  2. (or alternatively issuing an ALTER SESSION SET STATISTICS_LEVEL=ALL prior to query execution), in order to enable the collection of the so-called "row-source statistics
  3. (i.e., execution statistics for each operation (row-source) in the execution plan).
  4. After the completion of the query, use DBMS_XPLAN.DISPLAY_CURSOR, or directly query V$SQL_PLAN_STATISTICS_ALL, in order to:
    1. Identify problematic cardinality estimations by the optimizer by comparing actual rows
    2.  retrieved in each step of the plan versus estimated rows.
    3. Identify bottlenecks in the plan, i.e., operations that take too long to complete.

(By the way, two great posts that describe how to use DBMS_XPLAN.DISPLAY_CURSOR to tune a query, can be found here(by Kyle Hailey) and here (by Nikolay Savvinov)).

So what can we do? We need a way to get accurate execution statistics at the individual operation level, so as to easily identify the bottlenecks, while the query is still running. What are our options?

11g has the Solution!


Well, from version 11g and onwards there is a great feature called Real-Time SQL Monitoring that allows you to monitor a query executing in real-time and provides useful statistics at an individual plan operation level that help you easily (and quickly!) identify the bottleneck in an execution plan. (Note that the Real-Time SQL Monitoring feature is licensed with Oracle Tuning Pack in order to use it in production). If you want a more detailed description of this feature check out this Oracle white paper and of course the documentation (here).

Another option is to use one of the greatest-ever Oracle features (available since 10g), the  Active Session History (ASH) (licensed with Oracle Diagnostic Pack). Note that from version 11g, ASH has become even greater since it includes columns indicating the individual execution plan operation and thus making the identification of time-consuming steps very easy.

In this post, we will give an example of a complex query, that runs forever and describe how we can tune it (i.e., to identify the bottleneck and then try to fix it) with the use of both of these tools (SQL monitoring and ASH).

(Note: A nice post from N. Savvinov on a similar topic can be found here.)

A real-life Example

So we have a complex query that has been running for some hours and the execution plan, which is retrieved with a call like this:
nkarag@DWHPRD> select * from table( dbms_xplan.display_cursor('&sql_id', '&child_number', 'ALLSTATS LAST'))
2 /
Listing 1: How we retrieve the plan of a running query based on sql_id. ---------------------------------------------------------------------------------------------------------------------
 looks like this:
Plan hash value: 2144277924
---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | E-Rows | OMem | 1Mem | Used-Mem | Used-Tmp|
| 0 | SELECT STATEMENT | | | | | | |
---------------------------------------------------------------------------------------------------------------------------------
| 1 | TEMP TABLE TRANSFORMATION | | | | | | |
| 4 | VIEW | VW_NSO_1 | 642K| | | | |
| 2 | LOAD AS SELECT | | | 256K| 256K| | |
|* 3 | HASH JOIN RIGHT SEMI | | 642K| 25M| 3344K| | |
|* 9 | INDEX STORAGE FAST FULL SCAN | UCM_CUSTOMER_DIM_UK2 | 7462K| | | | |
|* 5 | HASH JOIN | | 642K| 47M| 5162K| | |
|* 6 | HASH JOIN | | 642K| 586K| 586K| | |
|* 8 | TABLE ACCESS STORAGE FULL | M_W_INSTALL_ASSETS | 10M| | | | |
|* 7 | TABLE ACCESS STORAGE FULL | PRODUCT_DIM | 900 | | | | |
|* 10 | INDEX STORAGE FAST FULL SCAN | UCM_CUSTOMER_DIM_UK2 | 7462K| | | | |
| 15 | PX SEND QC (RANDOM) | :TQ10006 | 872K| | | | |
| 11 | HASH UNIQUE | | 1 | 436K| 436K| | |
| 12 | CONCATENATION | | | | | | |
| 14 | PX COORDINATOR | | | | | | |
|* 13 | FILTER | | | | | | |
|* 16 | HASH JOIN RIGHT OUTER BUFFERED | | 872K| 686K| 686K| | |
| 22 | PX RECEIVE | | 981K| | | | |
| 17 | BUFFER SORT | | | 4096 | 4096 | | |
| 18 | PX RECEIVE | | 59 | | | | |
| 20 | TABLE ACCESS STORAGE FULL | SEGMENT_OTE_BASIC_DIM | 59 | | | | |
| 19 | PX SEND BROADCAST | :TQ10002 | 59 | | | | |
|* 21 | HASH JOIN | | 872K| 60M| 5023K| 9M (0)| |
|* 30 | HASH JOIN | | 11M| 278M| 13M| 52M (0)| |
| 23 | PX SEND HASH | :TQ10004 | 981K| | | | |
|* 24 | HASH JOIN BUFFERED | | 981K| 84M| 7595K| 21M (0)| 6144 |
| 26 | PX RECEIVE | | 642K| | | | |
| 25 | BUFFER SORT | | | 14M| 1032K| | |
| 27 | PX SEND BROADCAST | :TQ10000 | 642K| | | | |
| 31 | BUFFER SORT | | | 46M| 1767K| | |
| 28 | VIEW | | 642K| | | | |
| 29 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K| | | | |
| 39 | PX RECEIVE | | 10M| | | | |
| 32 | PX RECEIVE | | 7462K| | | | |
| 33 | PX SEND HASH | :TQ10001 | 7462K| | | | |
| 35 | PX RECEIVE | | 11M| | | | |
|* 34 | TABLE ACCESS STORAGE FULL | UCM_CUSTOMER_DIM | 7462K| | | | |
| 36 | PX SEND HASH | :TQ10003 | 11M| | | | |
| 40 | PX SEND HASH | :TQ10005 | 10M| | | | |
| 37 | PX BLOCK ITERATOR | | 11M| | | | |
|* 38 | TABLE ACCESS STORAGE FULL | BILLING_ACCOUNT_DIM | 11M| | | | |
| 49 | PX RECEIVE | | 32198 | | | | |
| 41 | PX BLOCK ITERATOR | | 10M| | | | |
|* 42 | TABLE ACCESS STORAGE FULL | CLI_DIM | 10M| | | | |
| 44 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K| | | | |
|* 43 | VIEW | | 642K| | | | |
|* 45 | FILTER | | | | | | |
|* 48 | HASH JOIN | | 29649 | 4663K| 1167K| | |
| 46 | PX COORDINATOR | | | | | | |
| 47 | PX SEND QC (RANDOM) | :TQ20006 | 29649 | | | | |
|* 60 | HASH JOIN RIGHT OUTER BUFFERED| | 374K| 686K| 686K| | |
| 50 | PX SEND BROADCAST | :TQ20005 | 32198 | | | | |
|* 51 | HASH JOIN BUFFERED | | 32198 | 25M| 3498K| | |
| 52 | BUFFER SORT | | | 73728 | 73728 | | |
| 54 | PX SEND HASH | :TQ20002 | 642K| | | | |
| 53 | PX RECEIVE | | 642K| | | | |
| 55 | VIEW | | 642K| | | | |
| 58 | PX SEND HASH | :TQ20004 | 374K| | | | |
| 56 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K| | | | |
| 57 | PX RECEIVE | | 374K| | | | |
| 72 | PX BLOCK ITERATOR | | 11M| | | | |
|* 59 | FILTER | | | | | | | | 61 | BUFFER SORT | | | 73728 | 73728 | | |
| 62 | PX RECEIVE | | 59 | | | | |
| 64 | TABLE ACCESS STORAGE FULL | SEGMENT_OTE_BASIC_DIM | 59 | | | | |
| 63 | PX SEND BROADCAST | :TQ20000 | 59 | | | | |
|* 65 | HASH JOIN | | 11M| 329M| 11M| | |
| 68 | PX SEND HASH | :TQ20001 | 7462K| | | | |
| 66 | BUFFER SORT | | | 73728 | 73728 | | |
| 67 | PX RECEIVE | | 7462K| | | | |
13 - filter( IS NOT NULL)
|* 69 | TABLE ACCESS STORAGE FULL| UCM_CUSTOMER_DIM | 7462K| | | | | | 70 | PX RECEIVE | | 11M| | | | |
|* 73 | TABLE ACCESS STORAGE FULL| BILLING_ACCOUNT_DIM | 11M| | | | |
| 71 | PX SEND HASH | :TQ20003 | 11M| | | | |
| 74 | PX BLOCK ITERATOR | | 10M| | | | |
|* 75 | TABLE ACCESS STORAGE FULL | CLI_DIM | 10M| | | | |
| 77 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K| | | | |
|* 76 | VIEW | | 642K| | | | |
---------------------------------------------------------------------------------------------------------------------------------
7 - storage(("A"."ROOT_PRODUCT_DESC"='IP TV' OR "A"."ROOT_PRODUCT_DESC"='TV SAT' OR "A"."ROOT_PRODUCT_DESC"='TV SAT
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("SBL_CUSTOMER_ID"="SIEBEL_CUSTOMER_ID")
3 - access("SBL_CUSTOMER_ID"="C1")
9 - storage("SBL_CUSTOMER_ID" IS NOT NULL)
6 - access("A"."SOC_PRODUCT_SK"="PRODUCT_SK")
Hotel'))
filter(("A"."ROOT_PRODUCT_DESC"='IP TV' OR "A"."ROOT_PRODUCT_DESC"='TV SAT' OR "A"."ROOT_PRODUCT_DESC"='TV SAT
Hotel')) 8 - storage("STATUS"='Active') filter("STATUS"='Active')
76 - filter("D1"."C1"=:B1)
filter("SBL_CUSTOMER_ID" IS NOT NULL)
10 - storage("SBL_CUSTOMER_ID" IS NOT NULL)
filter("SBL_CUSTOMER_ID" IS NOT NULL)
16 - access("A"."OTE_BASIC_SEGMENT_SK"="B"."OTE_BASIC_SEGMENT_SK")
21 - access("GNV_BILLING_ACCOUNT_NUM"="GNV_ACCOUNT_NUM")
24 - access("SBL_CUSTOMER_ID"="D1"."C1")
30 - access("SBL_CUSTOMER_ID"="SBL_CUSTOMER_ID")
34 - storage("SBL_CUSTOMER_ID" IS NOT NULL)
38 - storage(:Z>=:Z AND :Z<=:Z)
filter("SBL_CUSTOMER_ID" IS NOT NULL)
42 - storage(:Z>=:Z AND :Z<=:Z)
48 - access("GNV_BILLING_ACCOUNT_NUM"="GNV_ACCOUNT_NUM")
43 - filter("D1"."C1"=:B1)
45 - filter(LNNVL( IS NOT NULL))
51 - access("SBL_CUSTOMER_ID"="D1"."C1")
59 - filter(("B"."OTE_BASIC_SEGMENT_GROUP_DESC"='┼╨┴├├┼╦╠┴╘╔┼╙' OR "B"."OTE_BASIC_SEGMENT_GROUP_DESC"='╧╔╩╔┴╩╧╔'))
60 - access("A"."OTE_BASIC_SEGMENT_SK"="B"."OTE_BASIC_SEGMENT_SK")
65 - access("SBL_CUSTOMER_ID"="SBL_CUSTOMER_ID")
filter("SBL_CUSTOMER_ID" IS NOT NULL)
69 - storage("SBL_CUSTOMER_ID" IS NOT NULL)
75 - storage(:Z>=:Z AND :Z<=:Z)
73 - storage(:Z>=:Z AND :Z<=:Z AND (("ACCOUNT_TYPE_DESC"='┼╨┴├├┼╦╠┴╘╔┼╙' OR "ACCOUNT_TYPE_DESC"='╧╔╩╔┴╩╧╔')))
filter(("ACCOUNT_TYPE_DESC"='┼╨┴├├┼╦╠┴╘╔┼╙' OR "ACCOUNT_TYPE_DESC"='╧╔╩╔┴╩╧╔'))
Listing 2: The execution plan of the problematic query. ---------------------------------------------------------------------------------------------------------------------
Take a look how easy it is to identify the bottleneck with SQL monitoring. Here is the srcipt I use:
nkarag@DWHPRD> host cat mons_text.sql
set longchunksize 1000000
set long 1000000
set linesize 999
set heading off verify off autotrace off feedback off
set timing off
set wrap on
set pagesize 1000
spool sm.txt replace
accept sql_id prompt "Enter value for sql_id: "
set termout off
report_level => 'ALL') AS report
SELECT DBMS_SQLTUNE.report_sql_monitor(
sql_id => '&&sql_id',
type => 'TEXT',
FROM dual;
spool off
host cat sm.txt
undef sql_id
Listing 3: A script for retrieving SQL Monitor results for a specific sql_id ---------------------------------------------------------------------------------------------------------------------
and here are the results:
SQL Plan Monitoring Details (Plan Hash Value=2144277924)
==========================================================================================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Read | Read | Write | Write | Cell | Mem | Temp | Activity | Activity Detail |
==========================================================================================================================================================================================================================================================
| | | | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes | Reqs | Bytes | Offload | | | (%) | (# samples) |
| 0 | SELECT STATEMENT | | | | | | 1 | | | | | | | | | | |
| 3 | HASH JOIN RIGHT SEMI | | 642K | 414K | 6 | +8 | 1 | 326K | | | | | | | | 0.05 | Cpu (2) |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | 1 | | | | | | | | | | |
| 2 | LOAD AS SELECT | | | | 1 | +13 | 1 | 1 | | | 22 | 5MB | | | | | |
| 8 | TABLE ACCESS STORAGE FULL | M_W_INSTALL_ASSETS | 10M | 392K | 5 | +4 | 1 | 31M | 24205 | 23GB | | | 96.31% | | | 0.02 | Cpu (1) |
| 4 | VIEW | VW_NSO_1 | 642K | 404K | 6 | +8 | 1 | 742K | | | | | | | | | |
| 5 | HASH JOIN | | 642K | 404K | 10 | +4 | 1 | 742K | | | | | | | | 0.05 | Cpu (2) |
| 7 | TABLE ACCESS STORAGE FULL | PRODUCT_DIM | 900 | 1355 | 4 | +1 | 1 | 926 | 376 | 36MB | | | | | | 0.05 | cell multiblock physical read (2) |
| 6 | HASH JOIN | | 642K | 393K | 7 | +2 | 1 | 742K | | | | | | | | 0.07 | Cpu (3) |
| 9 | INDEX STORAGE FAST FULL SCAN | UCM_CUSTOMER_DIM_UK2 | 7M | 4659 | 6 | +8 | 1 | 7M | 341 | 315MB | | | | | | 0.07 | Cpu (1) |
| -> 13 | FILTER | | | | 4168 | +24 | 1 | 791K | | | | | | | | 0.07 | Cpu (3) |
| | | | | | | | | | | | | | | | | | cell multiblock physical read (2) |
| 10 | INDEX STORAGE FAST FULL SCAN | UCM_CUSTOMER_DIM_UK2 | 7M | 4659 | 1 | +13 | 1 | 7M | | | | | | | | | |
| -> 12 | CONCATENATION | | | | 4168 | +24 | 1 | 791K | | | | | | | | | |
| -> 11 | HASH UNIQUE | | 1 | 1M | 4168 | +24 | 1 | 0 | | | 85 | 8MB | | 3M | 10M | 0.05 | Cpu (2) |
| -> 14 | PX COORDINATOR | | | | 4168 | +24 | 9 | 791K | 10 | 320KB | | | | | | 0.02 | Cpu (1) |
| 20 | TABLE ACCESS STORAGE FULL | SEGMENT_OTE_BASIC_DIM | 59 | 4 | 1 | +18 | 1 | 59 | | | | | | | | | |
| -> 15 | PX SEND QC (RANDOM) | :TQ10006 | 872K | 186K | 4167 | +25 | 8 | 796K | | | | | | | | | |
| -> 16 | HASH JOIN RIGHT OUTER BUFFERED | | 872K | 186K | 4174 | +18 | 8 | 796K | 296 | 65MB | 406 | 89MB | | 35M | 112M | | |
| 18 | PX RECEIVE | | 59 | 4 | 2 | +18 | 8 | 472 | | | | | | | | | |
| 17 | BUFFER SORT | | | | 2 | +18 | 8 | 472 | | | | | | | | | |
| 19 | PX SEND BROADCAST | :TQ10002 | 59 | 4 | 1 | +18 | 1 | 472 | | | | | | | | | |
| 28 | VIEW | | 642K | 217 | 1 | +14 | 1 | 326K | | | | | | | | | |
| 21 | HASH JOIN | | 872K | 186K | 17 | +18 | 8 | 1M | | | | | | | | 0.07 | Cpu (3) |
| 22 | PX RECEIVE | | 982K | 160K | 2 | +18 | 8 | 717K | | | | | | | | | |
| 24 | HASH JOIN BUFFERED | | 982K | 160K | 1 | +18 | | | | | | | | | | 0.07 | Cpu (3) |
| 23 | PX SEND HASH | :TQ10004 | 982K | 160K | | | | | | | | | | | | | |
| 25 | BUFFER SORT | | | | | | | | | | | | | | | | |
| 29 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K | 217 | 1 | +14 | 1 | 326K | 7 | 5MB | | | | | | | |
| 26 | PX RECEIVE | | 642K | 217 | | | | | | | | | | | | | |
| 27 | PX SEND BROADCAST | :TQ10000 | 642K | 217 | 1 | +14 | 1 | 3M | | | | | | | | | |
| 37 | PX BLOCK ITERATOR | | 11M | 18775 | 3 | +17 | 8 | 11M | | | | | | | | | |
| 30 | HASH JOIN | | 11M | 160K | 2 | +17 | | | | | | | | | | 0.26 | Cpu (11) |
| 31 | BUFFER SORT | | | | 2 | +16 | | | | | | | | | | 0.07 | Cpu (3) |
| 33 | PX SEND HASH | :TQ10001 | 7M | 141K | 5 | +14 | 1 | 7M | | | | | | | | | |
| 32 | PX RECEIVE | | 7M | 141K | | | | | | | | | | | | | |
| 34 | TABLE ACCESS STORAGE FULL | UCM_CUSTOMER_DIM | 7M | 141K | 5 | +14 | 1 | 7M | 14649 | 8GB | | | 92.70% | | | 0.05 | Cpu (2) |
| 38 | TABLE ACCESS STORAGE FULL | BILLING_ACCOUNT_DIM | 11M | 18775 | 3 | +17 | 120 | 11M | 13150 | 7GB | | | 80.00% | | | 0.17 | Cpu (6) |
| 35 | PX RECEIVE | | 11M | 18775 | | | | | | | | | | | | | |
| 36 | PX SEND HASH | :TQ10003 | 11M | 18775 | 3 | +17 | 8 | 11M | | | | | | | | | |
| | | | | | | | | | | | | | | | | | resmgr:cpu quantum (2) |
| | | | | | | | | | | | | | | | | | cell smart table scan (1) |
| 39 | PX RECEIVE | | 10M | 25360 | 17 | +18 | 8 | 10M | | | | | | | | | |
| 41 | PX BLOCK ITERATOR | | 10M | 25360 | | | | | | | | | | | | | |
| 40 | PX SEND HASH | :TQ10005 | 10M | 25360 | 1 | +21 | | | | | | | | | | 0.05 | Cpu (2) |
| 42 | TABLE ACCESS STORAGE FULL | CLI_DIM | 10M | 25360 | 6 | +19 | | | | | | | | | | 0.95 | Cpu (5) |
| -> 44 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K | 217 | 4168 | +24 | 627K | 4G | 1 | 32768 | | | | | | 26.73 | Cpu (1124) |
| | | | | | | | | | | | | | | | | | cell single block physical read (35) |
| -> 43 | VIEW | | 642K | 217 | 4168 | +24 | 627K | 627K | | | | | | | | 70.76 | Cpu (2981) |
| 55 | VIEW | | 642K | 217 | | | | | | | | | | | | | |
| 45 | FILTER | | | | | | | | | | | | | | | | |
| 46 | PX COORDINATOR | | | | | | | | | | | | | | | | |
| 47 | PX SEND QC (RANDOM) | :TQ20006 | 29649 | 186K | | | | | | | | | | | | | |
| 49 | PX RECEIVE | | 32198 | 160K | | | | | | | | | | | | | |
| 48 | HASH JOIN | | 29649 | 186K | | | | | | | | | | | | | |
| 50 | PX SEND BROADCAST | :TQ20005 | 32198 | 160K | | | | | | | | | | | | | |
| 53 | PX RECEIVE | | 642K | 217 | | | | | | | | | | | | | |
| 51 | HASH JOIN BUFFERED | | 32198 | 160K | | | | | | | | | | | | | |
| 52 | BUFFER SORT | | | | | | | | | | | | | | | | |
| 67 | PX RECEIVE | | 7M | 141K | | | | | | | | | | | | | |
| 54 | PX SEND HASH | :TQ20002 | 642K | 217 | | | | | | | | | | | | | | | 56 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K | 217 | | | | | | | | | | | | | |
| 57 | PX RECEIVE | | 374K | 160K | | | | | | | | | | | | | |
| 59 | FILTER | | | | | | | | | | | | | | | | |
| 58 | PX SEND HASH | :TQ20004 | 374K | 160K | | | | | | | | | | | | | |
| 60 | HASH JOIN RIGHT OUTER BUFFERED | | 374K | 160K | | | | | | | | | | | | | |
| 63 | PX SEND BROADCAST | :TQ20000 | 59 | 4 | | | | | | | | | | | | | |
| 61 | BUFFER SORT | | | | | | | | | | | | | | | | |
| 62 | PX RECEIVE | | 59 | 4 | | | | | | | | | | | | | |
==========================================================================================================================================================================================================================================================
| 64 | TABLE ACCESS STORAGE FULL | SEGMENT_OTE_BASIC_DIM | 59 | 4 | | | | | | | | | | | | | | | 65 | HASH JOIN | | 11M | 160K | | | | | | | | | | | | | |
| 68 | PX SEND HASH | :TQ20001 | 7M | 141K | | | | | | | | | | | | | |
| 66 | BUFFER SORT | | | | | | | | | | | | | | | | |
| 69 | TABLE ACCESS STORAGE FULL | UCM_CUSTOMER_DIM | 7M | 141K | | | | | | | | | | | | | |
| 70 | PX RECEIVE | | 11M | 18776 | | | | | | | | | | | | | |
| 72 | PX BLOCK ITERATOR | | 11M | 18776 | | | | | | | | | | | | | |
| 71 | PX SEND HASH | :TQ20003 | 11M | 18776 | | | | | | | | | | | | | |
| 73 | TABLE ACCESS STORAGE FULL | BILLING_ACCOUNT_DIM | 11M | 18776 | | | | | | | | | | | | | |
| -> 76 | VIEW | | 642K | 217 | 4168 | +24 | 627K | 627K | | | | | | | | | |
| 74 | PX BLOCK ITERATOR | | 10M | 25360 | | | | | | | | | | | | | |
| 75 | TABLE ACCESS STORAGE FULL | CLI_DIM | 10M | 25360 | | | | | | | | | | | | | |
| -> 77 | TABLE ACCESS STORAGE FULL | SYS_TEMP_0FD9FD04F_8289D6BF | 642K | 217 | 4168 | +24 | 627K | 4G | | | | | | | | | |
Listing 4: SQL Monitoring results for the problematic query. ---------------------------------------------------------------------------------------------------------------------

Look at line 13 in listing 4 (the FILTER operation). The arrow on the left indicates that this step is currently executing. It has been running for 4168 seconds as the column "Time Active(s)" reports. So for sure it is among the bottlenecks that we search for. In our case, this filter operation has two children operations. Among these, one of the ones that is running for the longest period is operation 44 (TABLE ACCESS STORAGE FULL on a temp table). It is also running for 4168 seconds. If we look a bit to the right, we can immediately see the the "Execs" column for this operation is 627K, which means that this operation has been executed up to now for 627K times! Oops that's a lot of times for doing a full table scan on a large table, isn't it? Moreover, if we look a bit further to the right we will see that up to now this operation has returned (output rows) 4G rows!
So, it is quite easy to understand the problem of this execution plan is this extremely inefficient filter operation (which executes in a nested loop fashion). See how easy it was, when you use the right tools?

Can we have the same results with ASH? Oh yes we can!

Here is a little script that identifies the plan operation that runs for the largest percent of wall-clock time (the script gives you an option to choose sorting of plan operations by wall-clock time descending or by DB time descending. With the former you can identify the most time consuming step while with the latter you can identify the most resource consuming step of the plan):

nkarag@DWHPRD> host cat ash_ops.sql
set pagesize 999
set lines 999
accept how_to_sort prompt "Do you want operations sorted by DBtime desc (D) or Wall Clock time desc (W) - default is W:"
def how_to_sort = "W"
col sort_by_wctime noprint new_value _SORT_BY_WCTIME
col sort_by_dbtime noprint new_value _SORT_BY_DBTIME
set timing off
decode(upper(nvl('&&how_to_sort','W')),'W','','--') sort_by_wctime
select decode(upper(nvl('&&how_to_sort','W')),'D','','--') sort_by_dbtime,
from dual;
set timing on
count(*) "DB Time (secs)",
select SQL_EXEC_START, SQL_PLAN_LINE_ID, SQL_PLAN_OPERATION, SQL_PLAN_OPTIONS, owner, object_name, object_type,
count(distinct sample_time) "WC Time (secs)",
db_time_secs "DB Time Total (secs)",
select SQL_EXEC_START, sample_time, SQL_PLAN_LINE_ID, A.SQL_PLAN_OPERATION, SQL_PLAN_OPTIONS, owner, object_name, object_type, --count(*) over() cnttot,
wc_time_secs "WC Time Total (secs)",
round(count(*)/db_time_secs *100) "DB Time (%)",
round(count(distinct sample_time)/wc_time_secs *100) "WC Time (%)"
from (
count(*) over(partition by SQL_EXEC_START) db_time_secs,
and sql_id = nvl('&sql_id',sql_id) and SQL_CHILD_NUMBER = nvl('&SQL_CHILD_NUMBER',0)
count(distinct sample_time) over(partition by SQL_EXEC_START) wc_time_secs
from gv$active_session_history a left outer join dba_objects b on(a.CURRENT_OBJ# = b.object_id)
where
SAMPLE_TIME > sysdate - (&minutes_from_now/(24*60))
/
and sql_exec_id is not null
)
group by SQL_EXEC_START, db_time_secs, wc_time_secs, SQL_PLAN_LINE_ID, SQL_PLAN_OPERATION, SQL_PLAN_OPTIONS, owner, object_name, object_type --, cnttot
order by SQL_EXEC_START desc,
&_SORT_BY_DBTIME "DB Time (%)" desc
&_SORT_BY_WCTIME "WC Time (%)" desc
Listing 5: A script for returning the plan operation of a running query that consumed the most wall-clock time based on ASH. ---------------------------------------------------------------------------------------------------------------------
and here are the results when we run the above script for the problematic query which is executing:
SQL_EXEC_START SQL_PLAN_LINE_ID SQL_PLAN_OPERATION SQL_PLAN_OPTIONS OWNER OBJECT_NAME OBJECT_TYPE DB Time (secs) WC Time (secs) DB Time Total (secs) WC Time Total (secs) DB Time (%) WC Time (%)
------------------- ---------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------ ------------------- -------------- -------------- -------------------- -------------------- ----------- -----------
27/01/2015 16:17:41 44 TABLE ACCESS STORAGE FULL 989 989 3592 3592 28 28
27/01/2015 16:17:41 43 VIEW 2595 2595 3592 3592 72 72
27/01/2015 16:17:41 13 FILTER 5 5 3592 3592 0 0
27/01/2015 16:17:41 11 HASH UNIQUE 1 1 3592 3592 0 0
27/01/2015 16:17:41 14 PX COORDINATOR 2 2 3592 3592 0 0
Listing 6: Results of ASH for the problematic query. ---------------------------------------------------------------------------------------------------------------------

We have run the script for a 60 minute window (the maximun retention period for ASH) and we clearly see that the most delay is in operations 43 and 44. The results in listing 6 are sorted in descending order of percent of wall-clock time consumed by an operation.

An even better approach would be to first run the ASH script in order to identify the most consuming operation and then, run the SQL monitoring script to get an overall view of the plan and get the more accurate execution time metrics, execution counts of each step etc. Imagine that in a very long plan, it is difficult for the human eye to catch the largest values in column "Time Active(s)".

So clearly SQL monitoring ans ASH have helped us easily and quickly identify the problem. Now, of course we have to find a fix.

How to fix the problem

In the previous section we showed how easy it is to identify the "bottleneck operations" of a running query with the help of SQL monitoring and/or ASH in version 11g. Once you identify the problem, then you have only traveled half-way because you also have to fix it. The fix of course depends  by the specific case at hand (see some brief categorization of problems here).

For our running example the problem lies in the FILTER operation (operation 13 in Listing 4). The problematic full table scan (operation 44) is a child operation of this filter. The FILTER operation with two child sources is a clear indicator that the optimizer did not manage to unnest a subquery. After some careful examination of the SQL statement, we easily managed to find the pain point! It was an IN subquery after an OR condition. Something with the following form:

SELECT
FROM
WHERE
      <condition1>
   AND <condition2>
  AND (
      <confition3>
      OR <condition 4>
      OR <condition 5>
      OR <values> IN (subquery)
   )

Once we removed the bold part above from the problematic statement the query finished in 35 seconds! Actually as it turned out the this condition was redundant and could be discarded all together.

Summary

In this post we have shown a real-life example of how to identify efficiently the most time consuming operation in the execution plan of a running query that does not finish. We have used two great 11g features, namely:
  • Real-time SQL Monitoring
  • Active Session History (essentially the new columns added in 11g that reveal the operation running)
A similar example with the use of ASH we have posted in the past here.

If you have to tune queries as your daily job then you must learn to use SQL monitoring and ASH really good. They are great life saviors!