AIOUG Performance Tuning Day ! Hyderabad…

AIOUG is hosting a full-day Performance Tuning event on 30th May 2015 in Hyderabad. I will be presenting on the Oracle Query Optimizer and Performance. The sessions will be a mix of technical demos and real-life examples. Hope to see a large gathering. Registrations are open at Performance Tuning Day.


Query Performance Issue ! SQL*Plus v/s Oracle Reports…

Edited on 1st March to include the Execution Plan of the modified Query. See the end of this post.

Recently, I got a mail from one of my customers about a Query Performance Issue. The Customer mentioned that the Query, when executed from SQL*Plus, takes around 10 Seconds, whereas, when pasted into Oracle Reports and run through the Concurrent Manager, it runs for around 30-40 minutes and is finally killed. The Customer also confirmed that the Query is exactly the same, i.e. it is purely a Copy-Paste of the Query that was run from SQL*Plus, and that the Bind Type & Length too are the same. The run-time plans for the two executions are totally different, with the SQL*Plus plan better than the plan generated by Oracle Reports. The Question was, “Why is the plan different between SQL*Plus and Oracle Reports ?”

As usual, I asked for a 10053 trace of both the executions, which is one of the best ways of investigating Optimizer Calculations. The 10053 trace revealed some important facts about the Query Performance and therefore helped resolve the issue. Before we dive into the investigation, let me paste the Query and both the Execution Plans.

This is the Query.

SELECT /*+ INDEX(WDD WSH_DELIVERY_DETAILS_TI4) */
             WDD.ORGANIZATION_ID, 
             COUNT(*) LPNS,
             TRP.TRIP_ID,
             RC.CUSTOMER_NUMBER,
             RC.CUSTOMER_NAME, 
             RAA.ADDRESS1 TO_ADDRESS1, 
             RAA.CITY||' '||RAA.STATE TO_CITY_STATE, 
             HLA.DESCRIPTION FROM_DESC, 
             HLA.ADDRESS_LINE_1||' '||HLA.TOWN_OR_CITY||' '||HLA.TOWN_OR_CITY  FROM_ADDRESS1_CITY, 
             WDA.DELIVERY_ID, 
             WDA.PARENT_DELIVERY_DETAIL_ID,
             WDD.DELIVERY_DETAIL_ID, 
             WDD1.CONTAINER_NAME    CONTAINER_NAME, 
             SUM(WDD.REQUESTED_QUANTITY) REQUESTED_QUANTITY, 
             WDD.REQUESTED_QUANTITY_UOM, 
             SUM(ROUND(WDD.GROSS_WEIGHT,0)) GROSS_WEIGHT, 
             WDD.WEIGHT_UOM_CODE,
             WT.VEHICLE_NUM_PREFIX ||' '||WT.VEHICLE_NUMBER VEHICLE_NUMBER,
             MSI.SEGMENT1 ITEM_CODE, 
	     MSI.ATTRIBUTE10 PRT_DESC 
FROM 	apps.WSH_DELIVERY_DETAILS WDD,
     	apps.WSH_DELIVERY_DETAILS WDD1,
        apps.WSH_DELIVERY_ASSIGNMENTS WDA ,
        apps.WSH_DELIVERY_ASSIGNMENTS WDA1,    
        apps.WSH_TRIPS WT, 
        apps.MTL_SYSTEM_ITEMS MSI, 
	TI_RA_ADDRESSES_V RAA,
	TI_RA_CUSTOMERS_V RC ,
        apps.HR_LOCATIONS_ALL HLA,
    	(SELECT DISTINCT  T.TRIP_ID, WDA.DELIVERY_ID 
    	FROM	apps.WSH_DELIVERY_ASSIGNMENTS WDA, 
          	apps.WSH_DELIVERY_LEGS DL,
          	apps.WSH_TRIP_STOPS PICKUP_STOP,  
             	apps.WSH_TRIPS T
	WHERE	WDA.DELIVERY_ID     = DL.DELIVERY_ID 
        AND     DL.PICK_UP_STOP_ID  = PICKUP_STOP.STOP_ID 
        AND     PICKUP_STOP.TRIP_ID = T.TRIP_ID
        AND     T.TRIP_ID  >=  :P_TRIPID_FR
        AND     T.TRIP_ID  <=  :P_TRIPID_TO)  TRP
WHERE 	…
AND 	TRP.TRIP_ID >= :P_TRIPID_FR
AND 	TRP.TRIP_ID <= :P_TRIPID_TO
GROUP BY WDD.ORGANIZATION_ID , TRP.TRIP_ID , RC.CUSTOMER_NUMBER , RC.CUSTOMER_NAME , RAA.ADDRESS1 , 
RAA.CITY , RAA.STATE , HLA.DESCRIPTION , HLA.ADDRESS_LINE_1 , HLA.TOWN_OR_CITY , WDA.DELIVERY_ID , 
WT.VEHICLE_NUM_PREFIX , WT.VEHICLE_NUMBER , WDA.PARENT_DELIVERY_DETAIL_ID , WDD.DELIVERY_DETAIL_ID , 
WDD1.CONTAINER_NAME , WDD.REQUESTED_QUANTITY_UOM , WDD.WEIGHT_UOM_CODE , MSI.SEGMENT1 , MSI.ATTRIBUTE10

The Runtime Plan of the Query when Executed from SQL*Plus.

-----------------------------------------------------------------------------------------------
| Id  | Operation                                       | Name                        | Rows  |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                |                             |       |
|   1 |  SORT GROUP BY                                  |                             |     1 |
|   2 |   VIEW                                          | VM_NWVW_1                   |     1 |
|   3 |    SORT UNIQUE                                  |                             |     1 |
|*  4 |     FILTER                                      |                             |       |
|   5 |      NESTED LOOPS                               |                             |       |
|   6 |       NESTED LOOPS                              |                             |     1 |
|   7 |        NESTED LOOPS                             |                             |     1 |
|   8 |         NESTED LOOPS                            |                             |     1 |
|   9 |          NESTED LOOPS                           |                             |     5 |
|  10 |           NESTED LOOPS                          |                             |     5 |
|  11 |            NESTED LOOPS                         |                             |     5 |
|  12 |             NESTED LOOPS                        |                             |     5 |
|  13 |              NESTED LOOPS OUTER                 |                             |    27 |
|  14 |               NESTED LOOPS OUTER                |                             |    27 |
|  15 |                NESTED LOOPS                     |                             |    27 |
|  16 |                 NESTED LOOPS                    |                             |  1908 |
|  17 |                  NESTED LOOPS                   |                             |   427 |
|  18 |                   NESTED LOOPS                  |                             |   127 |
|  19 |                    MERGE JOIN                   |                             |    72 |
|  20 |                     TABLE ACCESS BY INDEX ROWID | WSH_TRIP_STOPS              |   187 |
|* 21 |                      INDEX RANGE SCAN           | WSH_TRIP_STOPS_N1           |   187 |
|* 22 |                     SORT JOIN                   |                             |    72 |
|  23 |                      TABLE ACCESS BY INDEX ROWID| WSH_TRIPS                   |    72 |
|* 24 |                       INDEX RANGE SCAN          | WSH_TRIPS_U1                |    72 |
|  25 |                    TABLE ACCESS BY INDEX ROWID  | WSH_DELIVERY_LEGS           |     2 |
|* 26 |                     INDEX RANGE SCAN            | WSH_DELIVERY_LEGS_N2        |     2 |
|* 27 |                   INDEX RANGE SCAN              | WSH_DELIVERY_ASSIGNMENTS_N1 |     3 |
|  28 |                  TABLE ACCESS BY INDEX ROWID    | WSH_DELIVERY_ASSIGNMENTS    |     4 |
|* 29 |                   INDEX RANGE SCAN              | WSH_DELIVERY_ASSIGNMENTS_N1 |     4 |
|* 30 |                 TABLE ACCESS BY INDEX ROWID     | WSH_DELIVERY_DETAILS        |     1 |
|* 31 |                  INDEX RANGE SCAN               | WSH_DELIVERY_DETAILS_TI4    |     1 |
|* 32 |                INDEX RANGE SCAN                 | WSH_DELIVERY_ASSIGNMENTS_N3 |     1 |
|  33 |               TABLE ACCESS BY INDEX ROWID       | WSH_DELIVERY_DETAILS        |     1 |
|* 34 |                INDEX UNIQUE SCAN                | WSH_DELIVERY_DETAILS_U1     |     1 |
|  35 |              TABLE ACCESS BY INDEX ROWID        | MTL_SYSTEM_ITEMS_B          |     1 |
|* 36 |               INDEX UNIQUE SCAN                 | MTL_SYSTEM_ITEMS_B_U1       |     1 |
|  37 |             TABLE ACCESS BY INDEX ROWID         | HZ_CUST_ACCOUNTS            |     1 |
|* 38 |              INDEX UNIQUE SCAN                  | HZ_CUST_ACCOUNTS_U1         |     1 |
|  39 |            TABLE ACCESS BY INDEX ROWID          | HZ_LOCATIONS                |     1 |
|* 40 |             INDEX UNIQUE SCAN                   | HZ_LOCATIONS_U1             |     1 |
|  41 |           TABLE ACCESS BY INDEX ROWID           | HZ_PARTY_SITES              |     1 |
|* 42 |            INDEX RANGE SCAN                     | HZ_PARTY_SITES_N2           |     1 |
|* 43 |          TABLE ACCESS BY INDEX ROWID            | HZ_CUST_ACCT_SITES_ALL      |     1 |
|* 44 |           INDEX RANGE SCAN                      | HZ_CUST_ACCT_SITES_N1       |     1 |
|  45 |         TABLE ACCESS BY INDEX ROWID             | HZ_PARTIES                  |     1 |
|* 46 |          INDEX UNIQUE SCAN                      | HZ_PARTIES_U1               |     1 |
|* 47 |        INDEX UNIQUE SCAN                        | HR_LOCATIONS_PK             |     1 |
|  48 |       TABLE ACCESS BY INDEX ROWID               | HR_LOCATIONS_ALL            |     1 |
-----------------------------------------------------------------------------------------------

Runtime Plan of the Query when Executed from Oracle Reports.

  
  -----------------------------------------------------------------------------------------------
  | Id  | Operation                                       | Name                        | Rows  |
  -----------------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT                                |                             |       |
  |   1 |  SORT ORDER BY                                  |                             |     1 |
  |   2 |   SORT GROUP BY                                 |                             |     1 |
  |   3 |    VIEW                                         | VM_NWVW_1                   |     1 |
  |   4 |     SORT UNIQUE                                 |                             |     1 |
  |*  5 |      FILTER                                     |                             |       |
  |   6 |       NESTED LOOPS                              |                             |       |
  |   7 |        NESTED LOOPS                             |                             |     1 |
  |   8 |         NESTED LOOPS                            |                             |     1 |
  |   9 |          NESTED LOOPS                           |                             |     4 |
  |  10 |           NESTED LOOPS                          |                             |     4 |
  |  11 |            NESTED LOOPS                         |                             |     1 |
  |  12 |             NESTED LOOPS OUTER                  |                             |     1 |
  |  13 |              NESTED LOOPS OUTER                 |                             |     1 |
  |  14 |               NESTED LOOPS                      |                             |     1 |
  |* 15 |                HASH JOIN                        |                             |     1 |
  |  16 |                 TABLE ACCESS FULL               | HZ_CUST_ACCT_SITES_ALL      | 36983 |
  |* 17 |                 HASH JOIN                       |                             | 23952 |
  |  18 |                  TABLE ACCESS FULL              | HZ_PARTY_SITES              | 76710 |
  |* 19 |                  HASH JOIN                      |                             | 22674 |
  |  20 |                   TABLE ACCESS FULL             | HZ_LOCATIONS                | 72598 |
  |* 21 |                   HASH JOIN                     |                             | 22737 |
  |  22 |                    TABLE ACCESS FULL            | HZ_CUST_ACCOUNTS            | 14291 |
  |* 23 |                    HASH JOIN                    |                             | 22801 |
  |  24 |                     TABLE ACCESS BY INDEX ROWID | MTL_SYSTEM_ITEMS_B          | 11073 |
  |* 25 |                      INDEX RANGE SCAN           | MTL_SYSTEM_ITEMS_B_N16      | 11073 |
  |* 26 |                     HASH JOIN                   |                             |   130K|
  |  27 |                      TABLE ACCESS FULL          | HR_LOCATIONS_ALL            |   919 |
  |* 28 |                      TABLE ACCESS BY INDEX ROWID| WSH_DELIVERY_DETAILS        |   130K|
  |* 29 |                       INDEX RANGE SCAN          | WSH_DELIVERY_DETAILS_TI4    |   143K|
  |* 30 |                TABLE ACCESS BY INDEX ROWID      | WSH_DELIVERY_ASSIGNMENTS    |     1 |
  |* 31 |                 INDEX RANGE SCAN                | WSH_DELIVERY_ASSIGNMENTS_N3 |     1 |
  |* 32 |               INDEX RANGE SCAN                  | WSH_DELIVERY_ASSIGNMENTS_N3 |     1 |
  |  33 |              TABLE ACCESS BY INDEX ROWID        | WSH_DELIVERY_DETAILS        |     1 |
  |* 34 |               INDEX UNIQUE SCAN                 | WSH_DELIVERY_DETAILS_U1     |     1 |
  |  35 |             TABLE ACCESS BY INDEX ROWID         | HZ_PARTIES                  |     1 |
  |* 36 |              INDEX UNIQUE SCAN                  | HZ_PARTIES_U1               |     1 |
  |* 37 |            INDEX RANGE SCAN                     | WSH_DELIVERY_ASSIGNMENTS_N1 |     4 |
  |  38 |           TABLE ACCESS BY INDEX ROWID           | WSH_DELIVERY_LEGS           |     1 |
  |* 39 |            INDEX RANGE SCAN                     | WSH_DELIVERY_LEGS_N1        |     1 |
  |* 40 |          TABLE ACCESS BY INDEX ROWID            | WSH_TRIP_STOPS              |     1 |
  |* 41 |           INDEX UNIQUE SCAN                     | WSH_TRIP_STOPS_U1           |     1 |
  |* 42 |         INDEX UNIQUE SCAN                       | WSH_TRIPS_U1                |     1 |
  |  43 |        TABLE ACCESS BY INDEX ROWID              | WSH_TRIPS                   |     1 |
  -----------------------------------------------------------------------------------------------

Since the investigation is around the Cardinality of each of the row sources, I have removed other information (irrelevant for this discussion) from the execution plans. The Query contains a Subquery with the alias TRP. The Tables queried in the Subquery drive the Good Plan, whereas the Subquery is unnested and joined at a later stage in the Problematic Plan (Steps 37-43). In the Problematic Plan, the Driving table is HR_LOCATIONS_ALL, which is hash joined to WSH_DELIVERY_DETAILS. The Investigation was around this change in the Execution Path. One of the primary goals of the Optimizer is to calculate the Number of Rows to be returned by each Filter Predicate and then evaluate the Access Path, including the Driving Table, based on the predicate that fetches the least number of rows. It also evaluates the Join Method, which is to consider a Nested Loops Join for a small set of rows or a Hash Join / Sort Merge Join for a larger set of rows.
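The cardinality-to-join-method trade-off described above can be caricatured with a toy cost model. This is not Oracle's actual costing (the probe and scan costs below are made-up constants); it only shows why a low estimated driving cardinality favours Nested Loops while a high one tips the scale towards a Hash Join:

```python
# Toy cost model -- NOT Oracle's real formulas; constants are illustrative.
# Nested Loops probes the inner index once per driving row, so its cost
# scales with the driving-row estimate; a Hash Join scans both inputs once,
# largely independent of how few rows the driving side delivers.

def nested_loop_cost(driving_rows, inner_probe_cost=3):
    return driving_rows * inner_probe_cost

def hash_join_cost(outer_scan_cost=500, inner_scan_cost=800, build_overhead=50):
    return outer_scan_cost + inner_scan_cost + build_overhead

# 72 rows is the peeked-bind estimate for WSH_TRIPS in the good plan;
# 4507 is the no-peek default estimate discussed later in this post.
for est in (72, 4507):
    nl, hj = nested_loop_cost(est), hash_join_cost()
    print(est, "->", "NESTED LOOPS" if nl < hj else "HASH JOIN")
```

With 72 driving rows the nested-loops side is far cheaper; at 4507 rows the fixed cost of building and probing a hash table wins, which mirrors the switch seen between the two plans.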

The text of the TRP Subquery is pasted below

    	(SELECT DISTINCT  T.TRIP_ID, WDA.DELIVERY_ID 
    	FROM	apps.WSH_DELIVERY_ASSIGNMENTS WDA, 
          	apps.WSH_DELIVERY_LEGS DL,
          	apps.WSH_TRIP_STOPS PICKUP_STOP,  
             	apps.WSH_TRIPS T
	WHERE	WDA.DELIVERY_ID     = DL.DELIVERY_ID 
        AND     DL.PICK_UP_STOP_ID  = PICKUP_STOP.STOP_ID 
        AND     PICKUP_STOP.TRIP_ID = T.TRIP_ID
        AND     T.TRIP_ID  >=  : P_TRIPID_FR
        AND     T.TRIP_ID  <=  : P_TRIPID_TO )  TRP

The Good Plan for this Subquery is a Sort Merge Join of WSH_TRIPS & WSH_TRIP_STOPS. Both tables are accessed via an Index on TRIP_ID. Note that though the range predicate is on TRIP_ID of WSH_TRIPS, the column is joined to TRIP_ID of WSH_TRIP_STOPS, and therefore the optimizer scans an Index on these columns to get the relevant rows from each table. The Optimizer calculation says 72 rows from WSH_TRIPS and 187 rows from WSH_TRIP_STOPS, which are merge joined to get the final 72 rows from the two tables. This calculation is based on the Input values that the Optimizer peeked at during the hard parse. For the same set of input values, we have a 10053 trace of the Bad Plan, and therefore I immediately concentrated on the cardinality calculation for the two tables mentioned here. Surprisingly, the first caveat that I observed in the 10053 trace was the missing Bind Peeked Data, and this was enough to get into the Optimizer calculations.
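The Sort Merge Join in the good plan walks two row sources that are each sorted on TRIP_ID and zips them together. A minimal, generic Python sketch of the idea (the tuples and values below are made up for illustration):

```python
def merge_join(left, right):
    """Join two lists of (key, value) rows that are sorted on the join key,
    emitting matching combinations -- the same idea as the SORT JOIN +
    MERGE JOIN steps in the plan (simplified: assumes unique keys on left)."""
    out, j = [], 0
    for lkey, lval in left:
        # advance the right-hand cursor past smaller keys
        while j < len(right) and right[j][0] < lkey:
            j += 1
        # emit every right-hand row sharing this key
        k = j
        while k < len(right) and right[k][0] == lkey:
            out.append((lkey, lval, right[k][1]))
            k += 1
    return out

trips = [(101, 'T101'), (102, 'T102')]            # e.g. WSH_TRIPS rows, sorted
stops = [(101, 'S1'), (101, 'S2'), (103, 'S3')]   # e.g. WSH_TRIP_STOPS rows, sorted
print(merge_join(trips, stops))   # [(101, 'T101', 'S1'), (101, 'T101', 'S2')]
```

Because both inputs arrive pre-sorted from the TRIP_ID indexes, each side is scanned only once, which is why this join method is attractive for the 72- and 187-row estimates above.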

The Peeked Bind Values section of the 10053 trace shows the following information.

*******************************************
Peeked values of the binds in SQL statement
*******************************************
----- Bind Info (kkscoacd) -----
 Bind#0
  oacdty=02 mxl=22(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000000 frm=00 csi=00 siz=72 off=0
  No bind buffers allocated
 Bind#1
  oacdty=02 mxl=22(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000000 frm=00 csi=00 siz=0 off=24
  No bind buffers allocated
 Bind#2
  oacdty=02 mxl=22(01) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000000 frm=00 csi=00 siz=0 off=48
  No bind buffers allocated
 Bind#3
  No oacdef for this bind.
 Bind#4
  No oacdef for this bind.

When the Query was executed from SQL*Plus, the values for the Binds were populated and were visible in the Peeked Bind Values section of the 10053 trace. The missing information means the Optimizer falls back to the default calculation for a range predicate. From the trace, the computed cardinalities for the two tables, WSH_TRIPS & WSH_TRIP_STOPS, are as under:

Access path analysis for WSH_TRIP_STOPS
***************************************
SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for WSH_TRIP_STOPS[PICKUP_STOP] 

  Table: WSH_TRIP_STOPS  Alias: PICKUP_STOP
    Card: Original: 4666397.000000  Rounded: 11666  Computed: 11665.99  Non Adjusted: 11665.99

Access path analysis for WSH_TRIPS
***************************************
SINGLE TABLE ACCESS PATH 
  Single Table Cardinality Estimation for WSH_TRIPS[T] 

  Table: WSH_TRIPS  Alias: T
    Card: Original: 1802653.000000  Rounded: 4507  Computed: 4506.63  Non Adjusted: 4506.63

In the absence of actual values for the Bind Variables, the Optimizer computed the Cardinality based on defaults, which in this case is 0.25% (5% for the >= predicate and 5% for the <= predicate, multiplied together). With this default, the computed cardinality of 4507 rows from WSH_TRIPS for the predicates TRIP_ID >= :P_TRIPID_FR and TRIP_ID <= :P_TRIPID_TO is too high. Another predicate in the Outer query is on ORGANIZATION_ID, and based on this predicate, the optimizer generates a plan with the smallest rowsource first. The smallest rowsource is HR_LOCATIONS_ALL with 919 rows, which is joined to the WSH_DELIVERY_DETAILS table, accessed via the hinted index WSH_DELIVERY_DETAILS_TI4.
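As a cross-check, the default-selectivity arithmetic can be reproduced from the table cardinalities printed in the trace. The sketch below hardcodes the well-known 5%-per-unbounded-range default:

```python
# Default range selectivity when bind values are unknown: 5% per
# unbounded range predicate; two ANDed ranges -> 0.05 * 0.05 = 0.25%.
DEFAULT_RANGE_SEL = 0.05

def default_card(num_rows, n_range_predicates=2):
    return num_rows * DEFAULT_RANGE_SEL ** n_range_predicates

# "Card: Original" figures from the 10053 trace above:
print(round(default_card(1802653), 2))   # WSH_TRIPS      -> 4506.63
print(round(default_card(4666397), 2))   # WSH_TRIP_STOPS -> 11665.99
```

Both results match the "Computed" cardinalities in the trace (4506.63 and 11665.99), confirming that the optimizer used the no-peek defaults rather than the actual bind values.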

To check and confirm this behaviour, I ran an EXPLAIN PLAN FOR on this query from SQL*Plus, without setting the values for the bind variables. With no values for the Binds, EXPLAIN PLAN produced the BAD plan from SQL*Plus as well. Therefore, I was now sure that the problem occurs when the peeked values for the binds are not available. The next question was, “Why does Oracle Reports not peek at the Bind Variables before generating the plan ?”.

I assume this to be default behaviour, where the query is parsed first and the binds are attached later. Unfortunately, there is not much written on this. A better option is to write the query in an efficient way or hint it. To resolve the issue, I modified the query, and this modified version worked perfectly well in both environments, i.e. SQL*Plus and Oracle Reports. The modification was in the Subquery TRP. The change is pasted below.

    (SELECT TRIP_ID, DELIVERY_ID FROM 
    (SELECT T.TRIP_ID, WDA.DELIVERY_ID, row_number() over(partition by t.trip_id, wda.delivery_id order by t.trip_id) rnum 
                FROM   apps.WSH_DELIVERY_ASSIGNMENTS WDA, 
                                apps.WSH_DELIVERY_LEGS DL,
                                apps.WSH_TRIP_STOPS PICKUP_STOP,  
                                apps.WSH_TRIPS T
                WHERE   WDA.DELIVERY_ID     = DL.DELIVERY_ID 
        AND     DL.PICK_UP_STOP_ID  = PICKUP_STOP.STOP_ID 
        AND     PICKUP_STOP.TRIP_ID = T.TRIP_ID
        AND     T.TRIP_ID  >=  : P_TRIPID_FR
        AND     T.TRIP_ID  <=  : P_TRIPID_TO)
     WHERE RNUM=1)  TRP

The modification made sure that the TRP subquery becomes the Driving Query. The relevant portion of the execution plan post-modification is attached below. The calculation for WSH_TRIPS is still 0.25% and is therefore computed as 4507 rows; even then, the Optimizer takes this as the Driving Table, which is what we wanted. This again leads to a common argument: the Optimizer is a piece of Code that works on Statistics, and it is the way we write a query that dominates the Optimizer.


|* 19 |                  WINDOW SORT PUSHED RANK         |                             | 26689 |
|* 20 |                   FILTER                         |                             |       |
|  21 |                    NESTED LOOPS                  |                             | 26689 |
|  22 |                     NESTED LOOPS                 |                             |  7968 |
|  23 |                      NESTED LOOPS                |                             |  4516 |
|* 24 |                       INDEX RANGE SCAN           | WSH_TRIPS_U1                |  4507 |
|  25 |                       TABLE ACCESS BY INDEX ROWID| WSH_TRIP_STOPS              |     1 |
|* 26 |                        INDEX RANGE SCAN          | WSH_TRIP_STOPS_N1           |     1 |
|  27 |                      TABLE ACCESS BY INDEX ROWID | WSH_DELIVERY_LEGS           |     2 |
|* 28 |                       INDEX RANGE SCAN           | WSH_DELIVERY_LEGS_N2        |     2 |
|* 29 |                     INDEX RANGE SCAN             | WSH_DELIVERY_ASSIGNMENTS_N1 |     3 |

  20 - filter(:P_TRIPID_FR<=:P_TRIPID_TO)
  24 - access("T"."TRIP_ID">=:P_TRIPID_FR AND "T"."TRIP_ID"<=:P_TRIPID_TO)
  26 - access("PICKUP_STOP"."TRIP_ID"="T"."TRIP_ID")
       filter("PICKUP_STOP"."TRIP_ID">=:P_TRIPID_FR AND "PICKUP_STOP"."TRIP_ID"<=:P_TRIPID_TO)
  28 - access("DL"."PICK_UP_STOP_ID"="PICKUP_STOP"."STOP_ID")
  29 - access("WDA"."DELIVERY_ID"="DL"."DELIVERY_ID")
       filter("WDA"."DELIVERY_ID" IS NOT NULL)
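The rewrite relies on the fact that keeping only the rows with rnum = 1 from each (TRIP_ID, DELIVERY_ID) partition returns the same row set as the original SELECT DISTINCT, while the analytic function stops the optimizer from unnesting the view. A small Python sketch of that equivalence (the sample tuples are made up):

```python
# SELECT DISTINCT and "ROW_NUMBER() OVER (PARTITION BY ...) ... WHERE rnum = 1"
# both return one row per distinct (trip_id, delivery_id) pair.

def distinct_pairs(rows):
    return sorted(set(rows))

def rownum_filter(rows):
    """Keep the first row seen per partition, i.e. the rnum = 1 row."""
    seen, out = set(), []
    for pair in rows:
        if pair not in seen:
            seen.add(pair)
            out.append(pair)
    return sorted(out)

rows = [(101, 9001), (101, 9001), (101, 9002), (102, 9003)]
assert distinct_pairs(rows) == rownum_filter(rows)
print(rownum_filter(rows))   # [(101, 9001), (101, 9002), (102, 9003)]
```

So the result set is unchanged; only the shape of the query, and hence the optimizer's freedom to transform it, differs.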

Yet another Query Plan Change ! Cost Based Optimization…

The Optimizer relies on Statistics to compute an Optimal Plan, which includes the choice between an Index Scan and a Table Scan, in case of an Index Scan, which index to choose, and, more importantly, the Join Order and Join Method. Therefore, it is quite evident that accurate statistics mean a good and accurate estimation of Cardinality, which in turn results in an optimal execution plan. But there are cases when accurate statistics are not good enough for the Optimizer to come out with an Optimal plan. In this blog, I will post a real-life issue, an interesting case in which the Optimizer-generated plan impacted the performance of a query even though the underlying object statistics were accurate. Accurate statistics means statistics that reflect an accurate picture of the data in the underlying table.

In many of the events at which I have presented, one of the common statements I emphasize is “Differentiate between a Performance Symptom and a Performance Problem”. In this case too, the inhouse team was investigating and focussing more on the Symptom, while the problem was somewhere else. We will walk through this as well.

Recently, at a customer site, the performance of one critical process, which runs every few minutes, got impacted. The inhouse administration team found that the execution time of one particular query, which usually takes less than a second, had changed to 30-40 minutes. Fortunately, the customer maintains Query Baselines, and this helped them delve further into the issue; they observed that the plan had changed. While they were investigating the reason for this plan change, the issue was temporarily fixed using SQL Plan Management. The customer wanted to know the reason for this behaviour, so as to avoid it in future.

This was a custom process. The original query and the good / bad plans are pasted below.

SELECT WIP_ENTITY_NAME FROM WIP_DJ_CLOSE_TEMP 
WHERE	WIP_ENTITY_ID IN
	(SELECT wdct.WIP_ENTITY_ID            
	FROM	WIP_TRANSACTIONS wt,
		WIP_DJ_CLOSE_TEMP wdct             
	WHERE	wdct.GROUP_ID = :b1             
	AND	wdct.ORGANIZATION_ID = :b2
	AND	wdct.WIP_ENTITY_ID = wt.WIP_ENTITY_ID             
	AND	wt.ORGANIZATION_ID = :b3             
	AND	wt.TRANSACTION_DATE > wdct.ACTUAL_CLOSE_DATE   
	UNION    
	SELECT wdct.WIP_ENTITY_ID            
	FROM	MTL_MATERIAL_TRANSACTIONS mmt,
		WIP_DJ_CLOSE_TEMP wdct             
	WHERE	wdct.GROUP_ID = :b1             
	AND	wdct.ORGANIZATION_ID = :b2
	AND	wdct.WIP_ENTITY_ID = mmt.TRANSACTION_SOURCE_ID            
	AND	mmt.TRANSACTION_SOURCE_TYPE_ID = 5             
	AND	mmt.ORGANIZATION_ID= :b3
	AND mmt.TRANSACTION_DATE > wdct.ACTUAL_CLOSE_DATE)

Good Plan

Plan hash value: 2500423228

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                              |       |       |  6017 (100)|          |       |       |
|   1 |  NESTED LOOPS                           |                              |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                          |                              |    70 |  1960 |  6017   (1)| 00:00:36 |       |       |
|   3 |    VIEW                                 | VW_NSO_1                     |    70 |   910 |  5947   (1)| 00:00:35 |       |       |
|   4 |     SORT UNIQUE                         |                              |    70 |  3040 |  5947  (92)| 00:00:35 |       |       |
|   5 |      UNION-ALL                          |                              |       |       |            |          |       |       |
|   6 |       NESTED LOOPS                      |                              |       |       |            |          |       |       |
|   7 |        NESTED LOOPS                     |                              |    45 |  1890 |   490   (1)| 00:00:03 |       |       |
|*  8 |         TABLE ACCESS BY INDEX ROWID     | WIP_DJ_CLOSE_TEMP            |    80 |  1920 |     9   (0)| 00:00:01 |       |       |
|*  9 |          INDEX RANGE SCAN               | WIP_DJ_CLOSE_TEMP_N1         |   797 |       |     2   (0)| 00:00:01 |       |       |
|* 10 |         INDEX RANGE SCAN                | WIP_TRANSACTIONS_N1          |    24 |       |     0   (0)|          |       |       |
|* 11 |        TABLE ACCESS BY INDEX ROWID      | WIP_TRANSACTIONS             |     1 |    18 |     6   (0)| 00:00:01 |       |       |
|  12 |       NESTED LOOPS                      |                              |       |       |            |          |       |       |
|  13 |        NESTED LOOPS                     |                              |    25 |  1150 |  5455   (1)| 00:00:32 |       |       |
|* 14 |         TABLE ACCESS BY INDEX ROWID     | WIP_DJ_CLOSE_TEMP            |    80 |  1920 |     9   (0)| 00:00:01 |       |       |
|* 15 |          INDEX RANGE SCAN               | WIP_DJ_CLOSE_TEMP_N1         |   797 |       |     2   (0)| 00:00:01 |       |       |
|  16 |         PARTITION RANGE ITERATOR        |                              |     2 |       |    67   (0)| 00:00:01 |   KEY |    36 |
|* 17 |          INDEX RANGE SCAN               | MTL_MATERIAL_TRANSACTIONS_N2 |     2 |       |    67   (0)| 00:00:01 |   KEY |    36 |
|* 18 |        TABLE ACCESS BY LOCAL INDEX ROWID| MTL_MATERIAL_TRANSACTIONS    |     1 |    22 |    68   (0)| 00:00:01 |     1 |     1 |
|* 19 |    INDEX UNIQUE SCAN                    | WIP_DJ_CLOSE_TEMP_U1         |     1 |       |     0   (0)|          |       |       |
|  20 |   TABLE ACCESS BY INDEX ROWID           | WIP_DJ_CLOSE_TEMP            |     1 |    15 |     1   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------------

Bad Plan

Plan hash value: 821054275

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                         | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                              |       |       |  7295 (100)|          |       |       |
|   1 |  NESTED LOOPS                           |                              |    19 |   532 |  7295   (1)| 00:00:43 |       |       |
|   2 |   VIEW                                  | VW_NSO_1                     |    19 |   247 |  7276   (1)| 00:00:43 |       |       |
|   3 |    SORT UNIQUE                          |                              |    19 |   826 |  7276  (88)| 00:00:43 |       |       |
|   4 |     UNION-ALL                           |                              |       |       |            |          |       |       |
|*  5 |      TABLE ACCESS BY INDEX ROWID        | WIP_TRANSACTIONS             |     1 |    18 |     6   (0)| 00:00:01 |       |       |
|   6 |       NESTED LOOPS                      |                              |    12 |   504 |   930   (1)| 00:00:06 |       |       |
|*  7 |        TABLE ACCESS BY INDEX ROWID      | WIP_DJ_CLOSE_TEMP            |   152 |  3648 |    16   (0)| 00:00:01 |       |       |
|*  8 |         INDEX RANGE SCAN                | WIP_DJ_CLOSE_TEMP_N1         |  1522 |       |     4   (0)| 00:00:01 |       |       |
|*  9 |        INDEX RANGE SCAN                 | WIP_TRANSACTIONS_N1          |    24 |       |     0   (0)|          |       |       |
|* 10 |      HASH JOIN                          |                              |     7 |   322 |  6344   (1)| 00:00:37 |       |       |
|* 11 |       TABLE ACCESS BY INDEX ROWID       | WIP_DJ_CLOSE_TEMP            |   152 |  3648 |    16   (0)| 00:00:01 |       |       |
|* 12 |        INDEX RANGE SCAN                 | WIP_DJ_CLOSE_TEMP_N1         |  1522 |       |     4   (0)| 00:00:01 |       |       |
|  13 |       PARTITION RANGE ALL               |                              | 25149 |   540K|  6327   (1)| 00:00:37 |     1 |    36 |
|  14 |        TABLE ACCESS BY LOCAL INDEX ROWID| MTL_MATERIAL_TRANSACTIONS    | 25149 |   540K|  6327   (1)| 00:00:37 |     1 |    36 |
|* 15 |         INDEX RANGE SCAN                | MTL_MATERIAL_TRANSACTIONS_N8 | 25149 |       |   207   (1)| 00:00:02 |     1 |    36 |
|  16 |   TABLE ACCESS BY INDEX ROWID           | WIP_DJ_CLOSE_TEMP            |     1 |    15 |     1   (0)| 00:00:01 |       |       |
|* 17 |    INDEX UNIQUE SCAN                    | WIP_DJ_CLOSE_TEMP_U1         |     1 |       |     0   (0)|          |       |       |
----------------------------------------------------------------------------------------------------------------------------------------

Before diving into the analysis, let us compare both the plans, from the customer’s perspective. They compared the plans and noticed that the Tables and Indexes selected by the optimizer are, in both cases, the same, except for the MTL_MATERIAL_TRANSACTIONS Table: the good plan uses the N2 Index while the bad plan uses N8. Therefore, the customer hinted the query with the N2 index and fixed the plan using SPM. Once fixed, they started their investigation into the reason for this change from N2 to N8. As mentioned earlier, this was just a Symptom, while the underlying problem was something else. That something triggered the change in Index Selection, and we will investigate it next.

Analysis

Comparing both the plans, it was now evident that the problem is with the second part of the Query; therefore, I took this part out for investigation. This is now a simple and small query, as shown below:

SELECT wdct.WIP_ENTITY_ID            
FROM	apps.MTL_MATERIAL_TRANSACTIONS mmt,
	wip.WIP_DJ_CLOSE_TEMP wdct             
WHERE	wdct.GROUP_ID = :b1             
AND	wdct.ORGANIZATION_ID = :b2
AND	wdct.WIP_ENTITY_ID = mmt.TRANSACTION_SOURCE_ID            
AND	mmt.TRANSACTION_SOURCE_TYPE_ID = 5             
AND	mmt.ORGANIZATION_ID= :b3             
AND	mmt.TRANSACTION_DATE > wdct.ACTUAL_CLOSE_DATE

Another change that I noticed here was from a Nested Loop join to a Hash join. The customer had a clone of this database that was 2 months old, and I executed this query on the clone environment. The query executed in less than a second there, which made the investigation much simpler for me. The query plans on production and the clone are pasted below :

Production Plan : BAD
Plan hash value: 3088015387

------------------------------------------------------------------------------------
| Id  | Operation                           | Name                         | Rows  |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                              |       |
|*  1 |  HASH JOIN                          |                              |    49 |
|*  2 |   TABLE ACCESS BY INDEX ROWID       | WIP_DJ_CLOSE_TEMP            |   155 |
|*  3 |    INDEX RANGE SCAN                 | WIP_DJ_CLOSE_TEMP_N1         |  1552 |
|   4 |   PARTITION RANGE ALL               |                              | 24662 |
|   5 |    TABLE ACCESS BY LOCAL INDEX ROWID| MTL_MATERIAL_TRANSACTIONS    | 24662 |
|*  6 |     INDEX RANGE SCAN                | MTL_MATERIAL_TRANSACTIONS_N8 | 24662 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("WDCT"."WIP_ENTITY_ID"="MMT"."TRANSACTION_SOURCE_ID")
       filter("MMT"."TRANSACTION_DATE">"WDCT"."ACTUAL_CLOSE_DATE")
   2 - filter("WDCT"."ORGANIZATION_ID"=:ORGANIZATION_ID)
   3 - access("WDCT"."GROUP_ID"=:GROUP_ID)
   6 - access("MMT"."TRANSACTION_SOURCE_TYPE_ID"=5 AND "MMT"."ORGANIZATION_ID"=:ORGANIZATION_ID)

Clone DB Plan : GOOD
Plan hash value: 569619275

-----------------------------------------------------------------------------------
| Id  | Operation                          | Name                         | Rows  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                              |       |
|   1 |  NESTED LOOPS                      |                              |       |
|   2 |   NESTED LOOPS                     |                              |    26 |
|*  3 |    TABLE ACCESS BY INDEX ROWID     | WIP_DJ_CLOSE_TEMP            |    80 |
|*  4 |     INDEX RANGE SCAN               | WIP_DJ_CLOSE_TEMP_N1         |   798 |
|   5 |    PARTITION RANGE ITERATOR        |                              |     2 |
|*  6 |     INDEX RANGE SCAN               | MTL_MATERIAL_TRANSACTIONS_N2 |     2 |
|*  7 |   TABLE ACCESS BY LOCAL INDEX ROWID| MTL_MATERIAL_TRANSACTIONS    |     1 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("WDCT"."ORGANIZATION_ID"=:ORGANIZATION_ID)
   4 - access("WDCT"."GROUP_ID"=:GROUP_ID)
   6 - access("WDCT"."WIP_ENTITY_ID"="MMT"."TRANSACTION_SOURCE_ID" AND "MMT"."ORGANIZATION_ID"=:ORGANIZATION_ID AND
              "MMT"."TRANSACTION_DATE">"WDCT"."ACTUAL_CLOSE_DATE")
   7 - filter("MMT"."TRANSACTION_SOURCE_TYPE_ID"=5)

Notice the estimated rows calculated by the optimizer for the driving index and table. On the clone db, the optimizer estimates 798 rows from the index WIP_DJ_CLOSE_TEMP_N1 and then 80 rows from the table access step. On production, the optimizer computes 1552 rows from the driving index and then 155 rows from the table access. The optimizer further computes that 155 iterations over MTL_MATERIAL_TRANSACTIONS (Nested Loop) would be a costly affair and therefore opts for a Hash join. Since the clone db was 2 months old, the statistics there were also old. Once I was convinced that the estimated cardinality for the driving table was the issue, I imported the production stats of only the WIP_DJ_CLOSE_TEMP table to the clone and could simulate the performance impact. The table and relevant column level statistics of WIP_DJ_CLOSE_TEMP are pasted below, from both the clone and production. I have removed unwanted column stats from this post.

Clone Stats - Performance Optimal

SQL> @table_stats
Enter value for 1: WIP_DJ_CLOSE_TEMP
old   2: where table_name='&1'
new   2: where table_name='WIP_DJ_CLOSE_TEMP'

OWNER                          PAR   NUM_ROWS     BLOCKS LAST_ANAL GLO T
------------------------------ --- ---------- ---------- --------- --- -
WIP                            NO      118836        829 26-SEP-10 YES N

Elapsed: 00:00:00.23
SQL> @column_stats
Enter value for 1: WIP
old   3: where owner='&1'
new   3: where owner='WIP'
Enter value for 2: WIP_DJ_CLOSE_TEMP
old   4: and   table_name='&2'
new   4: and   table_name='WIP_DJ_CLOSE_TEMP'

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------- ---------------
GROUP_ID                                149          0 .006711409 NONE
ORGANIZATION_ID                          10          0         .1 NONE

Production Stats - Performance sub-optimal

OWNER                          PAR   NUM_ROWS     BLOCKS LAST_ANAL GLO
------------------------------ --- ---------- ---------- --------- ---
WIP                            NO      116383        829 28-NOV-10 YES

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------- ---------------
GROUP_ID                                 75          0 .013333333 NONE
ORGANIZATION_ID                          10          0         .1 NONE

Index Details of MTL_MATERIAL_TRANSACTIONS

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
MTL_MATERIAL_TRANSACTIONS_N2   TRANSACTION_SOURCE_ID                        1
                               ORGANIZATION_ID                              2
                               TRANSACTION_DATE                             3

MTL_MATERIAL_TRANSACTIONS_N8   TRANSACTION_SOURCE_TYPE_ID                   1
                               ORGANIZATION_ID                              2
                               TRANSACTION_DATE                             3
                               TRANSACTION_SOURCE_NAME                      4

Issue

As mentioned at the start of this post, while the statistics are relevant for the choice between an index scan and a table scan (and, in case of an index scan, which index to select), the most important factor is the computed cardinality of the driving table, as this has an impact on the join order and join type. Further, the join type dictates the selection of an index for the other tables.

The difference between a Nested Loop join and a Hash join, in terms of index selection, is this: a Nested Loop join selects rows from the driving table and then uses an index on the join column of the second table, if one exists (else it will be a hash join). Therefore, in the good plan, with the Nested Loop join in place, the optimizer opted for the N2 index of the MTL_MATERIAL_TRANSACTIONS table, which is an index on the join column. A Hash join, on the other hand, tries to filter out as many rows as possible from both tables using indexes on the filter predicates, and once the rows are filtered, joins them using a hash function on the join columns. Therefore, in the Hash join plan, the N8 index of MTL_MATERIAL_TRANSACTIONS is on a filter column, and the filtered rows were then joined on the join column. See the PREDICATE INFORMATION for both the good and bad plans, and the index column details for MTL_MATERIAL_TRANSACTIONS, pasted above.
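
To make the contrast concrete, here is a toy Python sketch of the two join strategies. This is illustrative only, not Oracle internals; the tiny wdct/mmt rows and helper names are hypothetical:

```python
# Hypothetical driving rows (WIP_DJ_CLOSE_TEMP) and probe rows (MTL_MATERIAL_TRANSACTIONS)
wdct = [{"wip_entity_id": 1}, {"wip_entity_id": 2}]
mmt  = [{"transaction_source_id": 1, "qty": 10},
        {"transaction_source_id": 1, "qty": 20},
        {"transaction_source_id": 3, "qty": 30}]

# Nested Loop: probe an index keyed on the JOIN column, once per driving row.
join_index = {}
for m in mmt:
    join_index.setdefault(m["transaction_source_id"], []).append(m)

def nested_loop(driving, idx):
    return [(d["wip_entity_id"], m["qty"])
            for d in driving
            for m in idx.get(d["wip_entity_id"], [])]

# Hash Join: both row sources are filtered independently first, then matched
# via a hash table built on the join column (no join-column index needed).
def hash_join(driving, probe):
    build = {}
    for d in driving:
        build.setdefault(d["wip_entity_id"], []).append(d)
    return [(d["wip_entity_id"], m["qty"])
            for m in probe
            for d in build.get(m["transaction_source_id"], [])]

print(nested_loop(wdct, join_index))  # [(1, 10), (1, 20)]
print(hash_join(wdct, mmt))           # [(1, 10), (1, 20)]
```

Both produce the same rows; what changes is the access pattern, and that is exactly why the join type dictates which index (join column vs filter column) looks attractive.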

The Calculation : Clone Old Stats

Let us get into the computed cardinality calculation for the driving table, as this had an impact on the Join Type. On clone the number of Rows are : 118836 and the WHERE predicate for this table is as under :

WHERE	wdct.GROUP_ID = :b1             
AND	wdct.ORGANIZATION_ID = :b2

The table has an index on the GROUP_ID column, and the calculation says 118836*1/149 = 797.55, rounded off to 798. Further, the rows are filtered on the non-indexed column ORGANIZATION_ID: 798*1/10 = 79.8, rounded off to 80. The clone db (good) plan shows the same.

The Calculation : Production Stats

On production, the computed cardinality for the index scan step is 116383*1/75 = 1551.77, rounded off to 1552. Further, the filter on the non-indexed column gives 1552*1/10 = 155.2, rounded off to 155. The production plan shows the same.
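
The same arithmetic can be sketched in Python as a sanity check of the formula num_rows * 1/NDV per equality predicate (the helper name driving_card is mine):

```python
def driving_card(num_rows, ndv_group_id, ndv_org_id):
    """Estimated cardinality of the driving table, per the standard
    equality-predicate selectivity 1/NUM_DISTINCT for each column."""
    # Index step: GROUP_ID = :b1 on the indexed column
    index_rows = round(num_rows / ndv_group_id)
    # Table step: additional filter ORGANIZATION_ID = :b2 on a non-indexed column
    table_rows = round(index_rows / ndv_org_id)
    return index_rows, table_rows

print(driving_card(118836, 149, 10))  # clone stats      -> (798, 80)
print(driving_card(116383, 75, 10))   # production stats -> (1552, 155)
```

The jump from 80 to 155 estimated driving rows is what tipped the optimizer's join-method choice from Nested Loop to Hash join.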

On production, the statistics had been gathered on Sunday, and this triggered the change in plan. These stats were accurate; even then, the optimizer came out with a sub-optimal plan. As mentioned, we could simulate the issue on the clone db after importing the statistics of the driving table, WIP_DJ_CLOSE_TEMP. As a resolution, and to check the impact, we modified the column level NUM_DISTINCT of the GROUP_ID column on the clone db to 200, and the plan changed back to the good one.

As the name of the driving table implies, WIP_DJ_CLOSE_TEMP is an interface table: rows are continually deleted from and inserted into it, and it usually contains roughly 0.1 million rows. Therefore, as a final action, once the column level statistics for the GROUP_ID column were fixed on production, we locked the statistics of this table. This was to ensure that the plan of this query is not impacted by periodic statistics gathering and remains stable.

After manually setting the column level statistics for GROUP_ID column to 200, the plan is as under :

Plan hash value: 569619275

-----------------------------------------------------------------------------------
| Id  | Operation                          | Name                         | Rows  |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                              |       |
|   1 |  NESTED LOOPS                      |                              |       |
|   2 |   NESTED LOOPS                     |                              |    18 |
|*  3 |    TABLE ACCESS BY INDEX ROWID     | WIP_DJ_CLOSE_TEMP            |    58 |
|*  4 |     INDEX RANGE SCAN               | WIP_DJ_CLOSE_TEMP_N1         |   582 |
|   5 |    PARTITION RANGE ITERATOR        |                              |     2 |
|*  6 |     INDEX RANGE SCAN               | MTL_MATERIAL_TRANSACTIONS_N2 |     2 |
|*  7 |   TABLE ACCESS BY LOCAL INDEX ROWID| MTL_MATERIAL_TRANSACTIONS    |     1 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("WDCT"."ORGANIZATION_ID"=:ORGANIZATION_ID)
   4 - access("WDCT"."GROUP_ID"=:GROUP_ID)
   6 - access("WDCT"."WIP_ENTITY_ID"="MMT"."TRANSACTION_SOURCE_ID" AND "MMT"."ORGANIZATION_ID"=:ORGANIZATION_ID AND
              "MMT"."TRANSACTION_DATE">"WDCT"."ACTUAL_CLOSE_DATE")
   7 - filter("MMT"."TRANSACTION_SOURCE_TYPE_ID"=5)

Production Stats - After setting NUM_DISTINCT of GROUP_ID to 200

OWNER                          PAR   NUM_ROWS     BLOCKS LAST_ANAL GLO
------------------------------ --- ---------- ---------- --------- ---
WIP                            NO      116383        829 28-NOV-10 YES

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------- ---------------
GROUP_ID                                200          0 .013333333 NONE
ORGANIZATION_ID                          10          0         .1 NONE

The calculation now stands as follows. Index step: 116383*1/200 = 581.91, rounded off to 582. Table access: 582*1/10 = 58.2, rounded off to 58. This resolved the issue.

In this scenario, investigating the statistics on the MTL_MATERIAL_TRANSACTIONS table would have been futile, as the index change was just a symptom of the problem with the stats on the driving table.

Cardinality Feedback ! 11g New Auto Optimization feature..

A few days back, I was working on a minor performance issue post an 11g upgrade. I call it a minor issue because the customer had done extensive testing, and as a result the performance and execution plans of all but one or two queries were optimal. While I was working out which 11g optimizer behaviour could have caused the plan change for these two SQLs, I came across the message “Cardinality Feedback used for this statement” just under the runtime plan. This was something new to me, so I started investigating it. While I was still on the job, the onsite dba asked me whether I had made some changes at the db level. He wanted to confirm this because, all of a sudden, the execution plans for these 2 queries had changed from bad to good. This was a surprise. The change in plan was due to Cardinality Feedback, introduced in 11g.

Today, I was at another customer site for a small round of discussion when someone mentioned that, post 11g upgrade, one of their batch processes takes longer, and they had found that the plan of one main query had changed for the worse. I logged in to the system and again noticed the “Cardinality Feedback used for this statement” message. I made a small change in the query (just a change of select to Select) so that it would be hard parsed. The query came back in less than a minute, which was the case earlier, i.e. on 10g. I re-executed the query, and this time I had to cancel it after 10 minutes. The plan changed at runtime, and this change took the query from less than a minute to more than 10 minutes and still executing. This time, at this customer site, the runtime plan change was from good to bad.

What is Cardinality Feedback ?

As of now, there is not much documentation available on this optimization strategy or feature, and therefore whatever I write on it is purely guesswork. I am still investigating, and anything I find will feature in my next blog posts.

I assume that the strategy works on the internal mechanism of the GATHER_PLAN_STATISTICS hint, which records Estimated Rows and Actual Rows. When the query is hard parsed, the optimizer generates an execution plan based on the estimated rows; as usual, this estimate comes from the object and column statistics. At the end of the query execution, the optimizer also retains the actual rows fetched by each of the steps. If a huge difference is observed between the two (E-Rows and A-Rows), then during subsequent executions of the same query, the optimizer generates another child cursor with a new execution plan.
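
Purely as a thought experiment under the guesswork above, the decision could be sketched like this. The 8x mismatch threshold and the function name are arbitrary assumptions of mine, not documented Oracle values:

```python
# Conceptual sketch only (my guess at the mechanism, not Oracle code):
# after an execution, compare estimated vs actual rows; on a big mismatch,
# mark the statement for re-optimization with a new child cursor.
def needs_reoptimization(e_rows, a_rows, threshold=8):
    """Return True if actuals differ from estimates by `threshold`x or more."""
    ratio = max(e_rows, a_rows) / max(1, min(e_rows, a_rows))
    return ratio >= threshold

print(needs_reoptimization(53, 50))    # estimate close enough: no re-plan
print(needs_reoptimization(53, 5300))  # 100x off: generate a new child cursor
```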

This optimization strategy is controlled by an underscore parameter _optimizer_use_feedback that defaults to true.

Optimizer Hint that Ignores Histogram and Column Stats

A question raised on an internal forum was: “is there any optimizer hint that causes the optimizer to ignore histogram information ?” The question seemed interesting, and I thought of posting my reply here.

The CARDINALITY hint directs the optimizer to take the supplied value as the number of rows to be processed by a rowset or a table. This hint was particularly useful in versions prior to 10g, due to optimizer calculation mismatches with dependent columns. A quick demonstration below :

  1. I created a table vivek_test with 10000 rows
  2. Gathered Stats on this table, with Histogram on TEMPORARY column
  3. Then I executed the first query without any Hint and the Optimizer accurately calculated 53 rows for TEMPORARY=’Y’
  4. Then the second query with CARDINALITY hint and the Optimizer ignored histogram and column stats
create table vivek_test as
select * from all_objects where rownum<=10000;

exec dbms_stats.gather_table_stats(user,'VIVEK_TEST',null,null,null,'FOR ALL COLUMNS SIZE 1 FOR COLUMNS TEMPORARY SIZE 10');
SQL> select temporary, count(*) from vivek_test group by temporary;

T   COUNT(*)
- ----------
Y         53
N       9947

select /*+ NO_HINT */ owner, object_name
from vivek_test
where temporary='Y';

column sql_text for a50 wrap

select sql_id, sql_text, buffer_gets, executions, version_count
from	v$sqlarea
where	sql_text like 'select /*+ NO_HINT%';

SQL_ID        SQL_TEXT                                           BUFFER_GETS EXECUTIONS VERSION_COUNT
------------- -------------------------------------------------- ----------- ---------- -------------
6xhaxy9xb82zn select /*+ NO_HINT */ owner, object_name from vive         132          1             1
              k_test where temporary='Y'


Elapsed: 00:00:00.23

select plan_table_output from table(dbms_xplan.display_cursor('6xhaxy9xb82zn'));

SQL_ID  6xhaxy9xb82zn, child number 0
-------------------------------------
select /*+ NO_HINT */ owner, object_name from vivek_test where
temporary='Y'

Plan hash value: 3275957777

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |    85 (100)|          |
|*  1 |  TABLE ACCESS FULL| VIVEK_TEST |    53 |  1431 |    85   (2)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"='Y')


19 rows selected.


select /*+ CARDINALITY(vivek_test,300) */ owner, object_name
from vivek_test
where temporary='Y';



select sql_id, sql_text, buffer_gets, executions, version_count
from	v$sqlarea
where	sql_text like 'select /*+ CARDINALITY(vivek_test,300%';

SQL_ID        SQL_TEXT                                           BUFFER_GETS EXECUTIONS VERSION_COUNT
------------- -------------------------------------------------- ----------- ---------- -------------
5xw7ckfhkz07p select /*+ CARDINALITY(vivek_test,300) */ owner, o         132          1             1
              bject_name from vivek_test where temporary='Y'


Elapsed: 00:00:00.02

select plan_table_output from table(dbms_xplan.display_cursor('5xw7ckfhkz07p'));

SQL_ID  5xw7ckfhkz07p, child number 0
-------------------------------------
select /*+ CARDINALITY(vivek_test,300) */ owner, object_name from
vivek_test where temporary='Y'

Plan hash value: 3275957777

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |    85 (100)|          |
|*  1 |  TABLE ACCESS FULL| VIVEK_TEST |   300 |  8100 |    85   (2)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"='Y')


19 rows selected.

Cost Based Optimization ! Query Performance Issue after creating a new Index…

Creating an index to optimize the performance of a query is a strategy commonly adopted by database administrators and application vendors. While it may optimize the performance of the query for which it was created, it can have an impact on other queries, which may start picking up the newly created index even though it is not optimal for them. This is especially true where the indexing strategy is not up to the mark. In my first AIOUG session at Hyderabad, I presented on proactive and reactive tuning. If optimization is ignored during the application development life cycle, then these efforts are done in a reactive mode, and this is when database administrators or developers end up creating unwanted or redundant indexes. With these redundant indexes in place, there is always a threat of a performance impact after creating a new index, and this is what transpired at one of my customer sites.

Recently, the application team at one of my customer sites introduced a new batch process that runs 2-3 times a day. After implementation, they found the performance of this process not up to the mark, and therefore created a new index to optimize one of the resource intensive queries of that process. After creating this new index, the performance of the batch process improved, but a few other queries started using the newly created index, with the sessions waiting on “read by other session”. Now, the customer was in a big dilemma: if they drop the new index, the batch process is impacted, and if they keep it, 2-3 other queries are impacted. When this was escalated to the application vendor, they washed their hands of it, stating this to be an Oracle issue. In my opinion, this is an application issue, and as you read further you will come to know the reason, but the issue can also be attributed to the Oracle optimizer. Why did the optimizer choose a newly created index when the performance of the query was optimal with an existing index ? Let us walk through the issue and the analysis in detail.

After implementing the new batch process, Query No.1 below was doing a full table scan of an 80 GB table. To optimize this query and improve the completion time of the batch process, an index BLINE_INDX was created. There was already an index, N_T_ADVICE_DTL_IDX3, with CASEID as the leading column, but the optimizer chose not to use it, possibly because this column is used in a range predicate. Instead of investigating the reason for not using this index, the application vendor reverted with a new index, BLINE_INDX, and this index was created. Once the new index was created, the execution plan of Query No.2, which had been optimally using N_T_ADVICE_DTL_IDX3, ceased to use that index and started using the new index BLINE_INDX. While there are several more indexes on this table, I have considered only the two indexes that are relevant for this post.

Query No.1 (BLINE PROCESS)
--------------------------

SELECT ROWID,CASEID,TO_CHAR(ADVICEDATE,'YYYYMMDD'),TXNADVICEID,CHARGEID,TXNID,
TXNTYPE,ADVICETYPE,ADVICEAMT,ORIGNALAMT,STATUS,DESCRIPTION,REMARKS,BPTYPE 
FROM 	flea.N_T_ADVICE_DTL 
WHERE 	caseid BETWEEN :1 AND :2 
AND 	ADVICEDATE BETWEEN :3 AND :4 
AND 	status='A' 
ORDER BY CASEID, ADVICEDATE,  TXNADVICEID;

Query No.2
----------

SELECT SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Index Details
-------------

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
BLINE_INDX                     CASEID                                       1 <== Local
                               ADVICEDATE                                   2
                               STATUS                                       3

N_T_ADVICE_DTL_IDX3            CASEID                                       1 <== Local
                               CHARGEID                                     2

My services were sought on this issue when the customer wanted an analysis of why Query 2 had started picking up the new index, while both the columns of the IDX3 index are present in Query 2 and it should therefore be an obvious choice for the optimizer. This seemed to be true, and so I started my analysis with a query against v$sql_bind_capture to get the actual production runtime value passed to the bind :1 by the application user. I then executed the query, passing this value via a bind variable b1. The runtime plan shows the BLINE_INDX index being used, which matches the production issue.

SQL>@bind_capture
new   2: where sql_id='b7nkg6m692jgq'

NAME                             POSITION DATATYPE_STRING      VALUE_STRING
------------------------------ ---------- -------------------- ------------------------------
:1                                      1 NUMBER               2568494

SQL>variable b1 number;
SQL>exec :b1:=2568494;

SELECT /*+ VIVEK_2568494 */ SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Plan Table
============
------------------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name               | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
------------------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |                    |       |       |     4 |           |       |       |
| 1   |  SORT AGGREGATE                      |                    |     1 |    16 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                    |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL     |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | BLINE_INDX         |    47 |       |     1 |  00:00:01 | KEY   | KEY   |
------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("CHARGEID"=:SYS_B_0)
4 - access("CASEID"=:B1 AND "STATUS"=:SYS_B_1)
4 - filter("STATUS"=:SYS_B_1)

This being a partitioned table, the value 2568494 falls under the high_value of Partition 3. Therefore, the statistics for this partition were referred to for the calculation and analysis. Shown below are the optimizer statistics for Partition 3 and the calculation the optimizer does to derive the cost of an index scan.


## Calculation for Partition 3
-------------------------------
TABLE_OWNER  PARTITION_NAME           HIGH_VALUE      NUM_ROWS     BLOCKS PARTITION_POSITION
------------ ------------------------ ------------- ---------- ---------- ------------------
FLEA         N_T_ADVICE_DTL_FO_P3     3000000         90588615    2019809                  3

Partition Column Stats
-----------------------
## Stats for Partition 3 based on Bind Capture : 2568494

COLUMN_NAME     NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
--------------- ------------ ---------- ---------- ---------------
CASEID                954287          0 1.0479E-06 NONE
CHARGEID                 105          0  .00952381 NONE
STATUS                     2          0         .5 NONE

Partition Index Stats
---------------------
INDEX_NAME                PARTITION_NAME                     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   AVDPK      ALBKP
------------------------- ------------------------------ ---------- ----------- ------------- ----------------- ------- ----------
BLINE_INDX                N_T_ADVICE_DTL_FO_P3                    3      335600       8818906          46069060       5          1

N_T_ADVICE_DTL_IDX3       N_T_ADVICE_DTL_FO_P3                    3      293115       1638635          48089155      29          1

Once the data was collected, it was time to compute the cost of both the indexes. The formula for computing the cost of an index scan and the subsequent table access by index step is :

Cost of Index Scan (A) = BLEVEL + CEIL(LEAF_BLOCKS * INDEX SELECTIVITY)
Cost of Table Access via Index (B) = CEIL(CLUSTERING_FACTOR * INDEX SELECTIVITY WITH FILTERS)
Total Cost = A + B

Analysis & Cost Calculation

In the case of the BLINE_INDX index, only the CASEID column is to be considered for the index selectivity. Revisit the execution plan above and check the Predicate Information: while both the CASEID and STATUS columns exist in this index, the access predicate shows that only CASEID is used for the index scan, and STATUS is applied as a filter predicate. Therefore, the NUM_DISTINCT of CASEID has to be used to derive the index selectivity, and the product of the NUM_DISTINCT of CASEID and STATUS to compute the index selectivity with filters.

Similarly, for N_T_ADVICE_DTL_IDX3, since both the columns present in the index are used in the query, the selectivity (both with and without filters) has to be computed from the product of the two columns present, i.e. CASEID and CHARGEID.

Note also that the parameter OPTIMIZER_INDEX_COST_ADJ has been set to 15, and this must be taken into consideration when computing the final cost.

Calculation For bline Index
---------------------------
select 3+ceil(335600*1/954287), ceil(46069060*1/954287*1/2) from dual;

Index Cost : 4
Table Access : 4+25 = 29

OICA = 29*15/100 = 4.35

Rounded off to 4
----------------

For IDX3 Index
--------------

select 3+ceil(293115*1/954287*1/105), ceil(48089155*1/954287*1/105) from dual;

Index Cost : 4
Table Access : 4+1 = 5

OICA : 5*15/100 = 0.75

Rounded Up to 1
---------------
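
The two calculations above can be reproduced in Python. This is a sketch of the cost formula using the partition statistics pasted earlier; index_path_cost is my own helper name:

```python
import math

# Cost of an index access path, per the formula in the post, scaled by
# OPTIMIZER_INDEX_COST_ADJ (set to 15 on this system).
def index_path_cost(blevel, leaf_blocks, cluf, ix_sel, ix_sel_with_filters, oica=15):
    a = blevel + math.ceil(leaf_blocks * ix_sel)   # cost of index scan
    b = math.ceil(cluf * ix_sel_with_filters)      # table access via index
    return (a + b) * oica / 100

# BLINE_INDX: access predicate on CASEID only; STATUS (NDV 2) is an index filter
bline = index_path_cost(3, 335600, 46069060, 1/954287, 1/954287 * 1/2)

# IDX3: CASEID (NDV 954287) and CHARGEID (NDV 105) both as equality predicates
idx3 = index_path_cost(3, 293115, 48089155, 1/954287/105, 1/954287/105)

print(bline, idx3)  # 4.35 0.75
```

By this per-column arithmetic, IDX3 at 0.75 is far cheaper than BLINE_INDX at 4.35, which is exactly why the optimizer's opposite choice demanded a look at the 10053 trace.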

Based on this calculation, the cost of IDX3 comes out cheaper and should have been the obvious choice. Why is the optimizer computing BLINE_INDX to be cheaper than IDX3 ? There is a better place to compare our analysis with the optimizer's computation: the 10053 trace file. For this, the query needs to be hard parsed; therefore, with a minor change in the query, it was re-executed with the same bind value and a trace file was generated.

Once the 10053 trace was generated, it was easier to get into the rationale behind the optimizer's choice of the BLINE_INDX index for this query. Pasted below is the relevant content of the 10053 trace file, including the values that will be used for the analysis and comparison.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL  Partition [2]
    #Rows: 90588615  #Blks:  2019809  AvgRowLen:  164.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 3  #LB: 335600  #DK: 8818906  LB/K: 1.00  DB/K: 5.00  CLUF: 46069060.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3  PARTITION [2]
    LVLS: 3  #LB: 293115  #DK: 1638635  LB/K: 1.00  DB/K: 29.00  CLUF: 48089155.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#28): CASEID(NUMBER)  Part#: 2
    AvgLen: 6.00 NDV: 954287 Nulls: 0 Density: 1.0479e-06 Min: 2000000 Max: 2999998
  Column (#3): CHARGEID(NUMBER)  Part#: 2
    AvgLen: 4.00 NDV: 105 Nulls: 0 Density: 0.0095238 Min: 2 Max: 1000018
  Column (#17): STATUS(VARCHAR2)  Part#: 2
    AvgLen: 2.00 NDV: 2 Nulls: 0 Density: 0.5
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL
    Card: Original: 90588615  Rounded: 1  Computed: 0.45  Non Adjusted: 0.45
  Access Path: TableScan
    Cost:  441312.32  Resp: 441312.32  Degree: 0
      Cost_io: 425225.00  Cost_cpu: 81419528497
      Resp_io: 425225.00  Resp_cpu: 81419528497
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 29.00  resc_cpu: 195479
    ix_sel: 1.0703e-06  ix_sel_with_filters: 5.3516e-07
    Cost: 4.36  Resp: 4.36  Degree: 1
  Access Path: index (AllEqRange)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 34.00  resc_cpu: 216124
    ix_sel: 6.1026e-07  ix_sel_with_filters: 6.1026e-07
    Cost: 5.11  Resp: 5.11  Degree: 1
  Best:: AccessPath: IndexRange  Index: BLINE_INDX
         Cost: 4.36  Degree: 1  Resp: 4.36  Card: 0.45  Bytes: 0

The Cost of BLINE_INDX (resc_io) matches our calculation: RESC_IO for BLINE_INDX is 29 and, with OICA set to 15, the final cost is 4.36, rounded off to 4. Surprisingly, the cost of IDX3 (resc_io), as computed by the optimizer, is 34, against our calculation of 5. The optimizer always chooses the least-cost plan and, in this case, 29 against 34 (4.36 against 5.11) is cheaper, so the optimizer opted for the BLINE_INDX index. Now, the action plan was to check the calculation made by the optimizer that derives the cost of the IDX3 index as 34. The BLEVEL and CLUSTERING_FACTOR of the IDX3 index were unchanged, so I suspected the selectivity part.

The ix_sel (index selectivity) and ix_sel_with_filters (index selectivity with filters) for the IDX3 index are both 6.1026e-07, whereas the selectivity, as derived by me, was 1/NDV * 1/NDV (for the two columns). I decided to do some reverse engineering and calculated 1/6.1026e-07, which is 1638645.82. Looking for a value close to 1638645.82, I observed that the DISTINCT_KEYS of IDX3 is 1638635, which is very close. Now, if you compute the index selectivity (and the selectivity with filters) as 1/1638635, it exactly matches the ix_sel and ix_sel_with_filters values in the 10053 trace. Further, note the line "Access Path: index (AllEqRange)" above "Index: N_T_ADVICE_DTL_IDX3" in the trace file. AllEqRange means all the columns of the index have been used as equality predicates; therefore, instead of multiplying the individual selectivities of the columns, the optimizer used the DISTINCT_KEYS of the index, and this flawed the computation.
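
Both resc_io figures can be reproduced with the commonly documented index range scan approximation, blevel + ceil(ix_sel x leaf_blocks) + ceil(ix_sel_with_filters x clustering_factor). A minimal sketch of the arithmetic, using the statistics from the trace above (this is the textbook CBO approximation, not a reproduction of the actual optimizer code):

```python
import math

def index_range_scan_io(blevel, leaf_blocks, clust_factor, ix_sel, ix_sel_f):
    """Classic CBO I/O cost approximation for an index range scan."""
    return blevel + math.ceil(ix_sel * leaf_blocks) + math.ceil(ix_sel_f * clust_factor)

# BLINE_INDX, with the selectivities reported in the trace
bline = index_range_scan_io(3, 335600, 46069060, 1.0703e-06, 5.3516e-07)

# IDX3 under AllEqRange: selectivity taken as 1/DISTINCT_KEYS
sel_dk = 1 / 1638635                       # = 6.1026e-07, matching the trace
idx3_alleq = index_range_scan_io(3, 293115, 48089155, sel_dk, sel_dk)

# IDX3 as we expected: product of per-column selectivities (1/NDV each)
sel_cols = (1 / 954287) * (1 / 105)
idx3_expected = index_range_scan_io(3, 293115, 48089155, sel_cols, sel_cols)

print(bline, idx3_alleq, idx3_expected)    # → 29 34 5
```

The formula reproduces resc_io of 29 for BLINE_INDX, the optimizer's 34 for IDX3 under AllEqRange, and our hand-calculated 5 when per-column selectivities are multiplied instead.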

By this time, it was clear and evident that the flaw in the calculation was due to AllEqRange, so it was time to implement a change that would resolve the issue. There were two possible solutions:

  1. Modify the Clustering Factor for the problematic partitions.
  2. Add a new Column to IDX3 Index that is not used in the Problematic Query (query no.2)

I computed the cost for each of the partitions and observed that the clustering_factor for almost 70% of the index partitions would have to be modified, which means re-calculating it every time the stats are gathered. Locking the stats after modifying the clustering_factor would have helped, but the risk of an optimizer behaviour change impacting other queries was top of mind, so this option was not implemented. I did test it for one partition: after modifying the clustering_factor, the optimizer started picking the IDX3 index.

The other option was to modify the index by adding a new column. Again, to eliminate any potential risk from adding a third column, the option thought of and implemented was adding a column with a single distinct value. Therefore, a column CURRENCYID, with a single distinct value, was appended to this index and tested on UAT. Before making this change on UAT, we first made sure the issue was reproduced there. Once we were sure the same issue existed on the UAT environment as well, we dropped and recreated the index on (CASEID, CHARGEID, CURRENCYID), and the change had a dramatic impact. With this change, AllEqRange changed to RangeScan and the optimizer computed the cost from the individual selectivities of the columns rather than DISTINCT_KEYS from dba_indexes. Pasted below is the relevant portion of the 10053 trace generated on UAT after the index change.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL Partition [2]
    #Rows: 25494530  #Blks:  10874  AvgRowLen:  165.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 2  #LB: 92720  #DK: 8843108  LB/K: 1.00  DB/K: 1.00  CLUF: 11966280.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3 18 PARTITION [2]
    LVLS: 2  #LB: 86050  #DK: 2764430  LB/K: 1.00  DB/K: 4.00  CLUF: 12732735.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#28): CASEID(NUMBER)
    AvgLen: 6.00 NDV: 406027 Nulls: 240 Density: 2.4629e-06 Min: 0 Max: 51734139
  Column (#3): CHARGEID(NUMBER)
    AvgLen: 4.00 NDV: 165 Nulls: 0 Density: 0.0060606 Min: 2 Max: 10000048
  Column (#17): STATUS(VARCHAR2)
    AvgLen: 2.00 NDV: 3 Nulls: 0 Density: 0.33333
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL     
    Card: Original: 25494530  Rounded: 1  Computed: 0.13  Non Adjusted: 0.13
  Access Path: TableScan
    Cost:  2544.46  Resp: 2544.46  Degree: 0
      Cost_io: 2380.00  Cost_cpu: 414356148
      Resp_io: 2380.00  Resp_cpu: 414356148
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 14.00  resc_cpu: 127837
    ix_sel: 2.5335e-06  ix_sel_with_filters: 8.4450e-07
    Cost: 2.11  Resp: 2.11  Degree: 1
  Access Path: index (RangeScan)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 4.00  resc_cpu: 29395
    ix_sel: 1.4926e-08  ix_sel_with_filters: 1.4926e-08
    Cost: 1.00  Resp: 1.00  Degree: 1
  Best:: AccessPath: IndexRange  Index: N_T_ADVICE_DTL_IDX3
         Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.13  Bytes: 0

 
============
Plan Table
============
-----------------------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name                    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-----------------------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |                         |       |       |     1 |           |       |       |
| 1   |  SORT AGGREGATE                      |                         |     1 |    15 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                         |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL          |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | N_T_ADVICE_DTL_IDX3     |     1 |       |     1 |  00:00:01 | KEY   | KEY   |
-----------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("STATUS"=:SYS_B_1)
4 - access("CASEID"=:B1 AND "CHARGEID"=:SYS_B_0)
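
The post-change figures cross-check the same way. With a plain RangeScan, the selectivity is the product of the per-column selectivities (the appended CURRENCYID column has a single distinct value, so it contributes a factor of 1), and the resc_io of 4 follows from the same range scan approximation. A sketch using the statistics from the UAT trace above:

```python
import math

# Per-column selectivities from the UAT trace (CASEID, CHARGEID; CURRENCYID has NDV 1)
ix_sel = (1 / 406027) * (1 / 165) * 1.0
print(ix_sel)                              # ≈ 1.4926e-08, as reported in the trace

# I/O cost of the range scan on the rebuilt N_T_ADVICE_DTL_IDX3:
# blevel + ceil(ix_sel * leaf_blocks) + ceil(ix_sel * clustering_factor)
resc_io = 2 + math.ceil(ix_sel * 86050) + math.ceil(ix_sel * 12732735)
print(resc_io)                             # → 4
```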

The change is still pending implementation on production and will be done after month end. Based on the analysis, the issue should be resolved once this change is in place. If you have come across any such issues, please do share your findings. For the other two queries, I do have an interesting observation about the Cost Based Optimizer but, unfortunately, have not collected the artifacts and data to post on the blog. While the change in the BLINE_INDX index suggested by me has resolved those issues as well, for posting purposes I am doing further study and will share the findings as and when I am ready.

Benefits of SQL Baseline ! Yet another performance issue post 10g Upgrade

First, let me apologize for a brief pause from blogging. I was out of action for most of these days and, as I always maintain, I write only when I am active and have something to share with the Oracle community.

A few years back, I had suggested that a customer maintain an SQL baseline that collects the execution plan and other details like SQL_TEXT. While the customer had scheduled a job that periodically collects this information into some tables, the collected information was never used; in fact, a need never arose to look into it. Finally, after this database was upgraded to 10g, severe performance degradation was reported and the information collected in the SQL baseline came in handy. Usually, before upgrading a database to 10g, I strongly recommend performance testing, but in this case the customer went ahead with the production upgrade with minimal testing.

Another challenge in this Upgrade was the Stats gathering policy. During the pre-upgrade discussion, the Customer mentioned :

  1. Stats for very few tables are generated periodically,
  2. Stats for most of the tables are stale and if gathered, impacts the performance,
  3. Stats for many of the tables are not collected at all.

Performance issues, given the very minimal testing and the optimizer stats issues, were inevitable, and the customer was well aware of this fact. However, post-upgrade, the impact was unexpectedly severe and the database was almost unworkable. As a result, and as expected in firefighting mode, various optimization techniques were ineffectively implemented. At this point, our investigation started.

As mentioned earlier, the challenge here was the unavailability of test results from the 9i instance. There were significant waits on CBC latches and "read by other session", and one common table used in the impacted queries (almost 90% of them) was TH_DBF. During my investigation, I concentrated on one top query that was executed multiple times and was waiting on CBC latches most of the time. The text of the query is mentioned below, along with the execution plan from 10g.

select T1.TIMESTAMP, T1.M_IDENTITY, T1.M_TRN_GRP, T1.M_TRN_TYPO, T1.M_BINTERNAL, T1.M_BENTITY, 
T1.M_SINTERNAL, T1.M_SENTITY, T1.M_TRN_STATUS, T1.M_TRN_EXP, T1.M_VAL_STATUS, 
T1.M_MOP_LAST,T2.TIMESTAMP, T2.M_IDENTITY, T2.M_NB, T2.M_LINE, T2.M_SRC_PFOLIO, T2.M_DST_PFOLIO, 
T2.M_CNTRP, T2.M_FEE, T2.M_CUR, T2.M_CODE, T2.M_AUTF, T2.M_AUTS, T2.M_TYPE, 
T2.M_VALUE_TYPE, T2.M_VALUE_DATE, T2.M_AVP 
from 	TH_DBF T1,
	TB_DBF T2  
where 	(T1.M_NB   =  T2.M_NB ) 
and 	((((((((T1.M_BINTERNAL like 'N%') or (T1.M_SINTERNAL like 'N%')) 
and 	(T1.M_TRN_EXP>=to_date('20100105','YYYYMMDD'))) 
and 	(((T1.M_VAL_STATUS<>'MVAL') and (T1.M_VAL_STATUS<>'MCON')) 
and 	(T1.M_VAL_STATUS<>'MCAP'))) and (T1.M_MOP_LAST<>3)) 
and 	(T1.M_TRN_STATUS<>'DEAD')) and (T1.M_TRN_GRP like 'FXD%')) 
or 	((((T1.M_BENTITY<>'INDIA MUM') and (T1.M_SENTITY<>'INDIA MUM')) 
and 	(instr(T1.M_TRN_TYPO, 'PAYCUR') >0 )) 
and 	(T1.M_TRN_EXP>to_date('20100105','YYYYMMDD'))));

Plan hash value: 2180463251

-----------------------------------------------------------------------------
| Id  | Operation          | Name           | Rows  | Bytes |TempSpc| Cost  |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                |       |       |       | 52386 |
|*  1 |  HASH JOIN         |                |   367K|    50M|  9512K| 52386 |
|*  2 |   TABLE ACCESS FULL| TH_DBF         | 91836 |  8430K|       | 46682 |
|   3 |   TABLE ACCESS FULL| TB_DBF         |  2541K|   121M|       |  3161 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."M_NB"="T2"."M_NB")
   2 - filter(((("T1"."M_BINTERNAL" LIKE :SYS_B_00 OR
              "T1"."M_SINTERNAL" LIKE :SYS_B_01) AND "T1"."M_TRN_EXP">=:SYS_B_02 AND
              "T1"."M_VAL_STATUS":SYS_B_03 AND "T1"."M_VAL_STATUS":SYS_B_04 AND
              "T1"."M_VAL_STATUS":SYS_B_05 AND "T1"."M_MOP_LAST":SYS_B_06 AND
              "T1"."M_TRN_STATUS":SYS_B_07 AND "T1"."M_TRN_GRP" LIKE :SYS_B_08) OR
              ("T1"."M_BENTITY":SYS_B_09 AND "T1"."M_SENTITY":SYS_B_10 AND
              INSTR("T1"."M_TRN_TYPO",:SYS_B_11)>:SYS_B_12 AND
              "T1"."M_TRN_EXP">:SYS_B_13)))

SQL Baseline

Fortunately, the customer had maintained an SQL baseline for all the queries, and the good plan from 9i was found in these baseline tables. In 9i (thanks to Thomas Kyte for the approach), the following tables are critical and should be maintained on a regular basis:

create table sqlarea_baseline as select hash_value, sql_text, executions, buffer_gets from v$sqlarea;
create index sqlarea_baseline_idx on sqlarea_baseline(hash_value);

insert into sqlarea_baseline
select hash_value, sql_text, executions, buffer_gets from v$sqlarea a
where not exists
(select 'x' from sqlarea_baseline b where b.hash_value = a.hash_value);

create table sqlplan_baseline as select * from v$sql_plan;
create index sqlplan_baseline_idx on sqlplan_baseline(hash_value);

insert into sqlplan_baseline
select * from v$sql_plan a
where not exists
(select 'x' from sqlplan_baseline b where b.hash_value = a.hash_value);

The data collected using the above scripts came in handy and we could get the plan of this SQL from 9i. The 9i plan is pasted below:

9i Plan
-------
--------------------------------------------------------------------------------
| Operation                      | PHV/Object Name     |  Rows | Bytes|   Cost |
--------------------------------------------------------------------------------
|SELECT STATEMENT                |----- 4101123407 ----|       |      |  48785 |
| NESTED LOOPS                   |                     |    28K|    3M|  48785 |
|  TABLE ACCESS FULL             |TH_DBF               |     7K|  643K|  46682 |
|  INDEX RANGE SCAN              |TB_ND1               |     4 |      |      2 |
--------------------------------------------------------------------------------

In both cases, the optimizer has opted for a full table scan of the TH_DBF table. The difference is a Nested Loop in 9i against a Hash Join (and an FTS of the second table) in 10g. Now the investigation was restricted to checking for the change that triggered this hash join. If you notice the number of rows, as calculated by the optimizer, returned by the full table scan of TH_DBF, there is a vast difference in the computed cardinality between the two versions: the 9i optimizer computes 7K rows, whereas it is 91K in 10g. The optimizer's choice of a hash join is well justified if the rows returned by the initial step are huge. Just to check whether the plan changes by passing 7K as the cardinality of the driving step, I executed the query with a CARDINALITY(t1,7000) hint and, as expected, the plan from 9i was restored, i.e. it changed from a hash join to a nested loop, and the response time of the query improved as well.

Now that the problem was clear and known, a detailed investigation into the optimizer calculation was required. Modifying the query with a CARDINALITY hint was not possible, as it required a change at the application level and could have taken time. Modifying the statistics was also not feasible, as this table is the heart of the application and modified stats might impact other queries. Therefore, the only choice was to investigate further and fix the underlying cause.

INVESTIGATION

With the analysis mentioned above, it was evident that the optimizer was miscomputing the expected cardinality of the row source generated by the driving step. Therefore, the columns of the TH_DBF table used in the query were looked at, and one such column is M_TRN_EXP, a date column. Whenever this query is executed, the value passed at runtime for this column is SYSDATE, and the number of rows the query actually returns is fewer than 5000. This was validated when executing the query with the CARDINALITY hint. With 5000 actual rows against the optimizer's 91K prediction, I suspected something amiss with the column-level statistics and therefore got into those details.

Using the dbms_stats.convert_raw_value procedure, I could convert the raw values stored in LOW_VALUE and HIGH_VALUE of dba_tab_columns. Surprisingly, the result after converting them is as under (the dates are in YYYYMMDD format):

COLUMN_NAME                    NUM_DISTINCT GET_DATE GET_DATE
------------------------------ ------------ -------- --------
M_TRN_EXP                              4355 19810101 20660901
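
For reference, dbms_stats.convert_raw_value simply decodes Oracle's internal 7-byte DATE encoding (century+100, year-in-century+100, month, day, hour+1, minute+1, second+1). A sketch of the same decoding in Python, using illustrative raw bytes for the two boundary dates above (the actual LOW_VALUE/HIGH_VALUE bytes from this system were not preserved):

```python
def decode_oracle_date(raw: bytes) -> str:
    """Decode Oracle's internal 7-byte DATE format to YYYYMMDD."""
    century, year = raw[0] - 100, raw[1] - 100   # excess-100 encoding
    month, day = raw[2], raw[3]                  # stored as-is
    return f"{century * 100 + year:04d}{month:02d}{day:02d}"

# 1981-01-01 and 2066-09-01 in the internal encoding
print(decode_oracle_date(bytes([119, 181, 1, 1, 1, 1, 1])))   # → 19810101
print(decode_oracle_date(bytes([120, 166, 9, 1, 1, 1, 1])))   # → 20660901
```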

With M_TRN_EXP>sysdate, the optimizer assumes that it has to fetch the data of 56 years out of 85 years, which is nearly 65%, and this calculation is enough to misguide the optimizer's computation. I have always maintained that the optimizer is a piece of code and it has to come out with a result based on the input given by us in the form of statistics. There were only 4 rows with dates out in 2066, and the optimizer was not aware of this fact.
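
The ~65% figure follows from the linear range selectivity the optimizer applies to an open-ended predicate on a column without a histogram: (high_value - limit) / (high_value - low_value). A quick check with the converted boundary dates, assuming a run date of 2010-01-05 (the date seen in the query literals):

```python
from datetime import date

low, high = date(1981, 1, 1), date(2066, 9, 1)   # from LOW_VALUE / HIGH_VALUE
run_date = date(2010, 1, 5)                      # value bound to M_TRN_EXP at runtime

# Linear range selectivity for M_TRN_EXP > run_date
selectivity = (high - run_date).days / (high - low).days
print(f"{selectivity:.2%}")                      # ≈ 66%, i.e. ~56 of 85 years
```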

SOLUTION

Now, it was quite evident that this was purely an application issue, caused by the data entered into the application table. The customer requested an immediate fix, and the solutions suggested were:

  1. Rectifying the data : Not possible immediately
  2. Modifying Column Level Statistics (High Value)
  3. Generating Histograms on this column

Just to make sure that the column-level statistics, skewed by the future-dated data, were the problem, we modified the statistics for the M_TRN_EXP column and changed the HIGH_VALUE to 20111231, and the query plan changed for good (similar to 9i). Therefore, the better approach discussed was to generate a histogram on this column. Once the histogram was generated, the query response time improved and the 9i plan was restored. With this change, the overall performance of the database improved drastically, as most of the queries executed are against this table and all have similar logic on M_TRN_EXP.
