Cost Based Optimization ! Query Performance Issue after creating a new Index…

Creating an index to optimize the performance of a query is a strategy commonly adopted by database administrators and application vendors. While the new index may optimize the query it was created for, it can also impact other queries: they may start picking up the newly created index even though it is not optimal for them. This is especially true where the indexing strategy is not up to the mark. In my first AIOUG session at Hyderabad, I presented on proactive and reactive tuning. If optimization is ignored during the application development life cycle, these efforts end up being done in reactive mode, and this is when database administrators or developers create unwanted or redundant indexes. With redundant indexes in place, there is always a risk of a performance impact after creating a new index, and this is exactly what transpired at one of my customer sites.

Recently, the application team at one of my customer sites introduced a new batch process that runs 2-3 times a day. After implementation, they found the performance of this process was not up to the mark and therefore created a new index to optimize one of the resource-intensive queries of that process. After this index was created, the performance of the batch process improved, but a few other queries started using the newly created index, with sessions waiting on “read by other session”. The customer was now in a dilemma: drop the new index and the batch process is impacted; keep it and 2-3 other queries are impacted. When this was escalated to the application vendor, they washed their hands of it, stating it to be an Oracle issue. In my opinion this is an application issue, and as you read further you will see why, but the behaviour can also be attributed to the Oracle optimizer. Why did the optimizer choose a newly created index when the performance of the query was optimal with the existing index? Let us walk through the issue and the analysis in detail.

After the new batch process was implemented, Query No.1 below was doing a full table scan of an 80 GB table. To optimize this query and improve the completion time of the batch process, an index BLINE_INDX was created. There was already an index, N_T_ADVICE_DTL_IDX3, with CASEID as its leading column, but the optimizer chose not to use it, possibly because this column is used in a range predicate. Instead of investigating the reason for this, the application vendor reverted with a new index, BLINE_INDX, which was duly created. Once this new index was in place, Query No.2, which had been optimally using N_T_ADVICE_DTL_IDX3, ceased to use that index and started using the new index BLINE_INDX. While there are several more indexes on this table, I have considered only the two that are relevant for this post.


Query No.1 (only the relevant predicates are shown)

WHERE	caseid BETWEEN :1 AND :2
AND 	status='A' 

Query No.2

SELECT SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Index Details

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
BLINE_INDX                     CASEID                                       1 <== Local
                               ADVICEDATE                                   2
                               STATUS                                       3

N_T_ADVICE_DTL_IDX3            CASEID                                       1 <== Local
                               CHARGEID                                     2

My services were sought when the customer wanted an analysis of why Query 2 had started picking up the new index when both columns of the IDX3 index are present in Query 2, which should make it the obvious choice for the optimizer. This seemed true, so I started my analysis with a query against v$sql_bind_capture to get the actual production runtime value passed to bind :1 by the application. I then passed this value to a bind variable b1 and re-executed the query. The runtime plan shows the BLINE_INDX index being used, which matches the production issue.

new   2: where sql_id='b7nkg6m692jgq'

NAME                             POSITION DATATYPE_STRING      VALUE_STRING
------------------------------ ---------- -------------------- ------------------------------
:1                                      1 NUMBER               2568494

SQL>variable b1 number;
SQL>exec :b1:=2568494;

SELECT /*+ VIVEK_2568494 */ SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Plan Table
| Id  | Operation                            | Name               | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
| 0   | SELECT STATEMENT                     |                    |       |       |     4 |           |       |       |
| 1   |  SORT AGGREGATE                      |                    |     1 |    16 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                    |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL     |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | BLINE_INDX         |    47 |       |     1 |  00:00:01 | KEY   | KEY   |
Predicate Information:
3 - filter("CHARGEID"=:SYS_B_0)
4 - access("CASEID"=:B1 AND "STATUS"=:SYS_B_1)
4 - filter("STATUS"=:SYS_B_1)

This being a partitioned table, the value 2568494 falls under the high_value of partition 3 of the table. Therefore, the statistics for this partition were used for the calculation and analysis. Shown below are the optimizer statistics for partition 3 and the calculation the optimizer does to derive the cost of an index scan.

## Calculation for Partition 3

OWNER        PARTITION_NAME           SAMPLE_SIZE     NUM_ROWS     BLOCKS PARTITION_POSITION
------------ ------------------------ ------------- ---------- ---------- ------------------
FLEA         N_T_ADVICE_DTL_FO_P3     3000000         90588615    2019809                  3

Partition Column Stats
## Stats for Partition 3 based on Bind Capture : 2568494

COLUMN_NAME     NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
--------------- ------------ ---------- ---------- ---------------
CASEID                954287          0 1.0479E-06 NONE
CHARGEID                 105          0  .00952381 NONE
STATUS                     2          0         .5 NONE

Partition Index Stats

INDEX_NAME                PARTITION_NAME                     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR    DB/K       LB/K
------------------------- ------------------------------ ---------- ----------- ------------- ----------------- ------- ----------
BLINE_INDX                N_T_ADVICE_DTL_FO_P3                    3      335600       8818906          46069060       5          1

N_T_ADVICE_DTL_IDX3       N_T_ADVICE_DTL_FO_P3                    3      293115       1638635          48089155      29          1

Once the data was collected, it was time to compute the cost of both the indexes. The formula for computing the cost of the index scan step (A) and the table access by index rowid step (B) is:

Total Cost = A + B

where A = BLEVEL + ceil(LEAF_BLOCKS * index selectivity)
and   B = ceil(CLUSTERING_FACTOR * index selectivity with filters)
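The A + B calculation can be sketched as a small Python helper. This is a rough illustration only: the function name and parameter names are mine, and the optimizer's CPU-cost component is ignored here.

```python
import math

def index_access_cost(blevel, leaf_blocks, ix_sel,
                      clustering_factor, ix_sel_with_filters,
                      optimizer_index_cost_adj=100):
    # A: walk the index - branch levels plus the leaf blocks
    #    matched by the access predicates
    a = blevel + math.ceil(leaf_blocks * ix_sel)
    # B: visit the table - clustering factor scaled by the
    #    selectivity after filter predicates are applied
    b = math.ceil(clustering_factor * ix_sel_with_filters)
    # OPTIMIZER_INDEX_COST_ADJ scales the final I/O cost
    return (a + b) * optimizer_index_cost_adj / 100
```

With OPTIMIZER_INDEX_COST_ADJ left at its default of 100, the function returns the plain A + B figure.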

Analysis & Cost Calculation

In the case of the BLINE_INDX index, only the CASEID column is to be considered for the index selectivity. Revisit the execution plan above and check the Predicate Information: while both CASEID and STATUS exist in this index, the access predicate shows that only CASEID is used for the index scan, and STATUS is applied as a filter predicate. Therefore, 1/NUM_DISTINCT of CASEID is used to derive the index selectivity, and the product of the selectivities of CASEID and STATUS is used to compute the index selectivity with filters.

Similarly, for N_T_ADVICE_DTL_IDX3, since both columns of the index are used in the query, both selectivities (index, and with filters) are computed as the product of the individual selectivities of CASEID and CHARGEID.

One more point to note: the parameter OPTIMIZER_INDEX_COST_ADJ has been set to 15 on this database, and this must be factored in when computing the final cost.

Calculation for BLINE_INDX Index
select 3+ceil(335600*1/954287), ceil(46069060*1/954287*1/2) from dual;

Index Cost : 4
Table Access : 4+25 = 29

OICA = 29*15/100 = 4.35

Rounded off to 4

For IDX3 Index

select 3+ceil(293115*1/954287*1/105), ceil(48089155*1/954287*1/105) from dual;

Index Cost : 4
Table Access : 4+1 = 5

OICA : 5*15/100 = 0.75

Rounded Up to 1
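Both hand calculations can be reproduced with a short Python sketch (the variable names are mine; the statistics are the partition-3 figures quoted above):

```python
import math

# Partition-3 statistics quoted earlier in the post
BLEVEL, OICA = 3, 15
LB_BLINE, CLUF_BLINE = 335600, 46069060   # BLINE_INDX
LB_IDX3, CLUF_IDX3 = 293115, 48089155     # N_T_ADVICE_DTL_IDX3
NDV_CASEID, NDV_CHARGEID, NDV_STATUS = 954287, 105, 2

# BLINE_INDX: CASEID is the only access predicate; STATUS is a filter
sel_bline = 1 / NDV_CASEID
sel_bline_filt = sel_bline / NDV_STATUS
cost_bline = (BLEVEL + math.ceil(LB_BLINE * sel_bline)      # 4
              + math.ceil(CLUF_BLINE * sel_bline_filt))     # + 25 = 29

# IDX3: CASEID and CHARGEID are both access predicates
sel_idx3 = 1 / NDV_CASEID / NDV_CHARGEID
cost_idx3 = (BLEVEL + math.ceil(LB_IDX3 * sel_idx3)         # 4
             + math.ceil(CLUF_IDX3 * sel_idx3))             # + 1 = 5

print(cost_bline, round(cost_bline * OICA / 100, 2))  # 29 4.35
print(cost_idx3, round(cost_idx3 * OICA / 100, 2))    # 5 0.75
```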

Based on this calculation, IDX3 is cheaper and should have been the obvious choice. So why is the optimizer computing BLINE_INDX to be cheaper than IDX3? There is a better place to compare our analysis with the optimizer's computation: the 10053 trace file. For this, the query needs to be hard parsed, so with a minor change to the query text it was re-executed for the same bind value and a trace file was generated.

Once the 10053 trace was generated, it was easy to get to the rationale behind the optimizer's choice of BLINE_INDX for this query. Pasted below is the relevant content of the trace file, showing the values used for the analysis and comparison.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL  Partition [2]
    #Rows: 90588615  #Blks:  2019809  AvgRowLen:  164.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 3  #LB: 335600  #DK: 8818906  LB/K: 1.00  DB/K: 5.00  CLUF: 46069060.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3  PARTITION [2]
    LVLS: 3  #LB: 293115  #DK: 1638635  LB/K: 1.00  DB/K: 29.00  CLUF: 48089155.00
  Column (#28): CASEID(NUMBER)  Part#: 2
    AvgLen: 6.00 NDV: 954287 Nulls: 0 Density: 1.0479e-06 Min: 2000000 Max: 2999998
  Column (#3): CHARGEID(NUMBER)  Part#: 2
    AvgLen: 4.00 NDV: 105 Nulls: 0 Density: 0.0095238 Min: 2 Max: 1000018
  Column (#17): STATUS(VARCHAR2)  Part#: 2
    AvgLen: 2.00 NDV: 2 Nulls: 0 Density: 0.5
    Card: Original: 90588615  Rounded: 1  Computed: 0.45  Non Adjusted: 0.45
  Access Path: TableScan
    Cost:  441312.32  Resp: 441312.32  Degree: 0
      Cost_io: 425225.00  Cost_cpu: 81419528497
      Resp_io: 425225.00  Resp_cpu: 81419528497
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 29.00  resc_cpu: 195479
    ix_sel: 1.0703e-06  ix_sel_with_filters: 5.3516e-07
    Cost: 4.36  Resp: 4.36  Degree: 1
  Access Path: index (AllEqRange)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 34.00  resc_cpu: 216124
    ix_sel: 6.1026e-07  ix_sel_with_filters: 6.1026e-07
    Cost: 5.11  Resp: 5.11  Degree: 1
  Best:: AccessPath: IndexRange  Index: BLINE_INDX
         Cost: 4.36  Degree: 1  Resp: 4.36  Card: 0.45  Bytes: 0

The cost of BLINE_INDX (resc_io) matches our calculation: resc_io is 29, and with OICA set to 15 the final cost is 4.36, rounded off to 4. Surprisingly, the cost of IDX3 (resc_io) as per the optimizer is 34, against our calculated 5. The optimizer always chooses the least-costed plan, and here 29 beats 34 (4.36 against 5.11), so it opted for the BLINE_INDX index.

The next step was to check the calculation by which the optimizer derives a cost of 34 for IDX3. BLEVEL and CLUSTERING_FACTOR of the IDX3 index are unchanged, so I suspected the selectivity part. The ix_sel (index selectivity) and ix_sel_with_filters (index selectivity with filters) for IDX3 are both 6.1026e-07, whereas the selectivity as derived by me was 1/NDV * 1/NDV for the two columns. I decided to do some reverse engineering and calculated 1/6.1026e-07, which is 1638645.82. Looking for a value close to 1638645.82, I observed that DISTINCT_KEYS of IDX3 is 1638635, which is very close. If you recompute the index selectivity and the selectivity with filters as 1/1638635, they exactly match the ix_sel and ix_sel_with_filters values in the 10053 trace.

Further, note the line "Access Path: index (AllEqRange)" above "Index: N_T_ADVICE_DTL_IDX3" in the trace file. AllEqRange means that all the columns of the index have been used as equality predicates; in this case, instead of using the individual selectivity of each column, the optimizer uses the DISTINCT_KEYS of the index, and this is what skewed the computation.
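The difference between the two selectivity models can be checked numerically. This sketch (variable names are mine) reproduces both the optimizer's resc_io of 34 (AllEqRange, using 1/DISTINCT_KEYS) and our hand-calculated 5 (product of the per-column selectivities):

```python
import math

# IDX3 partition-3 statistics from the 10053 trace
BLEVEL, LEAF_BLOCKS, CLUF = 3, 293115, 48089155
DISTINCT_KEYS = 1638635
NDV_CASEID, NDV_CHARGEID = 954287, 105

# AllEqRange: selectivity taken from the index DISTINCT_KEYS
sel_alleq = 1 / DISTINCT_KEYS            # ~6.1026e-07, matches ix_sel
cost_alleq = (BLEVEL + math.ceil(LEAF_BLOCKS * sel_alleq)
              + math.ceil(CLUF * sel_alleq))   # 3 + 1 + 30 = 34

# Per-column model: product of the individual column selectivities
sel_cols = 1 / NDV_CASEID / NDV_CHARGEID
cost_cols = (BLEVEL + math.ceil(LEAF_BLOCKS * sel_cols)
             + math.ceil(CLUF * sel_cols))     # 3 + 1 + 1 = 5

print(cost_alleq, cost_cols)  # 34 5
```

The resulting 34 versus 29 comparison (before OICA scaling) is what tipped the optimizer towards BLINE_INDX.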

By now it was clear and evident that the flaw in the calculation was due to AllEqRange, so it was time to implement a change to resolve the issue. There were two possible solutions:

  1. Modify the Clustering Factor for the problematic partitions.
  2. Add a new Column to IDX3 Index that is not used in the Problematic Query (query no.2)

I computed the cost for each of the partitions and observed that the clustering_factor of almost 70% of the index partitions would have to be modified, which means re-calculating it every time statistics are gathered. Locking the statistics after modifying the clustering_factor would have helped, but the risk of an optimizer behaviour change impacting other queries was top of mind, so this option was not implemented. I did test it for one partition: after modifying the clustering_factor, the optimizer started picking up the IDX3 index.

The other option was to modify the index by adding a new column. To eliminate any potential risk of adding a third column, the option chosen was to add a column with a single distinct value. Therefore, a column CURRENCYID, which has a single distinct value, was appended to this index and tested on UAT. Before making the change on UAT, we first made sure the issue was reproduced there. Once we were sure a similar issue existed in the UAT environment, we dropped and recreated the index on (CASEID, CHARGEID, CURRENCYID), and the change had a dramatic impact: AllEqRange changed to RangeScan, and the optimizer computed the cost from the individual selectivities of the two predicated columns rather than DISTINCT_KEYS from dba_indexes. Pasted below is the relevant portion of the 10053 trace generated on UAT after the index change.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL Partition [2]
    #Rows: 25494530  #Blks:  10874  AvgRowLen:  165.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 2  #LB: 92720  #DK: 8843108  LB/K: 1.00  DB/K: 1.00  CLUF: 11966280.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3 18 PARTITION [2]
    LVLS: 2  #LB: 86050  #DK: 2764430  LB/K: 1.00  DB/K: 4.00  CLUF: 12732735.00
  Column (#28): CASEID(NUMBER)
    AvgLen: 6.00 NDV: 406027 Nulls: 240 Density: 2.4629e-06 Min: 0 Max: 51734139
  Column (#3): CHARGEID(NUMBER)
    AvgLen: 4.00 NDV: 165 Nulls: 0 Density: 0.0060606 Min: 2 Max: 10000048
  Column (#17): STATUS(VARCHAR2)
    AvgLen: 2.00 NDV: 3 Nulls: 0 Density: 0.33333
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL     
    Card: Original: 25494530  Rounded: 1  Computed: 0.13  Non Adjusted: 0.13
  Access Path: TableScan
    Cost:  2544.46  Resp: 2544.46  Degree: 0
      Cost_io: 2380.00  Cost_cpu: 414356148
      Resp_io: 2380.00  Resp_cpu: 414356148
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 14.00  resc_cpu: 127837
    ix_sel: 2.5335e-06  ix_sel_with_filters: 8.4450e-07
    Cost: 2.11  Resp: 2.11  Degree: 1
  Access Path: index (RangeScan)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 4.00  resc_cpu: 29395
    ix_sel: 1.4926e-08  ix_sel_with_filters: 1.4926e-08
    Cost: 1.00  Resp: 1.00  Degree: 1
  Best:: AccessPath: IndexRange  Index: N_T_ADVICE_DTL_IDX3
         Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.13  Bytes: 0
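The UAT figures can be cross-checked the same way. With the RangeScan path, ix_sel is back to the product of the per-column selectivities, and the resc_io of 4 falls out of the same A + B formula (variable names are mine; the reported final cost of 1.00 is consistent with OICA scaling of this figure):

```python
import math

# IDX3 statistics from the UAT 10053 trace, after CURRENCYID was added
BLEVEL, LEAF_BLOCKS, CLUF = 2, 86050, 12732735
NDV_CASEID, NDV_CHARGEID = 406027, 165

# RangeScan: per-column selectivities are multiplied again
ix_sel = 1 / NDV_CASEID / NDV_CHARGEID   # ~1.4926e-08, matches the trace
resc_io = (BLEVEL + math.ceil(LEAF_BLOCKS * ix_sel)
           + math.ceil(CLUF * ix_sel))   # 2 + 1 + 1 = 4

print(resc_io)  # 4
```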

Plan Table
| Id  | Operation                            | Name                    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
| 0   | SELECT STATEMENT                     |                         |       |       |     1 |           |       |       |
| 1   |  SORT AGGREGATE                      |                         |     1 |    15 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                         |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL          |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | N_T_ADVICE_DTL_IDX3     |     1 |       |     1 |  00:00:01 | KEY   | KEY   |
Predicate Information:
3 - filter("STATUS"=:SYS_B_1)
4 - access("CASEID"=:B1 AND "CHARGEID"=:SYS_B_0)

The change is still pending implementation on production and will be done after the month end; based on this analysis, the issue should be resolved by it. If you have come across any such issue, please do share your findings. For the other 2 queries, I have an interesting observation about the cost-based optimizer, but unfortunately I have not collected the artifacts and data to post on the blog. While the change to the BLINE_INDX index suggested by me has resolved those issues as well, I am doing further study for posting purposes and will share the findings as and when I am ready.

About Vivek Sharma
I am an Oracle professional from Mumbai (India). I blog on issues that I feel are interesting for my readers. These are all real-life examples, which I hope you will find interesting. Comments are always welcome.

14 Responses to Cost Based Optimization ! Query Performance Issue after creating a new Index…

  1. Shyamakanta says:

    Superb explanation as usual.

  2. Ted Urban says:

    I had a similar issue to the one you mentioned above. The dev team added a new index and some of the other queries started to perform poorly. The first thing I checked was how the index was defined. I noticed that the first column was the same. Upon further investigation, I found that Oracle looked at the first column, then the second, to determine which index to use. By changing the column position of the new index, all queries started to run correctly once more, and the new index was still used by the query that required it. During some testing, I played with index column position and had some interesting results when I ran a query multiple times, changing the positions while another index resided with the same first column. As I read your article (and it was good), I noticed at the top that the first column was the same in both indexes. The first thought I had was to change the column position. I noticed you added another column to resolve the issue, which I had not thought about. This gives me more information for facing those problematic performance issues as apps are developed.
    Again, thanks for the article, very detailed.

  3. Santosh says:

    Hi Vivek,

    I am a new follower of your blog. Can you please explain the importance of COLUMN_POSITION (as you have investigated above)? How does it impact the CBO?


    • Vivek says:


      COLUMN_POSITION is a column from dba_ind_columns and gives the column ordering of an index, which is meaningful for composite indexes. For example, if you create an index on (B,A), then the column_position for column B will be 1 and for A it will be 2. For a single-column index, the column_position is just 1. It helps you get the details of all the columns of an index along with their positions.


  4. smahadevan says:

    Hi Vivek,
    That's a very nice article. It helped me a lot in understanding the 10053 trace. I am in a similar situation, trying to tune a badly performing query and to understand why Oracle chose this path. First I am trying to understand the values calculated by Oracle for the good query; later I will work on the bad query.

    Pasting part of good 10053 trace as required:
    Index Stats::
    Index: IDX_GL_RECONCILED_BALANCE_1 Col#: 2 1
    LVLS: 2 #LB: 848 #DK: 30 LB/K: 28.00 DB/K: 291.00 CLUF: 8750.00

    Best:: AccessPath: IndexRange Index: IDX_GL_RECONCILED_BALANCE_1
    Cost: 325.22 Degree: 1 Resp: 325.22 Card: 80.63 Bytes: 0



    I am trying to deduce the value 325.22
    According to your calculation:

    select 2+ceil(848 * 1/18), ceil(8750*1/18*1/2) from dual;

    2+CEIL(848*1/18) CEIL(8750*1/18*1/2)
    —————- ——————-
    50 244

    select 2+ceil(848*1/2), ceil(8750*1/2*1/2) from dual;

    2+CEIL(848*1/2) CEIL(8750*1/2*1/2)
    ————— ——————
    426 2188

    I do not know what the calculation for the second half of your query is, but the above are my values and they are nowhere near 325.22. Any ideas on this? Thanks in advance.

  5. Pingback: Blogroll Report 26/03 /2010 – 02/04/2010 « Coskan’s Approach to Oracle

  6. Abhishek says:

    Hi Vivek …

    So we are going to meet again in HYD for Sangam 10; last time Tom Kyte was excellent and now this time Jonathan …
    good going

    One more thing

    I was trying to simulate the same problem (Cost Based Optimization ! Query Performance Issue after creating a new Index…) in my environment, but I am not getting the right bind variable. If you can send or upload the script, that would be great.


    • Vivek says:

      Hi Abhishek,

      To get the value of bind variable, you will have to query v$sql_bind_capture. The query would be :

      column value_string for a20
      column datatype_string for a20
      column name for a10
      set lines 132
      select name, position, datatype_string, value_string
      from v$sql_bind_capture
      where sql_id='&sql_id';

      To check the runtime plan with which the query was hard parsed, you can use dbms_xplan.display_cursor with the 'allstats last' option.

      set lines 200
      set pages 1000
      select * from table(dbms_xplan.display_cursor('&sql_id',&child,'allstats last'));


  7. Ranjit Nagi says:

    Hi Vivek,

    Does it mean the optimizer always uses DISTINCT_KEYS from dba_indexes to compute the cost/selectivity if all the columns of the index are referenced in the WHERE clause? And otherwise, if not all columns are referenced, it uses the individual column selectivities?


    • Vivek says:

      Hi Ranjit,

      Yes, in most cases the optimizer uses DISTINCT_KEYS from dba_indexes for the index if all the columns of that index are used as equality predicates in the WHERE clause.


  8. Mohammed says:

    Hi Vivek,

    “The Best Explanation about Cost Calculation” Thanks for the details.

    I am Just reviewing second part of calculation.

    Calculation of Selectivity part..

    1/6.1026 (the value 6.1026e-07 is from the 10053 trace) = 0.163864582

    0.163864582 * 10^7 = 1638645.82 (this value is the same as the DISTINCT_KEYS of IDX3), by reverse engineering.

    Now the formula becomes:

    select 3+ceil(293115*1/1638635*1/105), ceil(48089155 * 1/1638635) from dual;

    The output is

    3+ceil(293115*1/1638635*1/105) = 4

    ceil(48089155 * 1/1638635) = 30

    COST = 4 + 30 = 34 (the same value as in the 10053 trace)


    In the formula I am taking the 1/NDV value only one time, like ceil(48089155 * 1/1638635), but you have taken it two times (1/NDV * 1/NDV).

    Is that OK, or do I need to take the 1/NDV value one more time because of INDEX SELECTIVITY WITH FILTERS? (I am not able to figure out whether the IDX3 execution plan accesses both columns or only one.) If I take it two times then the output is different: ceil(48089155 * 1/1638635 * 1/105) = 1.

    Please clarify.

    Thanks again.


    • Vivek says:

      Hi Mohammad,

      I missed replying to your comment. Frankly, I could not fully understand your question; however, if you are asking whether to use 1/NDV once or twice, it depends. In the execution plan for IDX3, the access predicates show 2 columns, but immediately the next line (FILTER) shows the 2nd column being used as a filter. Therefore, the 1/NDV calculation is to be considered twice.

      Do let me know whether your concern is answered. If not, please let me know.


  9. Mohammed says:

    Thanks for reply

    Just wanted to check how you arrived at the cost of IDX3, i.e. the value 34.
