Predicting Performance Impact after adding a New Column to an existing Index

This blog featured on my previous blog URL (still unaccessible) and was published on 7th September 2008. Thought of reposting this, as a similar performance issue was recently reported by one of my customer, where a query had to be fixed by way of an Index Hint. This blog would also help you understanding the way optimizer_index_cost_adj is to be calculated for optimizing a query performance. This calculation was requested by one of a regular reader of my blogs.

Modifying an existing index to optimize multiple queries is a common practice, as this eliminates the overheads of maintaning muliple indexes, by way of INSERTS / UPDATES / DELETES. This modification of an existing index includes, re-arranging column ordering, adding a new column etc. This blog focuses on an issue that may trigger and impact the performance of your production system due to addition of a new column in an existing index.

The scenario that I am going to discuss in this blog is, “Assuming one day you are at a customer site investigating a performance issue or doing a proactive health check of a production system. During investigation, you come across two queries that are almost similar and use same Index. The only difference between the two queries is an additional Filter Predicate.” The two queries may look like :

select * from emp where sal=:b2;
select * from emp where sal=:b2 and deptno=:b1;

Both the queries use an Index on SAL column, which is a Single Column Index. We all know that the logical I/O’s of a query can be reduced if a query can make best use of an Index and scans the table, only when required. Which means, a composite index on (SAL, DEPTNO) would be efficient and should reduce the Logical I/O’s for the second query, without impacting the first query. Therefore, instead of creating a new composite index, the suggestion is put forward to modify an existing index on SAL and add DEPTNO as a second column. This will ensure that both the queries use the same index. After the alteration of this Index on production, won’t you be stunned if the customer comes back and says that the performance is still not OK, and infact, is worst than it was a day before ? This is because, while the alteration improved the performance of the second query, it impacted the performance of the first with the change in optimizer plan as it calculates the cost of an Altered Index to be costlier than Full Table Scan. Let us walk through a test case to understand the issue and the rationale behind optimizer estimation.

TEST CASE

drop table vivek_test;

exec dbms_random.seed(0);

create table vivek_test as
with test as
(select * from all_objects where rownum between 1 and 10000)
select	trunc((rownum-1)/1000) object_id,
	round(dbms_random.value(1,100),0) bad_clustered,
	a.object_name,
	a.temporary,
	a.created
from	test a,
	test b
where  rownum between 1 and 100000; 

create index vivek_test_idx on vivek_test(object_id);

exec dbms_stats.gather_table_stats(user,'VIVEK_TEST',cascade=>true);

Below are the statistics for the table and its index (in my test database) :

select owner, table_name, num_rows, blocks from dba_tables where table_name='VIVEK_TEST';

OWNER                          TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ------------------------------ ---------- ----------
VIVEK                          VIVEK_TEST                          99889        420

select index_name, num_rows, blevel, leaf_blocks, distinct_keys, clustering_factor
from	dba_indexes
where	table_name='VIVEK_TEST';

INDEX_NAME                       NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ---------- ---------- ----------- ------------- -----------------
VIVEK_TEST_IDX                     100000          1         195           100               420

Now, I will run two queries (assume these to be application queries) against the table and both of these queries use the Index created on Object_ID. The queries and their Runtime Execution Plans are pasted below :

variable b1 number;
variable b2 number;
exec :b1:=0;
exec :b2:=57;
select /*+ QUERY1 */ * from vivek_test where object_id=:b1;

SQL_ID  99ss8hxuax3q2, child number 0
-------------------------------------
select /*+ QUERY1 */ * from vivek_test where object_id=:b1

Plan hash value: 3696163677

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| VIVEK_TEST     |   981 | 24525 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | VIVEK_TEST_IDX |   990 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=:B1)

select /*+ QUERY2 */ * from vivek_test where object_id=:b1 and bad_clustered=:b2;

Plan hash value: 3696163677

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |     7 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| VIVEK_TEST     |    10 |   250 |     7   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | VIVEK_TEST_IDX |   990 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("BAD_CLUSTERED"=:B2)
   2 - access("OBJECT_ID"=:B1)

Based on the queries executed against this table, it is decided to drop the exisiting index on OBJECT_ID and recreate it on OBJECT_ID, BAD_CLUSTERED. This is to optimize the second query (eliminate unwanted logical I/O’s) and with an assumption that this change will not impact the first query, with QUERY1 hint.

drop index vivek_test_idx;
create index vivek_test_idx on vivek_test(object_id, bad_clustered);
exec dbms_stats.gather_table_stats(user,'VIVEK_TEST',cascade=>true);
SQL> select owner, table_name, num_rows, blocks from dba_tables where table_name='VIVEK_TEST';

OWNER                          TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ------------------------------ ---------- ----------
VIVEK                          VIVEK_TEST                         102477        420

Elapsed: 00:00:00.11
SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys, clustering_factor
  2  from dba_indexes 
  3  where table_name='VIVEK_TEST';

INDEX_NAME                       NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ---------- ---------- ----------- ------------- -----------------
VIVEK_TEST_IDX                     100000          1         237          9998             41509

On a production system, these changes are normally done during the downtime and the system is released for the users. The impact is usually visible when the users start executing these queries concurrently. Let us re-execute the queries with the change in the index implemented. We will execute the second query first (with QUERY2 hint) and then the first query and compare the runtime execution plan.

variable b1 number;
variable b2 number;
exec :b1:=0;
exec :b2:=57;

select /*+ QUERY2 */ * from vivek_test where object_id=:b1 and bad_clustered=:b2;

Plan hash value: 3696163677

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |     6 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| VIVEK_TEST     |    10 |   250 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | VIVEK_TEST_IDX |    10 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=:B1 AND "BAD_CLUSTERED"=:B2)

select /*+ QUERY1 */ * from vivek_test where object_id=:b1;

Plan hash value: 3275957777

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |    97 (100)|          |
|*  1 |  TABLE ACCESS FULL| VIVEK_TEST |  1015 | 25375 |    97   (4)| 00:00:02 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID"=:B1)

While the cost of the second query, and the logical I/O’s as well, has gone down, the change has impacted the first query, that was working absolutely fine before the change. This can lead to a greater impact when these are run concurrently. To check for the impact, I simulated concurrency by way of spawning 20 sessions, each running the impacted query. With these 20 concurrent sessions, my 2 CPU laptop choked up with 100% utilization and 95% waits on CBC Latches. The code to simulate the concurrency is pasted below with an explanation :

  1. Create a Staging table and populate the object id’s in random order to be queried.
  2. The staging table holds 20000 object id’s with 10 parallel streams (PSTREAMS).
  3. The procedure will be scheduled from dbms_job, with pstreams as an Input parameter (10 Streams).
drop table staging;

create table staging as
select object_id, mod(rownum,10) pstreams  
from vivek_test where rownum between 1 and 20000
order by dbms_random.random;

create or replace procedure process_job(ps in number) as
l_cnt	number;
begin
for i in (select object_id from staging where pstreams=ps)
loop
select count(distinct temporary) into l_cnt from vivek_test
where	object_id=i.object_id;
end loop;
end;
/


variable jobno number;
variable hrs number;
variable mns number;
exec :hrs:=14;
exec :mns:=35;

begin
for i in 1..10
loop
BEGIN
     DBMS_JOB.SUBMIT(:jobno,'process_job('||i||');', 
         trunc(sysdate)+:hrs/24+:mns/(24*60));
     COMMIT;
END;
end loop;
end;
/

What could have triggered the change in plan ?

Before getting into the cause of this change, let us execute the query with an Index Hint. This will help us investigate the issue.

select /*+ INDEX(VIVEK_TEST,VIVEK_TEST_IDX) */ * from vivek_test where object_id=:b1;

Plan hash value: 3696163677

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |   420 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| VIVEK_TEST     |  1015 | 25375 |   420   (0)| 00:00:06 |
|*  2 |   INDEX RANGE SCAN          | VIVEK_TEST_IDX |  1015 |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=:B1)

With an Index Hint, the cost of step TABLE ACCESS BY INDEX ROWID step is 420 as against 97 for the unhinted query (TABLE SCAN FULL). Optimizer, while generating an optimal execution plan, compares cost of each available access path and selects the one with least cost. In this case, the cost of a TABLE SCAN FULL was assumed to be cheaper than the cost of an INDEX SCAN. The Index can be considered costlier due to two reasons : by way of Cost and / or, due to huge number of Logical I/O’s. TABLE ACCESS BY INDEX ROWID is always followed by INDEX RANGE SCAN, and as can be seen from the plan, the cost of an INDEX SCAN is mere 4, while the cost of TABLE ACCESS step is 420. While the former is calculated from BLEVEL and LEAF_BLOCKS, the later is calculated based on CLUSTERING_FACTOR (CF).

Compare the CF of the Index, before and after change. Before the change, CF was 420 (near to the num_blocks of the table), after the change it changed drastically to 41509. This was due to the bad clustering of the additional column bad_clustered and this change was enough to blow up the cost. The cost calculation before and after the change is as under :

BEFORE THE CHANGE
Index Cost = BLEVEL + CEIL(LEAF_BLOCKS * SELECTIVITY)
Table Scan = CEIL(CF * SELECTIVITY)
Total Cost of TABLE ACCESS BY INDEX ROWID = Index Cost + Table Scan

While BLEVEL is considered for Index Cost Calculation, optimizer ignores this if the BLEVEL of an Index is 1 and all the columns of an Index are used as an EQUALITY Predicate in the query. Therefore, in our case too, since the columns are used as an Equality predicate and the blevel is 1, this value will be ignored.

SQL> select ceil(195*1/100), ceil(420*1/100) from dual;

CEIL(195*1/100) CEIL(420*1/100)
--------------- ---------------
              2               5

The cost of the first query, before the change is 7. Now, after the change :

SQL> select 1+ceil(237*1/100), ceil(41509*1/100) from dual;

1+CEIL(237*1/100) CEIL(41509*1/100)
----------------- -----------------
                4               416

With Index Hint, the cost is 420 and this is costlier than 97. There are two solutions for this issue. These are :

  1. Chnage the value for OPTIMIZER_INDEX_COST_ADJ (OICA) and set this globally.
  2. Fix the Query with an Index Hint.

The first change (i.e.OICA), would require application testing as setting this globally can impact other queries. The second change was a feasible solution and could be applied without impacting other queries. Let us visit the formula to calculate the value of OICA.

OPTIMIZER_INDEX_COST_ADJ

This is calculated as COST OF A FTS divided by TABLE ACCESS BY INDEX ROWID. In our case, the cost of FTS is 97 and Table Access by Index Rowid is 420. Therefore 97/420*100 is 23.09. This means, if the value of OICA is set to less than 23, the query should use an Index, without modification.

SQL> alter session set optimizer_index_cost_adj=22;

Session altered.
select /*+ OICA_22 */ * from vivek_test where object_id=:b1

Plan hash value: 3696163677

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |       |       |    92 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| VIVEK_TEST     |  1015 | 25375 |    92   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | VIVEK_TEST_IDX |  1015 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=:B1)

As mentioned earlier, a better option would be to hint the query. Once the query is modified with an Index Hint, I spawned the Concurrency jobs and could see that gain, with less than 5% waits on CBC latches, as against, more than 95% and 100% CPU Utilization.

How do we predict the impact of this change before implementing it ?

Before making these changes, the impact can be predicated. This prediction should help estimate the cost change, before and after. Following query can be modified and run (for a bigger table, should be run in parallel during less load period). The only change in the query will be the ordering of the columns in an Index. For this, I will recreate the Index on a Single Column and then execute the query to check for the Change in the CF after adding another column.

SQL> drop index vivek_test_idx;

Index dropped.

Elapsed: 00:00:00.07
SQL> create index vivek_test_idx on vivek_test(object_id);

SQL> select  sum(block_change) from (
  2  select  block_fno, object_id, bad_clustered, prev_bfno,        
  3   (case when nvl(prev_bfno,0)!=block_fno then 1 else 0 end) block_change from (
  4  select  block_fno, object_id, bad_clustered, lag(block_fno) over (order by object_id) prev_bfno from (
  5  select    dbms_rowid.rowid_block_number(rowid)||'.'||
  6    dbms_rowid.ROWID_TO_ABSOLUTE_FNO(rowid,'VIVEK','VIVEK_TEST') block_fno, 
  7    object_id, bad_clustered
  8  from  VIVEK_TEST
  9  order by object_id, bad_clustered, block_fno)));

SUM(BLOCK_CHANGE)
-----------------
            41509

The output of this query should approximately match the CF calculated by the Optimizer and should be enough to proactively arrive at the impact due to the change before making the change.

ppi.sql

Advertisements

About Vivek Sharma
I am an Oracle Professional from Mumbai (India). I blog on the issues that I feel is Interesting for my readers. Some of these are my real life examples, which I hope, you would find interesting. Comments are always a welcome. The Technical Observations & Views here are my own and not necessarily those of Oracle or its affiliates. These are purely based on my understandings, learnings and resolutions of various customer issues.

4 Responses to Predicting Performance Impact after adding a New Column to an existing Index

  1. Anand says:

    Hi Vivek,

    It was a nice learning for me.Thanks for it and hope to read more such good articles ahead.

    You said “a better option would be to hint the query. Once the query is modified with an Index Hint, I spawned the Concurrency jobs and could see that gain, with less than 5% waits on CBC latches, as against, more than 95% and 100% CPU Utilization.”

    How did the Index hint help when the cost was going high??Did you set the value for OPTIMIZER_INDEX_COST_ADJ??

    Like

    • Vivek says:

      Anand,

      The cost of the query was high due to the high value of Clustering Factor. Optimizer Assumed that for each access to the table, it will have to visit a new block, which means additional I/O. We knew that this (single column) query was efficiently using an index previously, therefore, hinting would be a nice idea as compared to Optimizer_index_cost_adj. Without hint, the optimizer does not use the index as the cost of Index Scan is high and it opts for a cheaper plan, which is FTS. With hint, it is passed as a directive and optimizer obeys this, without considering the cost. There was no need to set optimizer_index_cost_adj with hint in place.

      Hope this clarifies your point.

      Regards
      Vivek

      Like

      • Anand says:

        Hi Vivek,

        Thanks for the prompt reply.Yes definitely using hint is better compared to Optimizer_index_cost_adj.I think i was not clear in my previous question.My point is when you made a composite index the query went for a full table scan(VIVEK_TEST) with cost of 97.But when index hint was used the TABLE ACCESS BY INDEX ROWID cost was 420.As the cost is increased, isn’t it bad?

        Regards,
        Anand

        Like

      • Vivek says:

        Hi Anand,

        Sorry for the delay in replying back. I was out of action for last few days.

        About the cost, YES, with the composite index, the cost of an Index Scan is high and this is the reason optimizer opted for a cheaper plan, which is a Full Table Scan. This was optimizer’s perpective that was based on Clustering_factor (CF). Unfortunately, the CF of an Index is calculated for all the columns in an Index. While the CF of the Index seemed to be BAD, this was due to the additional column (BAD_CLUSTERED) appended to the index. Without this additional column, the CF for a single column index on OBJECT_ID was good and optimizer is not aware of this fact. Therefore, even if a query on this single column is executed (check the first test case), it takes into account the CF for an entire index, and for the optimizer it was bad, but as for the performance, index scan was good. In such cases, forcing an Index hint or setting up optimizer_index_cost_adj (OICA) is the only viable solution.

        Infact, before getting into these issues, I was against HINT or OICA. But when there are issues with optimizer calculation, HINT’s or OICA are definetely required and this has changed my perspective on HINT’s.

        I strongly feel that the way optimizer gathers column level statistics, like num_distinct, num_nulls etc., an additional column of Clustering_factor should also be added to dba_tab_columns, that calculates the CF for all the individual columns. This intelligence can then be used for the queries that are executed on single column and use a Composite Index.

        I hope your query is answered. Do let me know if you need any more clarification.

        Regards
        Vivek

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s