Cost Based Optimization! Query Performance Issue after Creating a New Index…

Creating an index to optimize the performance of a query is a strategy commonly adopted by database administrators and application vendors. While the new index may optimize the query for which it was created, it can impact other queries, which may start picking up the newly created index even though it is not optimal for them. This is especially true where the indexing strategy is not up to the mark. At my first AIOUG session in Hyderabad, I presented on proactive and reactive tuning. If optimization is ignored during the application development life cycle, tuning efforts end up in reactive mode, and this is when database administrators or developers create unwanted or redundant indexes. With redundant indexes in place, there is always a threat of a performance impact after creating a new index, and this is exactly what transpired at one of my customer sites.

Recently, the application team at one of my customer sites introduced a new batch process that runs 2-3 times a day. After implementation, they found the performance of this process was not up to the mark and therefore created a new index to optimize one of the resource-intensive queries of the process. After this new index was created, the performance of the batch process improved, but a few other queries started using the newly created index, with sessions waiting on “read by other session”. The customer was now in a dilemma: dropping the new index impacts the batch process, while keeping it impacts 2-3 other queries. When this was escalated to the application vendor, they washed their hands of it, stating it to be an Oracle issue. In my opinion, this is an application issue, and as you read further you will see why, but the behaviour can also be attributed to the Oracle optimizer. Why did the optimizer choose a newly created index when the performance of the query was optimal with the existing index? Let us walk through the issue and the analysis in detail.

After the new batch process was implemented, Query No. 1 below was doing a full table scan of an 80 GB table. To optimize this query and improve the completion time of the batch process, an index BLINE_INDX was created. There was already an index, N_T_ADVICE_DTL_IDX3, with CASEID as its leading column, but the optimizer chose not to use it, perhaps because this column is used as a range predicate. Instead of investigating the reason the existing index was not used, the application vendor responded with the new index BLINE_INDX, which was then created. Once this new index was in place, Query No. 2, which had been optimally using N_T_ADVICE_DTL_IDX3, ceased to use that index and started using the new index BLINE_INDX. While there are several more indexes on this table, I have considered only the two that are relevant for this posting.

Query No.1 (BLINE PROCESS)
--------------------------

SELECT ROWID,CASEID,TO_CHAR(ADVICEDATE,'YYYYMMDD'),TXNADVICEID,CHARGEID,TXNID,
TXNTYPE,ADVICETYPE,ADVICEAMT,ORIGNALAMT,STATUS,DESCRIPTION,REMARKS,BPTYPE 
FROM 	flea.N_T_ADVICE_DTL 
WHERE 	caseid BETWEEN :1 AND :2 
AND 	ADVICEDATE BETWEEN :3 AND :4 
AND 	status='A' 
ORDER BY CASEID, ADVICEDATE,  TXNADVICEID;

Query No.2
----------

SELECT SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Index Details
-------------

INDEX_NAME                     COLUMN_NAME                    COLUMN_POSITION
------------------------------ ------------------------------ ---------------
BLINE_INDX                     CASEID                                       1 <== Local
                               ADVICEDATE                                   2
                               STATUS                                       3

N_T_ADVICE_DTL_IDX3            CASEID                                       1 <== Local
                               CHARGEID                                     2

My services were sought when the customer wanted an analysis of why Query 2 started picking up the new index, while both columns of the IDX3 index are present in Query 2, which should therefore make IDX3 the obvious choice for the optimizer. To start the analysis, I queried v$sql_bind_capture to get the actual production runtime value passed to bind :1 by the application user, and then passed that value to a bind variable b1. The runtime plan of the query shows the BLINE_INDX index being used, which matches the production issue.

SQL>@bind_capture
new   2: where sql_id='b7nkg6m692jgq'

NAME                             POSITION DATATYPE_STRING      VALUE_STRING
------------------------------ ---------- -------------------- ------------------------------
:1                                      1 NUMBER               2568494

SQL>variable b1 number;
SQL>exec :b1:=2568494;

SELECT /*+ VIVEK_2568494 */ SUM(txnadjustedamt) 
FROM	flea.n_t_advice_dtl 
WHERE	caseid = :b1 
AND	chargeid =93 
AND	status ='A';

Plan Table
============
------------------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name               | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
------------------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |                    |       |       |     4 |           |       |       |
| 1   |  SORT AGGREGATE                      |                    |     1 |    16 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                    |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL     |     1 |    16 |     4 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | BLINE_INDX         |    47 |       |     1 |  00:00:01 | KEY   | KEY   |
------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("CHARGEID"=:SYS_B_0)
4 - access("CASEID"=:B1 AND "STATUS"=:SYS_B_1)
4 - filter("STATUS"=:SYS_B_1)

This being a partitioned table, the value 2568494 falls under the high_value of Partition 3 of the table. Therefore, the statistics for this partition were used for the calculation and analysis. Shown below are the optimizer statistics for Partition 3 and the calculation the optimizer does to derive the cost of an index scan.


Table Stats for Partition 3
---------------------------
TABLE_OWNER  PARTITION_NAME           HIGH_VALUE      NUM_ROWS     BLOCKS PARTITION_POSITION
------------ ------------------------ ------------- ---------- ---------- ------------------
FLEA         N_T_ADVICE_DTL_FO_P3     3000000         90588615    2019809                  3

Partition Column Stats
-----------------------
Stats for Partition 3, selected based on the captured bind value 2568494

COLUMN_NAME     NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
--------------- ------------ ---------- ---------- ---------------
CASEID                954287          0 1.0479E-06 NONE
CHARGEID                 105          0  .00952381 NONE
STATUS                     2          0         .5 NONE

Partition Index Stats
---------------------
INDEX_NAME                PARTITION_NAME                     BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR   AVDPK      ALBKP
------------------------- ------------------------------ ---------- ----------- ------------- ----------------- ------- ----------
BLINE_INDX                N_T_ADVICE_DTL_FO_P3                    3      335600       8818906          46069060       5          1

N_T_ADVICE_DTL_IDX3       N_T_ADVICE_DTL_FO_P3                    3      293115       1638635          48089155      29          1

Once the data was collected, it was time to compute the cost of both indexes. The formula for computing the cost of an index scan, followed by the table-access-by-rowid step, is:

Cost of Index Scan (A) = BLEVEL + CEIL(LEAF_BLOCKS * INDEX SELECTIVITY)
Cost of Table Access via Index (B) = CEIL(CLUSTERING_FACTOR * INDEX SELECTIVITY WITH FILTERS)
Total Cost = A + B

Analysis & Cost Calculation
---------------------------

In the case of the BLINE_INDX index, only the CASEID column is to be considered for the index selectivity. Revisit the execution plan above and check the predicate information. While both the CASEID and STATUS columns exist in this index, the access predicate shows that only CASEID is used for the index scan, and STATUS is applied as a filter predicate. Therefore, the NUM_DISTINCT of CASEID is used to derive the index selectivity, and the product of the NUM_DISTINCT of CASEID and STATUS is used to compute the index selectivity with filters.

Similarly, for N_T_ADVICE_DTL_IDX3, since both columns present in the index are used in the query, the selectivity (both index and with filters) has to be computed from the product of the two columns, i.e. CASEID and CHARGEID.

One point to note here is that the parameter OPTIMIZER_INDEX_COST_ADJ has been set to 15, and this also has to be taken into consideration when computing the final cost.

Calculation for BLINE_INDX Index
--------------------------------
select 3+ceil(335600*1/954287), ceil(46069060*1/954287*1/2) from dual;

Index Cost : 4
Table Access : 4+25 = 29

OICA = 29*15/100 = 4.35

Rounded off to 4
----------------

For IDX3 Index
--------------

select 3+ceil(293115*1/954287*1/105), ceil(48089155*1/954287*1/105) from dual;

Index Cost : 4
Table Access : 4+1 = 5

OICA : 5*15/100 = 0.75

Rounded off to 1
----------------
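The two manual calculations can be collected into a small helper (a sketch only: the function name is mine, it implements the I/O formula quoted above, and it ignores the small CPU component the optimizer adds on top):

```python
from math import ceil

def index_access_cost(blevel, leaf_blocks, clustering_factor,
                      ix_sel, ix_sel_with_filters, oica=100):
    # A: cost of walking the index; B: cost of the table access by rowid
    a = blevel + ceil(leaf_blocks * ix_sel)
    b = ceil(clustering_factor * ix_sel_with_filters)
    # Scale by OPTIMIZER_INDEX_COST_ADJ (a percentage)
    return round((a + b) * oica / 100)

# BLINE_INDX: access on CASEID only, STATUS applied as a filter
print(index_access_cost(3, 335600, 46069060,
                        1/954287, (1/954287) * (1/2), oica=15))    # 4

# N_T_ADVICE_DTL_IDX3: CASEID and CHARGEID, per individual selectivities
print(index_access_cost(3, 293115, 48089155,
                        1/(954287*105), 1/(954287*105), oica=15))  # 1
```

Feeding in the Partition 3 statistics reproduces the hand-derived costs of 4 and 1.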

Based on this calculation, IDX3 seems cheaper and should have been the obvious choice. Why is the optimizer computing BLINE_INDX to be cheaper than IDX3? There is a better place to compare our analysis with the optimizer's computation: the 10053 trace file. For this, the query needs to be hard parsed, so with a minor change in the query text, it was re-executed for the same bind value and a trace file was generated.

Once the 10053 trace was generated, it was easier to get into the rationale behind the optimizer's choice of the BLINE_INDX index for this query. Pasted below is the relevant content of the 10053 trace file; the values of interest are discussed in the analysis and comparison that follow.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL  Partition [2]
    #Rows: 90588615  #Blks:  2019809  AvgRowLen:  164.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 3  #LB: 335600  #DK: 8818906  LB/K: 1.00  DB/K: 5.00  CLUF: 46069060.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3  PARTITION [2]
    LVLS: 3  #LB: 293115  #DK: 1638635  LB/K: 1.00  DB/K: 29.00  CLUF: 48089155.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#28): CASEID(NUMBER)  Part#: 2
    AvgLen: 6.00 NDV: 954287 Nulls: 0 Density: 1.0479e-06 Min: 2000000 Max: 2999998
  Column (#3): CHARGEID(NUMBER)  Part#: 2
    AvgLen: 4.00 NDV: 105 Nulls: 0 Density: 0.0095238 Min: 2 Max: 1000018
  Column (#17): STATUS(VARCHAR2)  Part#: 2
    AvgLen: 2.00 NDV: 2 Nulls: 0 Density: 0.5
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL
    Card: Original: 90588615  Rounded: 1  Computed: 0.45  Non Adjusted: 0.45
  Access Path: TableScan
    Cost:  441312.32  Resp: 441312.32  Degree: 0
      Cost_io: 425225.00  Cost_cpu: 81419528497
      Resp_io: 425225.00  Resp_cpu: 81419528497
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 29.00  resc_cpu: 195479
    ix_sel: 1.0703e-06  ix_sel_with_filters: 5.3516e-07
    Cost: 4.36  Resp: 4.36  Degree: 1
  Access Path: index (AllEqRange)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 34.00  resc_cpu: 216124
    ix_sel: 6.1026e-07  ix_sel_with_filters: 6.1026e-07
    Cost: 5.11  Resp: 5.11  Degree: 1
  Best:: AccessPath: IndexRange  Index: BLINE_INDX
         Cost: 4.36  Degree: 1  Resp: 4.36  Card: 0.45  Bytes: 0

The cost of BLINE_INDX (resc_io) matches our calculation: resc_io for BLINE_INDX is 29 and, with OICA set to 15, the final cost is 4.36, rounded off to 4. Surprisingly, the cost of IDX3 (resc_io), as per the optimizer, is 34, against our calculation of 5. The optimizer always chooses the least-costed plan, and since 29 is cheaper than 34 (and 4.36 cheaper than 5.11), it opted for the BLINE_INDX index.

The next step was to work out the calculation by which the optimizer derives a cost of 34 for the IDX3 index. The BLEVEL and CLUSTERING_FACTOR of IDX3 are unchanged, so I suspected the selectivity part. The ix_sel (index selectivity) and ix_sel_with_filters (index selectivity with filters) for IDX3 are both 6.1026e-07, whereas the selectivity as derived by me was 1/NDV * 1/NDV (for the two columns). I decided to do some reverse engineering and calculated 1/6.1026e-07, which is 1638645.82. Looking for a value close to this, I observed that the DISTINCT_KEYS of IDX3 is 1638635, which is very close. If you now compute the index selectivity and the index-with-filters selectivity as 1/1638635, it exactly matches the ix_sel and ix_sel_with_filters values in the 10053 trace.

Further, note the line "Access Path: index (AllEqRange)" above "Index: N_T_ADVICE_DTL_IDX3" in the trace file. AllEqRange means that all the columns of the index have been used as equality predicates; in this case, instead of using the individual selectivity of each column, the optimizer considers the DISTINCT_KEYS of the index, and this is what skewed the computation.
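The reverse engineering can be verified numerically. A sketch (the variable names are mine): using 1/DISTINCT_KEYS as the selectivity for both the index scan and the table access reproduces the resc_io of 34 seen in the trace; the traced Cost of 5.11 additionally carries a small CPU component ignored here.

```python
from math import ceil

blevel, leaf_blocks, cluf = 3, 293115, 48089155   # IDX3 partition stats
distinct_keys = 1638635

ix_sel = 1 / distinct_keys    # ~6.1026e-07, matching ix_sel in the trace
resc_io = blevel + ceil(leaf_blocks * ix_sel) + ceil(cluf * ix_sel)
print(resc_io)                # 34
print(resc_io * 15 / 100)     # 5.1, close to the traced Cost of 5.11
```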

By this time, it was clear and evident that the flaw in the calculation was due to AllEqRange, and it was time to implement a change to resolve the issue. There were two possible solutions:

  1. Modify the clustering factor for the problematic partitions.
  2. Add a new column to the IDX3 index that is not used in the problematic query (Query No. 2).

I computed the cost for each of the partitions and observed that the clustering factor of almost 70% of the index partitions would have to be modified, which means re-calculating it every time statistics are gathered. Locking the statistics after modifying the clustering factor would have helped, but the risk of a changed optimizer behaviour impacting other queries was top of mind, so this option was not implemented. I did test it for one partition, where, after modifying the clustering factor, the optimizer started picking up the IDX3 index.

The other option was to modify the index by adding a new column. To eliminate any potential risk of adding a third column, the option thought of and implemented was adding a column with a single distinct value. Therefore, the column CURRENCYID, which has a single distinct value, was appended to this index, and the change was tested on UAT. Before making the change on UAT, we first made sure the issue was reproduced there. Once we were sure that the same issue existed in the UAT environment as well, we dropped and recreated the index on (CASEID, CHARGEID, CURRENCYID), and the change had a dramatic impact: AllEqRange changed to RangeScan, and the optimizer computed the cost based on the individual selectivities of the two columns rather than the DISTINCT_KEYS from dba_indexes. Pasted below is the relevant portion of the 10053 trace generated on UAT after the index change.

Table Stats::
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL Partition [2]
    #Rows: 25494530  #Blks:  10874  AvgRowLen:  165.00
Index Stats::
  Index: BLINE_INDX  Col#: 28 27 17  PARTITION [2]
    LVLS: 2  #LB: 92720  #DK: 8843108  LB/K: 1.00  DB/K: 1.00  CLUF: 11966280.00
  Index: N_T_ADVICE_DTL_IDX3  Col#: 28 3 18 PARTITION [2]
    LVLS: 2  #LB: 86050  #DK: 2764430  LB/K: 1.00  DB/K: 4.00  CLUF: 12732735.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#28): CASEID(NUMBER)
    AvgLen: 6.00 NDV: 406027 Nulls: 240 Density: 2.4629e-06 Min: 0 Max: 51734139
  Column (#3): CHARGEID(NUMBER)
    AvgLen: 4.00 NDV: 165 Nulls: 0 Density: 0.0060606 Min: 2 Max: 10000048
  Column (#17): STATUS(VARCHAR2)
    AvgLen: 2.00 NDV: 3 Nulls: 0 Density: 0.33333
  Table: N_T_ADVICE_DTL  Alias: N_T_ADVICE_DTL     
    Card: Original: 25494530  Rounded: 1  Computed: 0.13  Non Adjusted: 0.13
  Access Path: TableScan
    Cost:  2544.46  Resp: 2544.46  Degree: 0
      Cost_io: 2380.00  Cost_cpu: 414356148
      Resp_io: 2380.00  Resp_cpu: 414356148
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
  Access Path: index (RangeScan)
    Index: BLINE_INDX
    resc_io: 14.00  resc_cpu: 127837
    ix_sel: 2.5335e-06  ix_sel_with_filters: 8.4450e-07
    Cost: 2.11  Resp: 2.11  Degree: 1
  Access Path: index (RangeScan)
    Index: N_T_ADVICE_DTL_IDX3
    resc_io: 4.00  resc_cpu: 29395
    ix_sel: 1.4926e-08  ix_sel_with_filters: 1.4926e-08
    Cost: 1.00  Resp: 1.00  Degree: 1
  Best:: AccessPath: IndexRange  Index: N_T_ADVICE_DTL_IDX3
         Cost: 1.00  Degree: 1  Resp: 1.00  Card: 0.13  Bytes: 0
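The UAT trace can be cross-checked the same way (a sketch: with AllEqRange gone, ix_sel is the product of the individual column selectivities; I also assume here that the optimizer floors the final reported cost at 1):

```python
from math import ceil

ndv_caseid, ndv_chargeid = 406027, 165            # UAT column stats
blevel, leaf_blocks, cluf = 2, 86050, 12732735    # new IDX3 partition stats

ix_sel = (1 / ndv_caseid) * (1 / ndv_chargeid)    # ~1.4926e-08, as traced
resc_io = blevel + ceil(leaf_blocks * ix_sel) + ceil(cluf * ix_sel)
print(resc_io)                                    # 4, matching resc_io
print(max(1, round(resc_io * 15 / 100)))          # 1, the reported Cost
```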

 
============
Plan Table
============
-----------------------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name                    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-----------------------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |                         |       |       |     1 |           |       |       |
| 1   |  SORT AGGREGATE                      |                         |     1 |    15 |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE             |                         |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 3   |    TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL          |     1 |    15 |     1 |  00:00:01 | KEY   | KEY   |
| 4   |     INDEX RANGE SCAN                 | N_T_ADVICE_DTL_IDX3     |     1 |       |     1 |  00:00:01 | KEY   | KEY   |
-----------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("STATUS"=:SYS_B_1)
4 - access("CASEID"=:B1 AND "CHARGEID"=:SYS_B_0)

The change is still pending implementation on production and will be done after month end. Based on the analysis, the issue should be resolved by this change. If you have come across any such issues, please do share your findings. For the other impacted queries, I have an interesting observation about the cost based optimizer, but unfortunately have not yet collected the artifacts and data to post on the blog. While the change in the BLINE_INDX index suggested by me has resolved those issues as well, I am doing further study for posting purposes and will share the findings when ready.


Consistent Gets Myth…..

Based on requests from some of my regular readers, this is a repost of my previous blog on viveklsharma.blogspot.com (at present inaccessible).

Any query executed against a database gets the required rows, which are then fetched to the end user. Everyone is aware that table rows are stored in Oracle database blocks; hence, even to get a single row, Oracle has to read the block containing that row into the buffer cache. From this, it is evident that Oracle reads a block at a minimum. Also, as an optimization, for every query, blocks are served to a user from the buffer cache (parallel queries are an exception). Hence, if a block is not found in the buffer cache, it is read from disk into the cache and then given to the user.

In Oracle terminology, a read from disk is termed a physical read (or disk read), and a read from the cache is termed a consistent get (or logical I/O). As mentioned in the previous paragraph, if a block is not found in the cache, it is first read from disk into the cache and then fetched to the user. This also means that the logical read count includes the disk reads.

In this blog, I am going to write about a myth in the calculation that DBAs assume and derive from the value of consistent gets.

At one customer site, we were discussing application tuning, wherein a DBA gave a list of top queries to the vendor. The list contained the queries along with a statistic called "Total I/O in GB", whose values ranged from hundreds of GB to tens of TB. Surprised by these values, I asked the DBA, out of curiosity, how the column was calculated. The DBA immediately gave me access to the query that generated the output: the column was computed as (buffer_gets*db_block_size)/(1024*1024*1024). It seems the DBAs had assumed each logical I/O to be a read of a new block, hence these high values. By that calculation, if a query does 1549087 logical I/Os, the DBAs assumed it had read about 11 GB (assuming an 8k block size) worth of data into the cache. This is not true.
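For illustration, the misleading arithmetic looks like this (a sketch using the buffer-gets figure quoted above):

```python
# The DBAs' formula: every buffer get treated as a fresh 8k block read
buffer_gets, block_size = 1549087, 8192
gb = buffer_gets * block_size / 1024**3
print(round(gb, 1))   # ~11.8 "GB" -- misleading, since blocks are re-touched
```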

BLOCKS READ ALGORITHM

Conceptual knowledge of the Oracle block-read algorithm is as under:

  1. The user issues a query.
  2. The query is parsed, and an optimal execution path is generated and stored in the library cache.
  3. Based on the execution path, the required index or table block is searched for in the cache.
  4. If the block is not found in the cache, a disk read request is made, and the block is read into the cache.
  5. The block is then read from the cache into a private memory area (UGA) of the user.
  6. Once the block is in the user's private memory, the required row is fetched.

The value of consistent gets is incremented each time a block is read into the private memory area of the user, which also means that a single block, if read multiple times, inflates the value of consistent gets. While an inefficient SQL contributes largely to consistent gets, one factor that also has a significant impact is the ARRAY SIZE. I am not going to discuss inefficient queries here, but will write about the calculation that drives consistent gets.

A better way to understand the calculation is by way of an example. As always, I will run some simple queries and check the impact of a change in ARRAYSIZE on consistent gets.

I will create a table with 3000 rows and run a simple query at the SQL*Plus prompt. Without an index on this table, it will be a full table scan.

SQL> create table test_cg  as select * from all_objects where rownum between 1 and 3000;

Table created.

Elapsed: 00:00:01.82

SQL> exec dbms_stats.gather_table_stats(user,'TEST_CG');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.46

SQL> @table_stats TEST_CG
old   2: where table_name='&1'
new   2: where table_name='TEST_CG'

OWNER                          PAR   NUM_ROWS     BLOCKS LAST_ANAL GLO
------------------------------ --- ---------- ---------- --------- ---
VIVEK                          NO        3000         38 02-MAR-10 YES

1 row selected.

SQL> set autot trace
SQL> select * from test_cg;

3000 rows selected.

Elapsed: 00:00:00.17

Execution Plan
----------------------------------------------------------
Plan hash value: 2626677675

-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |  3000 |   249K|    12   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| TEST_CG |  3000 |   249K|    12   (0)| 00:00:01 |
-----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        238  consistent gets
          0  physical reads
          0  redo size
     300791  bytes sent via SQL*Net to client
       2608  bytes received via SQL*Net from client
        201  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3000  rows processed

A simple query against 38 blocks accounted for 238 consistent gets. Does it mean 238*8192, i.e. 1.8 MB, of data was read into the cache? The answer is a plain NO. As mentioned earlier, each block is read only once into the cache and then sent to the PGA of the user that requires it. Internally, the approximate calculation for consistent gets is (NUM_ROWS / ARRAYSIZE) + NUM_BLOCKS. In my case, when I executed this query, the arraysize was unchanged from the default, which is 15.

Rows in my Table	: 3000
Blks in my Table	: 38
Default Arraysize	: 15

(3000/15)+38 = 200 + 38 = 238

The calculation matches the consistent gets. One value worth discussing is the number 200, derived from 3000/15. This value means that each block is touched multiple times: there are 38 blocks in the table, each is read only once into the cache, but each is touched more than once, so the consistent gets for a single block are based on the number of times it is visited to read all its rows. Incidentally, another statistic that matches our calculation is "SQL*Net roundtrips to/from client" from the autotrace output above. Its value, with arraysize 15, is 201, which means 200 visits (touches) to fetch 3000 rows, plus a last visit to check and confirm whether more rows are to be fetched.

Back to our example; let us get into more detail on the calculation of consistent gets. The table has 3000 rows and 38 blocks. The count of rows in each of the 38 blocks is shown below.

select dbms_rowid.ROWID_BLOCK_NUMBER(rowid) blkno, count(*) cnt
from test_cg
group by dbms_rowid.ROWID_BLOCK_NUMBER(rowid) order by 1;
SQL> /

     BLKNO        CNT
---------- ----------
       641         88
       642         84
       643         81
       644         76
       645         81
       646         80
       647         82
       648         77
.... some lines deleted
.... some lines deleted
       677         78
       678         82
           ----------
sum              3000

The sequence of the consistent gets calculation is as under (with the default arraysize):

  1. Get Block 641 in the PGA, fetch 15 Rows – Consistent Gets = 1
  2. Get Block 641 again in the PGA, fetch another 15 rows – Consistent Gets = 2
  3. Get Block 641 again in the PGA, fetch another 15 rows – Consistent Gets = 3
  4. Get Block 641 again in the PGA, fetch another 15 rows – Consistent Gets = 4
  5. Get Block 641 again in the PGA, fetch another 15 rows – Consistent Gets = 5
  6. Get Block 641 again in the PGA, fetch another 13 rows – Consistent Gets = 6
  7. Get Block 642 in the PGA, fetch 2 rows – Consistent Gets = 7
  8. Get Block 642 again in the PGA, fetch another 15 rows – Consistent Gets = 8
  9. and so on….

It is clear from this explanation that a single block is read multiple times, in our case 6 times on average, and therefore the consistent gets for each of these blocks was around 6 to 7. Run this test case with different arraysizes, and the calculation should match the one shown in this blog. Let us run the query with different values of arraysize and validate the calculation.

ArraySize = 35 Consistent Gets will be (3000/35)+38 = 86+38 = 124
ArraySize = 40 Consistent Gets will be (3000/40)+38 = 75+38 = 113
ArraySize = 50 Consistent Gets will be (3000/50)+38 = 60+38 = 98
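The predictions above can be generated with a small helper (a sketch of the rule of thumb only, not Oracle's exact accounting; the function name is mine):

```python
from math import ceil

def approx_consistent_gets(num_rows, num_blocks, arraysize):
    # One get per array fetch, plus one get per block of the full scan
    return ceil(num_rows / arraysize) + num_blocks

for arraysize in (15, 35, 40, 50):
    print(arraysize, approx_consistent_gets(3000, 38, arraysize))
# 15 238, 35 124, 40 113, 50 98
```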

SQL> set arrays 35
SQL>select * from test_cg;

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        124  consistent gets
         87  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3000  rows processed

SQL> set arrays 40
SQL> select * from test_cg;

3000 rows selected.

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        113  consistent gets
         76  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3000  rows processed

SQL> set arrays 50
SQL> select * from test_cg;

3000 rows selected.

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         98  consistent gets
         61  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       3000  rows processed

Below is a query that can be executed to check the approximate value of consistent gets and should help you understand how this value is calculated. The bind variable is to be set to the ARRAYSIZE. The query and the details of its columns are as under:

variable b1 number;
exec :b1:=15;
SQL> compute sum of total_cnt on report
SQL> break on report

select	blkno, total_cnt, final_cnt, rows_remaining,
	case when rows_remaining=0 then touch_cnt+1 else touch_cnt end touch_cnt
from (
select	blkno, total_cnt, final_cnt, rows_remaining,
	case when total_cnt = final_cnt then ceil(final_cnt/:b1) else ceil(final_cnt/:b1)+1 end touch_cnt
from	(
select	blkno, cnt total_cnt, 
	case when rownum=1 or lag(rows_remaining) over (order by blkno)=0 
                     then cnt else (cnt-(:b1-lag(rows_remaining) over (order by blkno))) end final_cnt,
	rows_remaining 
from (
select blkno, cnt, rr, 
lead(rr) over(order by blkno) next_rr,
lead(blkno) over(order by blkno) next_blk,
ceil(rr/:b1) touch_cnt,
mod(rr,:b1) rows_remaining
from (
select dbms_rowid.ROWID_BLOCK_NUMBER(rowid) blkno, count(*) cnt, 
	sum(count(*)) over(order by dbms_rowid.ROWID_BLOCK_NUMBER(rowid)) rr 
from test_cg
group by dbms_rowid.ROWID_BLOCK_NUMBER(rowid) order by 1))));

     BLKNO  TOTAL_CNT  FINAL_CNT ROWS_REMAINING  TOUCH_CNT
---------- ---------- ---------- -------------- ----------
       641         88         88             13          6
       642         84         82              7          7
       643         81         73             13          6
       644         76         74             14          6
       645         81         80              5          7
       646         80         70             10          6
       647         82         77              2          7
       648         77         64              4          6
       649         73         62              2          6
       650         79         66              6          6
       651         79         70             10          6
       652         79         74             14          6
       653         81         80              5          7
       654         82         72             12          6
       655         77         74             14          6
       656         81         80              5          7
       657         80         70             10          6
       658         81         76              1          7
       659         78         64              4          6
       660         78         67              7          6
       661         76         68              8          6
       662         78         71             11          6
       663         78         74             14          6
       664         77         76              1          7
       665         81         67              7          6
       666         79         71             11          6
       667         79         75              0          7
       668         77         77              2          6
       669         77         64              4          6
       670         76         65              5          6
       671         80         70             10          6
       672         77         72             12          6
       673         76         73             13          6
       674         75         73             13          6
       675         79         77              2          7
       676         78         65              5          6
       677         78         68              8          6
       678         82         75              0          7
           ---------- ---------- -------------- ----------
sum              3000       2744            284        238

BLKNO		: Block Number
TOTAL_CNT	: Total Rows in the Block
FINAL_CNT	: Final number of rows charged to this block
		(Example: Block 642 has 84 rows in total, but its final count is 82, as 13 rows were read from block 641 and the
		remaining 2 rows of that array fetch came from this block, giving 84-2=82)
ROWS_REMAINING	: Incomplete Arraysize Rows from the current block
TOUCH_CNT	: Touch Count for current block and is our Consistent Gets per Block. Final total at the end.
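The same lag/mod arithmetic from the SQL above can be mirrored in Python (a sketch; the row counts fed in are the first five blocks from the listing):

```python
from math import ceil

def touch_counts(rows_per_block, arraysize=15):
    """Touches (consistent gets) per block: rows that merely complete the
    previous array fetch are charged to the previous block's get."""
    touches, running, prev_rem = [], 0, None
    for cnt in rows_per_block:
        # FINAL_CNT: rows of this block that are not part of the prior fetch
        final = cnt if prev_rem in (None, 0) else cnt - (arraysize - prev_rem)
        t = ceil(final / arraysize) + (0 if final == cnt else 1)
        running += cnt
        prev_rem = running % arraysize   # ROWS_REMAINING
        if prev_rem == 0:                # fetch boundary aligns with block end
            t += 1
        touches.append(t)
    return touches

print(touch_counts([88, 84, 81, 76, 81]))   # [6, 7, 6, 6, 7], as in the table
```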

Consistent gets are not a measure of the number of blocks read into the cache, but of the number of times a block was read into the PGA. A block is read only once into the cache and touched multiple times. Our example above showed that with an arraysize of 15, an 8k block was read once but touched 6-7 times, and hence had 6 or 7 consistent gets per block. This does not mean 6*8192 bytes worth of data; the data was read only once.
