Optimizer – Part II (Cardinality – Actuals v/s Assumed)

This is a continuation of my previous post, Optimizer – Part I of this series. In Part I, we covered the mathematical formulas used by the Optimizer. In this post, we shall see these calculations in action. For this, we will create a sample table and use it throughout to observe optimizer behaviour. So, let’s create our table TEST_SELECTIVITY from the SALES table in the SH schema. It is critical to know your data; therefore, while creating the table, I have manipulated the data to demonstrate the behaviour against different data distributions.

exec dbms_random.seed(0);

create table test_selectivity as
select 	a.prod_id, 
	a.cust_id,
        trunc(sysdate)-round(dbms_random.value(0,1095),0) time_id,
        a.promo_id, 
        a.quantity_sold,
        round(a.amount_sold,0) amount_sold
from 	sh.sales a
where 	rownum<=8e5;

The table has 800k rows. The columns of interest for our demonstrations are TIME_ID, which is populated with 3 years of data, PROMO_ID and AMOUNT_SOLD. Once the table is created, optimizer statistics are gathered automatically by the CTAS itself (online statistics gathering, introduced in Oracle 12c). Let’s query all the relevant statistics.
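
If you are on an older release, or want to refresh the statistics at any point, they can be gathered manually:

exec dbms_stats.gather_table_stats(user,'TEST_SELECTIVITY');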

select table_name, num_rows, blocks from dba_tables
where table_name='TEST_SELECTIVITY';

TABLE_NAME                       NUM_ROWS     BLOCKS
------------------------------ ---------- ----------
TEST_SELECTIVITY                   800000       3478

select 	column_name, 
	num_distinct, 
	num_nulls, 
	density,
	histogram
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY'
order by 1;

COLUMN_NAME        NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------ ------------ ---------- ---------- --------------------
AMOUNT_SOLD                 636          0 .001572327 NONE
CUST_ID                    7056          0 .000141723 NONE
PROD_ID                      72          0 .013888889 NONE
PROMO_ID                      4          0        .25 NONE
QUANTITY_SOLD                 1          0          1 NONE
TIME_ID                    1096          0 .000912409 NONE

6 rows selected.

As mentioned earlier, for our demonstration, we will query the table on three columns. AMOUNT_SOLD has 636 distinct values, PROMO_ID has 4 distinct values and TIME_ID has 1096 distinct values. In my previous post (Part I), we discussed SELECTIVITY, which in this case is 1/NDV for each of the columns in the table. Selectivity is critical, as it drives the Access Path and is used to calculate the Cardinality, which drives the Access Order. Therefore, accurate calculation of Selectivity is vital for the Optimizer.
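
Since, with no histogram, equality selectivity is simply 1/NUM_DISTINCT, the expected equality cardinality for every column can be computed straight from the dictionary. A minimal sketch (800000 is our known row count):

select	column_name,
	round((1/num_distinct)*800000) expected_rows
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY';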

Now, let us run our queries against each of these three columns and check the Optimizer’s calculation of Expected Rows against the Actual Rows. The queries will use EQUALITY, LESS THAN and GREATER THAN predicates. Please refer to my previous post for the calculation of SELECTIVITY for each of these predicate types. The effective CARDINALITY = SELECTIVITY x NUM_ROWS. Here we go with the first column (TIME_ID).

First, let’s check the Low_Value and High_Value for the TIME_ID column. These values are used for Range Predicate queries to calculate the Available Range (High_Value – Low_Value).

with function get_date(n_raw in raw) return date
as
	l_date        date;
begin
	dbms_stats.convert_raw_value(n_raw,l_date);
	return l_date;
end;
select	column_name,
	get_date(low_value) lo_value,
	get_date(high_value) hi_value
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY'
and	data_type='DATE'
order by 1
/

COLUMN_NAME        LO_VALUE             HI_VALUE
------------------ -------------------- --------------------
TIME_ID            13-AUG-2013 00:00:00 12-AUG-2016 00:00:00

A function in the WITH clause is a new feature in 12c. For Oracle Database versions prior to 12c, create the function with a CREATE FUNCTION statement and then use it in the query.
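
For reference, a minimal pre-12c sketch of the same approach, with the function body unchanged:

create or replace function get_date(n_raw in raw) return date
as
	l_date        date;
begin
	dbms_stats.convert_raw_value(n_raw,l_date);
	return l_date;
end;
/

select	column_name,
	get_date(low_value) lo_value,
	get_date(high_value) hi_value
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY'
and	data_type='DATE'
order by 1;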

For the Equality Predicate, SELECTIVITY is 1/Num_Distinct and CARDINALITY = SELECTIVITY x NUM_ROWS. After calculating these, we will run the query against the table to validate the actual number of rows.

## EQUALITY PREDICATE
SQL> select 1/&&ndv Selectivity from dual;
old   1: select 1/&&ndv Selectivity from dual
new   1: select 1/      1096 Selectivity from dual

SELECTIVITY
-----------
 .000912409

SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(.000912409*800000,0) cardinality from dual

CARDINALITY
-----------
        730

SQL> set autot trace
SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where time_id=to_date('11-DEC-2015','DD-MON-YYYY');

704 rows selected.

Elapsed: 00:00:00.99

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   730 | 15330 |  1088  (20)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   730 | 15330 |  1088  (20)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID"=TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


SQL> set autot off

Assumption – 730 Rows and Actual – 704 Rows (Nearly Accurate).

Next, we run a query with a Less Than predicate. The SELECTIVITY in this case will be Required_Range/Available_Range (see Part I for the exact formula – Required_Range is computed as Required_Date – Low_Value). CARDINALITY is again SELECTIVITY x NUM_ROWS.
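
The whole calculation can also be collapsed into one query; a sketch using the Low_Value and High_Value reported above, which should agree with the step-by-step numbers that follow:

select	round( (to_date('11-DEC-2015','DD-MON-YYYY') - to_date('13-AUG-2013','DD-MON-YYYY'))
	     / (to_date('12-AUG-2016','DD-MON-YYYY') - to_date('13-AUG-2013','DD-MON-YYYY'))
	     * 800000 ) expected_cardinality
from	dual;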

## LESS THAN PREDICATE

## Required_Range - 850 Days
SQL> select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') req_range
  2  from       dual;
old   1: select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') req_range
new   1: select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') req_range

 REQ_RANGE
----------
       850

## Available_Range - 1095 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
new   1: select to_date('12-AUG-2016 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') avl_range

 AVL_RANGE
----------
      1095


## Selectivity - Required_Range/Available_Range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select        850/      1095 Selectivity from dual

SELECTIVITY
-----------
 .776255708

## Assumed Cardinality
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(.776255708*800000,0) cardinality from dual

CARDINALITY
-----------
     621005


SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where time_id<to_date('11-DEC-2015','DD-MON-YYYY');

620764 rows selected.

Elapsed: 00:00:06.06

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   621K|    12M|  1125  (23)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   621K|    12M|  1125  (23)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID"<TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Assumption – 621k Rows and Actual – 620k Rows (Nearly Accurate).

Next, the query with a Greater Than predicate; the SELECTIVITY will again be Required_Range/Available_Range. The difference, in this case, is that the High_Value is used to calculate the Required_Range.
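
And the one-query version for this predicate (a sketch; the dates come from the column statistics above):

select	round( (to_date('12-AUG-2016','DD-MON-YYYY') - to_date('11-DEC-2015','DD-MON-YYYY'))
	     / (to_date('12-AUG-2016','DD-MON-YYYY') - to_date('13-AUG-2013','DD-MON-YYYY'))
	     * 800000 ) expected_cardinality
from	dual;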

## GREATER THAN PREDICATE

## Required_Range - 245 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range
new   1: select to_date('12-AUG-2016 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range

 REQ_RANGE
----------
       245

## Available_Range - 1095 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
new   1: select to_date('12-AUG-2016 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') avl_range

 AVL_RANGE
----------
      1095

## Selectivity - Required_Range/Available_Range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select        245/      1095 Selectivity from dual

SELECTIVITY
-----------
 .223744292

## Assumed Cardinality 
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(.223744292*800000,0) cardinality from dual

CARDINALITY
-----------
     178995

SQL> set autot trace
SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where time_id>to_date('11-DEC-2015','DD-MON-YYYY');

178532 rows selected.

Elapsed: 00:00:02.01

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   178K|  3670K|  1099  (21)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   178K|  3670K|  1099  (21)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID">TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Assumption – 178k Rows and Actual – 178k Rows (Accurate).

For the TIME_ID column, the expected and actual cardinalities were nearly accurate. Let’s shift our focus to the next column – AMOUNT_SOLD. We will run the same three queries – Equality, Less Than and Greater Than.

Before we execute the queries against this column, let’s check the statistics (Density, Low_Value and High_Value).

## FOR AMOUNT_SOLD COLUMN

select	column_name, 
		num_distinct, 
		num_nulls,
		density, 
		histogram
from	dba_tab_columns
where	owner='SCOTT'
and		table_name='TEST_SELECTIVITY'
and		column_name='AMOUNT_SOLD';

COLUMN_NAME        NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------ ------------ ---------- ---------- --------------------
AMOUNT_SOLD                 636          0 .001572327 NONE

with function get_number(n_raw in raw) return number
as
	l_number        number;
begin
	dbms_stats.convert_raw_value(n_raw,l_number);
	return l_number;
end;
select	column_name,
	get_number(low_value) lo_value,
	get_number(high_value) hi_value
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY'
and	column_name='AMOUNT_SOLD'
/

COLUMN_NAME          LO_VALUE   HI_VALUE
------------------ ---------- ----------
AMOUNT_SOLD                 6       1783

All the calculations are the same for this column as well.

## EQUALITY PREDICATE

SQL> select 1/&&ndv Selectivity from dual;
old   1: select 1/&&ndv Selectivity from dual
new   1: select 1/       636 Selectivity from dual

SELECTIVITY
-----------
 .001572327

SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(.001572327*800000,0) cardinality from dual

CARDINALITY
-----------
       1258

SQL> set autot trace
SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where amount_sold=1500;

122 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |  1258 | 16354 |  1136  (24)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |  1258 | 16354 |  1136  (24)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AMOUNT_SOLD"=1500)

Assumption – 1258 Rows and Actual – 122 Rows (Out by 10 times).

## LESS THAN PREDICATE

## Required_Range
SQL> select (1500-&&min_value) req_range from dual;
old   1: select (1500-&&min_value) req_range from dual
new   1: select (1500-         6) req_range from dual

 REQ_RANGE
----------
      1494

## Available_Range
SQL> select (&&max_value-&&min_value) avl_range from dual;
old   1: select (&&max_value-&&min_value) avl_range from dual
new   1: select (      1783-         6) avl_range from dual

 AVL_RANGE
----------
      1777


## Selectivity - Required_Range/Available_Range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select       1494/      1777 Selectivity from dual

SELECTIVITY
-----------
 .840742825

## Assumed Cardinality
SQL> select round(&&selective*800000,0) Cardinality from dual;
old   1: select round(&&selective*800000,0) Cardinality from dual
new   1: select round(.840742825*800000,0) Cardinality from dual

CARDINALITY
-----------
     672594

SQL> set autot trace
SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where amount_sold<1500;

791950 rows selected.

Elapsed: 00:00:07.57

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   672K|  8538K|  1136  (24)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   672K|  8538K|  1136  (24)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AMOUNT_SOLD"<1500)

Assumption – 672k Rows and Actual – 791k Rows (Inaccurate).

## GREATER THAN PREDICATE


## Required_Range
SQL> select (&&max_value-1500) req_range from dual;
old   1: select (&&max_value-1500) req_range from dual
new   1: select (      1783-1500) req_range from dual

 REQ_RANGE
----------
       283

## Available_range
SQL> select (&&max_value-&&min_value) avl_range from dual;
old   1: select (&&max_value-&&min_value) avl_range from dual
new   1: select (      1783-         6) avl_range from dual

 AVL_RANGE
----------
      1777

## Selectivity - Required_Range/Available_Range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select        283/      1777 Selectivity from dual

SELECTIVITY
-----------
 .159257175

## Assumed Cardinality
SQL> select round(&&selective*800000,0) Cardinality from dual;
old   1: select round(&&selective*800000,0) Cardinality from dual
new   1: select round(.159257175*800000,0) Cardinality from dual

CARDINALITY
-----------
     127406

SQL> select cust_id, amount_sold, promo_id from test_selectivity
  2  where amount_sold>1500;

7928 rows selected.

Elapsed: 00:00:00.12

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   127K|  1617K|  1136  (24)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   127K|  1617K|  1136  (24)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AMOUNT_SOLD">1500)

Assumption – 127k Rows and Actual – 7928 Rows (Significantly Out).

Finally, we move to our last column of interest, i.e. PROMO_ID. The column has 4 distinct values and the data distribution is as under:

## FOR PROMO_ID Column

select	promo_id, 
	count(*) cnt,
	round(ratio_to_report(count(*)) over()*100,2) "%age" 
from	test_selectivity
group by promo_id
order by 2;

  PROMO_ID        CNT       %age
---------- ---------- ----------
        33       2074        .26
       351       2245        .28
       350      17978       2.25
       999     777703      97.21

select	column_name, 
	num_distinct, 
	num_nulls, 
	density,
	histogram
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY'
and	column_name='PROMO_ID'
order by 1;

COLUMN_NAME        NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------ ------------ ---------- ---------- --------------------
PROMO_ID                      4          0        .25 NONE

For this column, we will execute 4 queries, each with an EQUALITY predicate. For Equality Predicates, the calculation of SELECTIVITY is simple: 1/NDV, or DENSITY. From DBA_TAB_COLUMNS, we can see that the DENSITY for this column is 0.25 (1/4).
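
The uniform-distribution assumption can be lined up against the actual skew in one query; a small sketch (0.25 is the DENSITY above):

select	promo_id,
	count(*) actual_rows,
	round(0.25*800000) assumed_rows
from	test_selectivity
group by promo_id
order by 2;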

## FOR PROMO_ID

## Selectivity
SQL> select 1/&&ndv selectivity from dual;
old   1: select 1/&&ndv selectivity from dual
new   1: select 1/         4 selectivity from dual

SELECTIVITY
-----------
        .25

## Assumed Cardinality
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(       .25*800000,0) cardinality from dual

CARDINALITY
-----------
     200000

## VALUE 999
SQL> select cust_id, amount_sold, promo_id from test_selectivity where promo_id=999;

777703 rows selected.

Elapsed: 00:00:06.89

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   200K|  2539K|  1112  (22)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   200K|  2539K|  1112  (22)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROMO_ID"=999)

Assumption – 200k Rows and Actual – 777k Rows (Out by 4 times).

## Value 350
SQL> select cust_id, amount_sold, promo_id from test_selectivity where promo_id=350;

17978 rows selected.

Elapsed: 00:00:00.27

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   200K|  2539K|  1112  (22)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   200K|  2539K|  1112  (22)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROMO_ID"=350)

Assumption – 200k Rows and Actual – 17978 Rows (Significantly Out).

## VALUE 33
SQL> select cust_id, amount_sold, promo_id from test_selectivity where promo_id=33;

2074 rows selected.

Elapsed: 00:00:00.22

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   200K|  2539K|  1112  (22)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   200K|  2539K|  1112  (22)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROMO_ID"=33)

Assumption – 200k Rows and Actual – 2074 Rows (Significantly Out).

## VALUE 351
SQL> select cust_id, amount_sold, promo_id from test_selectivity where promo_id=351;

2245 rows selected.

Elapsed: 00:00:00.11

Execution Plan
----------------------------------------------------------
Plan hash value: 4083831454

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   200K|  2539K|  1112  (22)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY |   200K|  2539K|  1112  (22)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("PROMO_ID"=351)

Assumption – 200k Rows and Actual – 2245 Rows (Significantly Out).

Summary – So far

  • TIME_ID – Assumptions v/s Actuals ✓
  • AMOUNT_SOLD – Assumptions v/s Actuals ✗
  • PROMO_ID – Assumptions v/s Actuals ✗

Let’s make a simple change to the TIME_ID column. For this, I will create another table, a replica of TEST_SELECTIVITY, and make the change there, so that we do not disturb the original table.

create table test_selectivity_m as
select * from test_selectivity;

update test_selectivity_m set time_id=to_date('31-Dec-2050','DD-MON-YYYY')
where rownum=1;

commit;

exec dbms_stats.gather_table_stats(user,'TEST_SELECTIVITY_M');

select table_name, num_rows, blocks, partitioned from dba_tables
where table_name in ('TEST_SELECTIVITY','TEST_SELECTIVITY_M');

TABLE_NAME                       NUM_ROWS     BLOCKS PAR
------------------------------ ---------- ---------- ---
TEST_SELECTIVITY_M                 800000       3478 NO
TEST_SELECTIVITY                   800000       3478 NO

I created another table, TEST_SELECTIVITY_M, and updated a single row with a future date, i.e. 31st December 2050. Let’s see whether this minor change has any impact on the Optimizer’s Assumptions v/s Actuals.

with function get_date(n_raw in raw) return date
as
	l_date        date;
begin
	dbms_stats.convert_raw_value(n_raw,l_date);
	return l_date;
end;
select	column_name,
	num_distinct,
	num_nulls,
	density,
	histogram,
	get_date(low_value) lo_value,
	get_date(high_value) hi_value
from	dba_tab_columns
where	owner='SCOTT'
and	table_name='TEST_SELECTIVITY_M'
and	data_type='DATE'
order by 1
/
COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM            LO_VALUE             HI_VALUE
------------------------------ ------------ ---------- ---------- -------------------- -------------------- --------------------
TIME_ID                                1097          0 .000911577 NONE                 13-AUG-2013 00:00:00 31-DEC-2050 00:00:00

Now, let’s run the three queries on the TIME_ID column against this table and see the results.

## EQUALITY PREDICATE

## Selectivity
SQL> select 1/&&ndv Selectivity from dual;
old   1: select 1/&&ndv Selectivity from dual
new   1: select 1/      1097 Selectivity from dual

SELECTIVITY
-----------
 .000911577

## Assumed Cardinality
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round(.000911577*800000,0) cardinality from dual

CARDINALITY
-----------
        729

SQL> set autot trace
SQL> select cust_id, promo_id, amount_sold from test_selectivity_m
  2  where time_id=to_date('11-DEC-2015','DD-MON-YYYY');

704 rows selected.

Elapsed: 00:00:00.57

Execution Plan
----------------------------------------------------------
Plan hash value: 3843949181

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |   729 | 15309 |  1088  (20)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY_M |   729 | 15309 |  1088  (20)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID"=TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Assumption – 729 Rows and Actual – 704 Rows (Nearly Accurate).

## LESS THAN PREDICATE

## Required_Range - 850 Days
SQL> select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') req_range
  2  from       dual;
old   1: select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') req_range
new   1: select to_date('11-DEC-2015','DD-MON-YYYY')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') req_range

 REQ_RANGE
----------
       850

## Available_Range - 13654 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
new   1: select to_date('31-DEC-2050 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') avl_range

 AVL_RANGE
----------
     13654

## Selectivity - Required_Range/Available_Range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select        850/     13654 Selectivity from dual

SELECTIVITY
-----------
  .06225282

## Assumed Cardinality
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round( .06225282*800000,0) cardinality from dual

CARDINALITY
-----------
      49802

SQL> select cust_id, promo_id, amount_sold from test_selectivity_m
  2  where time_id<to_date('11-DEC-2015','DD-MON-YYYY');

620764 rows selected.

Elapsed: 00:00:06.41

Execution Plan
----------------------------------------------------------
Plan hash value: 3843949181

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    | 49802 |  1021K|  1091  (21)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY_M | 49802 |  1021K|  1091  (21)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID"<TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Assumption – 49k Rows and Actual – 620k Rows (Significantly Out).

## GREATER THAN PREDICATE

## Required_Range - 12804 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range
new   1: select to_date('31-DEC-2050 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('11-DEC-2015','DD-MON-YYYY') req_range

 REQ_RANGE
----------
     12804

## Available_range - 13654 Days
SQL> select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
  2  from       dual;
old   1: select to_date('&&max_value','DD-MON-YYYY HH24:MI:SS')-to_date('&&min_value','DD-MON-YYYY HH24:MI:SS') avl_range
new   1: select to_date('31-DEC-2050 00:00:00','DD-MON-YYYY HH24:MI:SS')-to_date('13-AUG-2013 00:00:00','DD-MON-YYYY HH24:MI:SS') avl_range

 AVL_RANGE
----------
     13654

## Selectivity - Required_range/Available_range
SQL> select &&r_range/&&a_range Selectivity from dual;
old   1: select &&r_range/&&a_range Selectivity from dual
new   1: select      12804/     13654 Selectivity from dual

SELECTIVITY
-----------
  .93774718

## Assumed Cardinality
SQL> select round(&&selective*800000,0) cardinality from dual;
old   1: select round(&&selective*800000,0) cardinality from dual
new   1: select round( .93774718*800000,0) cardinality from dual

CARDINALITY
-----------
     750198

SQL> select cust_id, promo_id, amount_sold from test_selectivity_m
  2  where time_id>to_date('11-DEC-2015','DD-MON-YYYY');

178532 rows selected.

Elapsed: 00:00:02.15

Execution Plan
----------------------------------------------------------
Plan hash value: 3843949181

----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |   750K|    15M|  1133  (24)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| TEST_SELECTIVITY_M |   750K|    15M|  1133  (24)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TIME_ID">TO_DATE(' 2015-12-11 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))       

Assumption – 750k Rows and Actual – 178k Rows (Out by 4 Times).

Final Summary:

  • TIME_ID – Assumptions v/s Actuals ✓
  • TIME_ID (minor change) – Assumptions v/s Actuals ✗
  • AMOUNT_SOLD – Assumptions v/s Actuals ✗
  • PROMO_ID – Assumptions v/s Actuals ✗

A minor change in the data changed our summary for the TIME_ID column. In the case of our TEST_SELECTIVITY table, the Optimizer estimates for TIME_ID were nearly accurate, whereas for the other two columns they were way out. What could be the reason? Remember, accurate Selectivity and Cardinality are critical, as they impact the Access Path and the Access Order respectively. Any discrepancy can cause a sub-optimal plan. The Optimizer is a piece of code and depends entirely on the statistics that we gather and provide to it as input. Therefore, the solution to these discrepancies has to come from us. In the next post, Part III of this series, we will cover the problem and the solution to fix this discrepancy; that fix will itself cause problems in a few cases, and we shall cover those as well.

Optimizer – Part I

Just concluded a full day event on Performance in Chandigarh for the North India Chapter of the All India Oracle User Group. As committed in my earlier user group update post, I thought that, for the benefit of the readers, posting the technical details on the Optimizer, especially histograms, would help. Another reason this is important is that these sessions are attended by the developer community as well, and grasping everything in one session is very difficult. This writeup will help them understand this critical piece.

I usually blog on real life challenges. The motivation behind this blog as well is an issue that I was working on and the fix that I applied to resolve it. This will be discussed at a relevant time and in a relevant part of this series.

I will publish a 4-part series, starting with the basics of the Optimizer and its formulas, followed by a few examples of its various calculations. The parts are:

  1. Optimizer Basics & Calculations
  2. Cardinality – Actuals v/s Assumed
  3. Histograms – Frequency & Height Balanced
  4. 12c Enhancements – Hybrid and TopN Frequency Histograms

In the context of the Optimizer, the two terminologies commonly used are SELECTIVITY and CARDINALITY.

SELECTIVITY: It is measured as the percentage of rows that would be returned from, or filtered out of, a row set. Thus the Selectivity of a predicate indicates how many rows pass a predicate test. Selectivity ranges from 0.0 to 1.0. A Selectivity of 0.0 means no rows are selected from a row set, whereas a Selectivity of 1.0 means all the rows are selected. A predicate becomes more selective as the value approaches 0.0 and less selective as it approaches 1.0. Selectivity drives the Access Path, for example, a Table Scan or an Index Scan. With no histograms, the Selectivity for a column is computed as 1/NUM_DISTINCT, or DENSITY.

CARDINALITY: Cardinality is the estimated number of rows returned by each operation in an Execution Plan. The Optimizer determines the cardinality of each operation based on a complex set of formulas that use both table and column level statistics, or dynamic statistics. Cardinality estimates must be as accurate as possible because they influence all aspects of an execution plan. Cardinality is important when the Optimizer determines the cost of a Join. For example, in a Nested Loop Join between the EMP and DEPT tables, the number of rows returned by EMP determines how often DEPT will be probed. Cardinality drives the Access Order.

Just to simplify, for a Gender column with M & F (2 Distinct Values), the selectivity will be 1/2 or 0.5. If this table has around 100 rows, then the Cardinality will be Selectivity X Num_Rows, which is 0.5 x 100 = 50.

For a 100-row table with 2 columns, having 4 and 2 distinct values respectively, the combined selectivity (for an AND predicate) will be 0.25 x 0.5 = 0.125 and the Cardinality will be 0.125 x 100 = 12.5, rounded off to 13.

Selectivity Calculation

Assume Column C
NDV is the Number of Distinct Values
minv is the Minimum Value for C
maxv is the Maximum Value for C

The formulas for the Selectivity calculation are as under:

  • SEL(=C) = 1/NDV or DENSITY
  • SEL(<C) = (C-minv)/(maxv-minv)
  • SEL(<=C) = SEL(=C) + SEL(<C)
  • SEL(>C) = (maxv-C)/(maxv-minv)
  • SEL(>=C) = SEL(=C) + SEL(>C)

In the case of a Range Predicate (<, <=, >, >=), the numerator is called the Required Range and the denominator the Available Range.

Once the Selectivity is derived, the Cardinality is the Selectivity multiplied by the number of rows. When multiple predicates are involved, the Selectivity of each is derived from the above formulas and then combined based on the AND or OR predicates (see the sketch after this list). For example:

  • WHERE A=:b1 and B=:b2 = SEL(=A) X SEL(=B)
  • WHERE A=:b1 and B>=:b2 = SEL(=A) X (SEL(=B) + SEL(>B))
  • WHERE A=:b1 or B=:b2 = SEL(=A) + SEL(=B) – (SEL(=A) X SEL(=B))
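
A quick sketch verifying the AND and OR combinations against the 100-row example above (column NDVs 4 and 2):

select	(1/4)*(1/2) and_sel,
	round((1/4)*(1/2)*100) and_card,
	(1/4)+(1/2)-(1/4)*(1/2) or_sel,
	round(((1/4)+(1/2)-(1/4)*(1/2))*100) or_card
from	dual;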

This was the first part of this series. In the next part, we will create a sample table and run through each of these formulas.

OTNYathra 2016

Presented at Chennai after a gap of almost 5 years. It was a good crowd. Now it is the turn of the Mumbai event on 30th April 2016. I plan to blog a series of Optimizer fundamentals, from the basics through to Histograms, for my readers. I will start this series before my Mumbai event. This will help the attendees understand the concepts well.

In-Memory Store – Push Down Optimization

Last week, I concluded my first ever event in Gurgaon, a full day event on Performance Tuning. During the session on Query Transformation, I mentioned “Testing rather than Believing”. The rationale behind this is that it is not always guaranteed that a published test case and transformation will reproduce in your environment. It is not that the transformation or the test cases are wrong; it is just that environmental differences can cause this. In that case, you would work out what makes the transformation happen, and the testing will give you good enough knowledge of the various transformations or database features/options. My current post is on one such database feature: In-Memory Store – Push Down Optimization.

The In-Memory Store was introduced in 12.1.0.2 and is intended to read the required data from a new memory area, which is also part of the SGA. This memory is sized by way of the INMEMORY_SIZE database parameter. The In-Memory Store caches data in a columnar format, as against the row format used by the traditional Buffer Cache. There are many benefits to this. The Buffer Cache is still used by OLTP applications, whereas queries processing large volumes of data over few columns can benefit from the In-Memory Store. The biggest advantage is that both can be implemented on a single database, requiring no complex data transformation.
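
A minimal sketch of enabling the store and marking a table for population (the size and the table name here are illustrative):

alter system set inmemory_size=200M scope=spfile;

-- after a restart:
alter table lineorder inmemory memcompress for query low priority critical;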

While working on some of the features of In-Memory, I came across the Push-Down Optimization. This optimization pushes predicates, aggregations and group-bys down to the access layer, i.e. they are evaluated at the time of scanning the column or group of columns, returning just a small subset of data to the query layer. The number of rows returned to the query layer depends on the number of In-Memory Compression Units (IMCUs). Thus, the amount of data to be processed by the query layer is reduced drastically, making the queries much more efficient and faster. At this point, I will direct you to the In-Memory blog, which is maintained by the Oracle development folks. The explanation there is self-explanatory, so I will not reproduce it here; I don’t want to make a copy of a well-written blog :). The idea behind this post is to let the readers know of the issue that I faced while testing this.

As mentioned earlier, I test whatever is published, as this gives me a better understanding of the feature. While testing this, however, I could not reproduce the results mentioned in the blog post.

For this, I created my own LINEORDER table and executed the queries mentioned in the blog. The results are as under:

## STATS table to hold the Statistics from V$MYSTAT

create global temporary table stats on commit delete rows as
select s.sid, n.name, s.value
from   v$mystat s, v$statname n
where  s.statistic#=n.statistic#
and    1=2;

  CREATE TABLE LINEORDER
   (    LO_ORDERID NUMBER,
        LO_PRODUCTNAME VARCHAR2(128) NOT NULL ENABLE,
        LO_SHIPQTY NUMBER,
        LO_ORDERVALUE NUMBER,
        LO_SHIPMODE VARCHAR2(4)
   ) 
  PCTFREE 1
  TABLESPACE USERS
  INMEMORY PRIORITY CRITICAL MEMCOMPRESS FOR QUERY LOW
  DISTRIBUTE AUTO NO DUPLICATE
  NO INMEMORY (LO_ORDERID)
  NO INMEMORY (LO_PRODUCTNAME);

insert into lineorder
select rownum lo_orderid, a.object_name lo_productname,
       round(dbms_random.value(1,100),0) lo_shipqty,
       round(dbms_random.value(1000,10000),2) lo_ordervalue,
        case when mod(rownum,3)=0 then 'AIR'
            when mod(rownum,7)=0 then 'SHIP'
            when mod(rownum,10)=0 then 'RAIL'
        else 'ROAD' end lo_shipmode
from    all_objects a, all_objects b
where rownum<=1e+7;

commit;
exec dbms_stats.gather_table_stats(user,'LINEORDER');

SQL> select num_rows, blocks, inmemory from dba_tables where table_name='LINEORDER';

  NUM_ROWS     BLOCKS INMEMORY
---------- ---------- --------
  10000000      41717 ENABLED

SQL> select segment_name, bytes, inmemory_size, bytes_not_populated, populate_status
  2  from v$im_segments;
SEGMENT_NAME                        BYTES INMEMORY_SIZE BYTES_NOT_POPULATED POPULATE_
------------------------------ ---------- ------------- ------------------- ---------
LINEORDER                       343932928     133562368                   0 COMPLETED

On my laptop, I have 4 GB of memory and therefore had to restrict my SGA size. The table size is around 343MB. I sized my In-Memory Store at 200MB and therefore created the table with PCTFREE 1, so as to keep the table as small as possible. With the default PCTFREE, the entire table would not populate into the store.

Now, let’s start our test, execute the queries as per the blog, and validate the results.

SQL> insert into stats
select s.sid, n.name, s.value
from   v$mystat s, v$statname n
where  s.statistic#=n.statistic#
and    n.name like 'IM%';

211 rows created.

SQL> select /*+ VIVEK_IMCU */ lo_shipmode, count(*) from lineorder
group by lo_shipmode;

LO_S   COUNT(*)
---- ----------
RAIL     571429
SHIP     952381
ROAD    5142857
AIR     3333333

SQL> select a.name, m.value - a.value value
from    v$mystat m, stats a, v$statname b
where a.name = b.name
and   m.statistic#=b.statistic#
and   m.value - a.value >0
order by 1;
  
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
IM scan CUs columns accessed                                             21
IM scan CUs columns theoretical max                                     105
IM scan CUs memcompress for query low                                    21
IM scan CUs no cleanout                                                  21
IM scan CUs split pieces                                                 23
IM scan bytes in-memory                                           116692649
IM scan bytes uncompressed                                        126565381
IM scan rows                                                       10000000
IM scan rows projected                                             10000000
IM scan rows valid                                                 10000000

As per the blog, the value of the “IM scan rows projected” statistic should have been 84 (in my case). The statistics show that the query accessed 21 Compression Units (“IM scan CUs memcompress for query low”), and I have 4 values for lo_shipmode, i.e. AIR, SHIP, ROAD & RAIL, so that makes 21 x 4 = 84. However, in my testing, “IM scan rows projected” equals the number of rows in the table (10 million). Why this discrepancy? Is my test case wrong?

I executed other queries as well and the results were the same, i.e. I could not reproduce the statistics mentioned in the blog. I was sure that there was some mismatch in the configuration. Usually, for any such issue, customers are advised to be on the latest Bundle Patch; therefore, I applied Bundle Patch 10 as well, but the results were the same. It took a while to investigate the cause.

I reported this to the author of the blog, and they immediately started investigating. For the analysis, I sent the table creation script along with the run-time execution plan. Interestingly, the issue was drilled down to the STATISTICS_LEVEL database parameter. On my database, it was set to ALL. With the default, TYPICAL, I was able to reproduce the results as per the blog.

SQL> show parameter statistics_level

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
statistics_level                     string      ALL

SQL> alter session set statistics_level=typical;

Session altered.

SQL> insert into stats
select s.sid, n.name, s.value
from   v$mystat s, v$statname n
where  s.statistic#=n.statistic#
and    n.name like 'IM%';

211 rows created.

SQL> select /*+ VIVEK_IMCU */ lo_shipmode, count(*) from lineorder
group by lo_shipmode;

LO_S   COUNT(*)
---- ----------
RAIL     571429
SHIP     952381
ROAD    5142857
AIR     3333333

SQL> select a.name, m.value - a.value value
from    v$mystat m, stats a, v$statname b
where a.name = b.name
and   m.statistic#=b.statistic#
and   m.value - a.value >0
order by 1;

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
IM scan CUs columns accessed                                             21
IM scan CUs columns theoretical max                                     105
IM scan CUs memcompress for query low                                    21
IM scan CUs no cleanout                                                  21
IM scan CUs split pieces                                                 23
IM scan bytes in-memory                                           116692649
IM scan bytes uncompressed                                        126565381
IM scan rows                                                       10000000
IM scan rows projected                                                   84
IM scan rows valid                                                 10000000

10 rows selected.

To summarize, STATISTICS_LEVEL=ALL causes a problem for the Push Down Optimization. A bug has been filed for this issue and a resolution/fix is expected soon. Will keep you posted.

The idea behind this post was just to let the readers know the importance of testing before concluding. Each database setup is different. A minor change can cause a change in behaviour, and it is in the readers’ interest to investigate such changes.

Performance Tuning Day – Gurgaon ! My First ever event in Gurgaon

I have presented 2 full day events in Pune, both on Performance Optimization. AIOUG is now replicating the same event in Gurgaon: I will be presenting a full day “Performance Tuning Day” on 17th October 2015. I have been speaking for the user group for almost 8 years now; however, this will be my first ever presentation in the Delhi/Gurgaon region. Looking forward to a great crowd.

Registration Link

Performance Tuning Day Part II – Pune Chapter of All India Oracle User Group

Mark your calendar for Part II of my Performance Tuning Day event at Pune, scheduled for 12th September 2015. The registration link is open. No worries for those who missed my previous session (Part I), as I will begin with a recap of it. See you all on 12th September.

Click for Registration

Performance Tuning Day…All India Oracle User Group

Concluded a 5 hour session on Performance Optimization for the Pune Chapter of the All India Oracle User Group. Due to an unexpected massive traffic jam, I reached the venue late and the sessions therefore ran behind schedule. I had to cancel a session on the Query Optimizer, which was a big disappointment. As a speaker, it really disappoints when you have to rush through slides and/or cancel a session. Sorry, folks. I will check with the organizers about a Part II of this event, where we can cover this interesting topic.

The crowd, as expected, was engaging. They made the whole event interactive with interesting questions. As a speaker, you enjoy it when your participants are deeply involved and raise questions to clarify their doubts; it gives you a sense that the crowd is listening to what you are saying. Thanks to all for lighting up the event.

Last but not least, hats off to the organizers. They worked hard to make this event a grand success. Their meticulous planning is appreciated.

AIOUG Performance Tuning Day – Pune

I am presenting a full day event on 8th August 2015 at Pune. This is a Performance Tuning day that will cover some interesting performance issues and solutions. Looking forward to seeing you all at Pune. Click on the following link to view the schedule and to register.

Performance Tuning Day Link

AIOUG Performance Tuning Day ! Hyderabad…

AIOUG is hosting a full day Performance Tuning event on 30th May 2015 in Hyderabad. I will be presenting on the Oracle Query Optimizer and Performance. The sessions will be a mix of technical demos and real life examples. Hope to see a large gathering. Registrations are open at Performance Tuning Day.

Auto DOP ! 11g Feature…

Recently, during a Performance POC on an Oracle Engineered System, I got to implement the Auto DOP feature for one of the customer’s critical batch jobs. This batch job is a set of multiple processes: some run serially, some are single processes whose queries are hinted to use Oracle Parallelism, and some are parallel application threads.

For AUTO DOP to work, there are two mandatory settings: IO Calibration and PARALLEL_DEGREE_POLICY (which defaults to MANUAL, meaning no AUTO DOP). I started with the AUTO setting, as LIMITED requires changing DEGREE on all (or at least the critical, huge) tables. The AUTO setting means the Optimizer calculates the Degree of Parallelism itself. Further, this setting also enables two features: Parallel Statement Queuing and In-Memory Parallel Execution. Since In-Memory Parallel Execution might disable Exadata Smart Scan, I modified _parallel_cluster_cache_policy to ADAPTIVE, i.e. no In-Memory Parallel Execution. IO Calibration was done so that the optimizer can use the calibration values (as seen in DBA_RSRC_IO_CALIBRATION) to come out with a Degree of Parallelism.
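
A sketch of the two mandatory settings (the disk count and latency passed to CALIBRATE_IO are illustrative and should match your storage):

alter system set parallel_degree_policy=AUTO;

set serveroutput on
declare
	l_max_iops	pls_integer;
	l_max_mbps	pls_integer;
	l_actual_lat	pls_integer;
begin
	dbms_resource_manager.calibrate_io(
		num_physical_disks => 4,
		max_latency        => 10,
		max_iops           => l_max_iops,
		max_mbps           => l_max_mbps,
		actual_latency     => l_actual_lat);
	dbms_output.put_line('max_iops='||l_max_iops||' max_mbps='||l_max_mbps);
end;
/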

My initial test run was not that successful. While investigating, I came across a very informative article by Gwen Shapira. As per this article, AUTO DOP considers only the MAX_PMBPS value, and this can be set manually to 200. In my case, since the calibration process had been executed, the value of MAX_PMBPS was around 3000+. As I understand it, this value depicts the throughput; therefore, the larger the value, the lower the computed DOP. I followed the advice in the article and replaced the rows in resource_io_calibrate$ with a manual value of 200 for MAX_PMBPS.

delete from resource_io_calibrate$;
insert into resource_io_calibrate$ values(current_timestamp, current_timestamp, 0, 0, 200, 0, 0);
commit;

After setting this value manually, the database needs a restart. This change did the trick: the performance of many of the processes improved drastically. For some of the queries, we could also see Statement Queuing working effectively, i.e. waiting for resources to become available and then running with the required DOP, rather than downgrading the parallel degree. However, the customer was a bit apprehensive about the AUTO word and instead decided to implement the optimizer’s computed DOP manually in each of these queries and disable the feature. This change worked as well.
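
A hypothetical example of pinning a computed DOP with a statement-level hint (the table, alias and degree are illustrative):

select /*+ parallel(s, 8) */ count(*)
from   sales s;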

My observations on this nice feature (and I may be wrong, as I did not get much time to investigate it further), to name a few:

  • Any query that the Optimizer estimates will run for more than 10 seconds is subject to Auto DOP.
  • A query with a parallel hint also goes through a serial-execution run-time evaluation. This means that if the optimizer estimates the serial run time to be less than 10 seconds, the parallel hint is ignored.
  • In my opinion, AUTO DOP is one of the best innovations.
