AIOUG ! Sangam 14

The annual AIOUG event “Sangam 14” is scheduled for November 2014. I am presenting this year; the agenda will be published soon on the AIOUG website. Thomas Kyte will be back in India for this event, alongside Maria Colgan from the Oracle Optimizer development team. Syed Jaffar Hussain will also be presenting.

Looking forward to meeting the Oracle community.

SQL Optimization ! Yet another example ……

This is another classic example where developers need to know a better way of writing code. I wrote about one such example in an earlier post. There are many ways of achieving a goal, but very few achieve it efficiently.

I came across a performance issue that triggered acute slowness during peak load. The slowness became evident after an increase in data volume: a query that used to do around 28k logical I/Os per execution was now doing around 48k. The concurrency on this query was so high that in a 30-minute window (as seen from the AWR reports), the total number of logical I/Os, cumulative across all executions, was around 1 billion. In terms of concurrency, I could see that out of 15 active sessions, 10 were executing this query concurrently, consuming high CPU. I will not post the original query, but will demonstrate with an example.

The example uses the CUSTOMERS table in the SH schema. I have created the following indexes:

create index customers_cust_city_idx on customers(cust_city);
create index customers_cust_state_idx on customers(cust_state_province);

The application requirement is to design a screen that lets users retrieve CUST_ID and CUST_LAST_NAME based on a queried CUST_CITY and/or CUST_STATE_PROVINCE. The user has the option to query on both or on either of these, but cannot leave both NULL. To get this done, the developers coded the query shown below. I have also pasted the run-time execution plan.

## With CUST_CITY as NULL
variable b1 varchar2(32);
variable b2 varchar2(32);
exec :b1:='Maharashtra';
exec :b2:=null;

select cust_id, cust_last_name
from 	customers
where	(cust_city=:b2 or :b2 is null)
and	(cust_state_province=:b1 or :b1 is null);

SQL_ID  554u5htwuan4z, child number 0
-------------------------------------
select cust_id, cust_last_name from  customers where (cust_city=:b2 or
:b2 is null) and (cust_state_province=:b1 or :b1 is null)

Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   406 (100)|          |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |   162 |  5508 |   406   (1)| 00:00:05 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(((:B2 IS NULL OR "CUST_CITY"=:B2) AND (:B1 IS NULL OR
              "CUST_STATE_PROVINCE"=:B1)))

## With CUST_STATE_PROVINCE as NULL

exec :b1:=null;
exec :b2:='Mumbai Bombay';

SQL_ID  554u5htwuan4z, child number 0
-------------------------------------
select cust_id, cust_last_name from  customers where (cust_city=:b2 or
:b2 is null) and (cust_state_province=:b1 or :b1 is null)

Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   406 (100)|          |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |   162 |  5508 |   406   (1)| 00:00:05 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(((:B2 IS NULL OR "CUST_CITY"=:B2) AND (:B1 IS NULL OR
              "CUST_STATE_PROVINCE"=:B1)))

## With both NON NULL Values

exec :b1:='Maharashtra';
exec :b2:='Mumbai Bombay';

SQL_ID  554u5htwuan4z, child number 0
-------------------------------------
select cust_id, cust_last_name from  customers where (cust_city=:b2 or
:b2 is null) and (cust_state_province=:b1 or :b1 is null)

Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |       |       |   406 (100)|          |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |   162 |  5508 |   406   (1)| 00:00:05 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(((:B2 IS NULL OR "CUST_CITY"=:B2) AND (:B1 IS NULL OR
              "CUST_STATE_PROVINCE"=:B1)))

For each of these executions, the plan was a full table scan of the CUSTOMERS table. Even when values were provided for both bind variables, the optimizer chose a full scan, and this is because of the way the query is written. A full table scan impacts performance, and as the volume increases, the impact gradually grows with it. One of the better ways of writing this query would be an IF-ELSE construct, as I have demonstrated in my previous posts. There is another way as well. See below:


variable b1 varchar2(32);
variable b2 varchar2(32);
exec :b1:='Maharashtra';
exec :b2:=null;

select cust_id, cust_last_name, cust_city, cust_state_province
from 	customers
where	cust_state_province=:b1
union 
select cust_id, cust_last_name, cust_city, cust_state_province
from 	customers
where	cust_city=:b2
minus /* to eliminate other cust_city's within a province */
select cust_id, cust_last_name, cust_city, cust_state_province
from 	customers
where	cust_city!=:b2
and	cust_state_province=:b1;

SQL_ID  dj24kq6qfxdfn, child number 0
-------------------------------------
select cust_id, cust_last_name, cust_city, cust_state_province from
customers where cust_state_province=:b1 union select cust_id,
cust_last_name, cust_city, cust_state_province from  customers where
cust_city=:b2 minus /* to eliminate other cust_city's within a province
*/ select cust_id, cust_last_name, cust_city, cust_state_province from
customers where cust_city!=:b2 and cust_state_province=:b1

Plan hash value: 3941738280

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                          |       |       |   602 (100)|          |
|   1 |  MINUS                         |                          |       |       |            |          |
|   2 |   SORT UNIQUE                  |                          |   473 | 29070 |   602  (58)| 00:00:08 |
|   3 |    UNION-ALL                   |                          |       |       |            |          |
|   4 |     TABLE ACCESS BY INDEX ROWID| CUSTOMERS                |   383 | 13022 |   257   (0)| 00:00:04 |
|*  5 |      INDEX RANGE SCAN          | CUSTOMERS_CUST_STATE_IDX |   383 |       |     2   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| CUSTOMERS                |    90 |  3060 |    85   (0)| 00:00:02 |
|*  7 |      INDEX RANGE SCAN          | CUSTOMERS_CUST_CITY_IDX  |    90 |       |     1   (0)| 00:00:01 |
|   8 |   SORT UNIQUE                  |                          |   382 | 12988 |   258   (1)| 00:00:04 |
|*  9 |    TABLE ACCESS BY INDEX ROWID | CUSTOMERS                |   382 | 12988 |   257   (0)| 00:00:04 |
|* 10 |     INDEX RANGE SCAN           | CUSTOMERS_CUST_STATE_IDX |   383 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CUST_STATE_PROVINCE"=:B1)
   7 - access("CUST_CITY"=:B2)
   9 - filter("CUST_CITY"<>:B2)
  10 - access("CUST_STATE_PROVINCE"=:B1)

The plan changed from a full table scan to index scans. A better alternative would still be to rewrite the code with an IF-ELSE construct, using three different queries depending on the user's entries for :b1 and :b2, although that means slightly lengthier code.
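The IF-ELSE approach mentioned above can be sketched as follows. This is an illustrative sketch only (the `build_customer_query` helper and the Python wrapper are mine, not the application's code): each combination of supplied binds gets its own statement text, so the optimizer can produce a separate, optimal plan for each variant instead of one compromise plan.

```python
def build_customer_query(city, state):
    """Return the SQL text suited to the binds actually supplied.

    Three distinct statements means three distinct parent cursors,
    each with a plan tailored to the predicates that really apply.
    """
    base = "select cust_id, cust_last_name from customers where "
    if city is not None and state is not None:
        return base + "cust_city = :b2 and cust_state_province = :b1"
    if state is not None:
        return base + "cust_state_province = :b1"
    if city is not None:
        return base + "cust_city = :b2"
    # The screen does not allow both inputs to be NULL.
    raise ValueError("at least one of city/state must be supplied")

# The CUST_CITY-as-NULL case from the demonstration above:
print(build_customer_query(None, "Maharashtra"))
```

With this shape, the state-only case can use CUSTOMERS_CUST_STATE_IDX and the city-only case CUSTOMERS_CUST_CITY_IDX, without the `OR :bN IS NULL` predicates that forced the full scan.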

Authorization Check Failed ! Multiple Child Cursors…..

Recently, I got an opportunity to work on a mutex-related issue causing the database to halt. While the issue started in April 2013, its frequency increased from September 2013. This is a banking customer, and every month they add a few branches into the system. With the branches, the concurrency also goes up, and this increase in concurrency could be one of the reasons for the frequent library cache mutex issue.

In this case, the issue was found to be due to a high number of child cursors. The top 4 queries, ranked by number of child cursors, had around 30k child cursors. From v$sql_shared_cursor, the reason for this many children was AUTH_CHECK_MISMATCH, which stands for “Authorization Check Mismatch”. Further investigation revealed this to be an implementation issue.

The bank has around 580 branches. For scalability, the application team implemented 290 branches on one database and the other 280 on another; these are named BDB1 and BDB2. Each branch has its own schema, so BDB1 and BDB2 hold 290 and 280 schemas respectively. Each schema has its own tables, procedures, functions and so on, and the names of the tables, their columns, etc. are the same across all these schemas. For example, every schema has a table USER_SESSION_LOGIN_DTL with the same number and names of columns; they differ only in the data stored in them. Further, each schema has around 800+ tables. Therefore, when a query executed against branch 1 is also executed against branch 2, the texts are identical, they match in hash value, and they share a common parent cursor. The logic at the application level is that when a user connects, the branch name is returned, based on which he or she connects to the corresponding schema and executes the application-level business transaction. Another user from another branch will open the same module but connect to a different schema. So while the queries are textually the same for every branch, the underlying objects are different, and we see multiple child cursors being populated.

In my view, this is an inefficient database design. It also creates a serious manageability problem: any change in the business logic has to be replicated across all the schemas, and any optimization, such as the creation of a new index, has to be repeated in each schema. The list goes on…
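For contrast, a shared-schema design can be sketched in a few lines (here in Python with SQLite purely for illustration; the `branch_id` discriminator column is my own hypothetical choice, not the bank's design). One table serves every branch, so a single statement text maps to a single shared cursor regardless of which branch runs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One shared table for all branches, discriminated by branch_id,
# instead of one identical table per branch schema.
cur.execute("create table user_session_login_dtl (branch_id int, user_name text)")
cur.executemany("insert into user_session_login_dtl values (?, ?)",
                [(1, "amit"), (1, "neha"), (2, "ravi")])

# The same statement text serves every branch: same objects, so
# no authorization mismatch and no per-branch child cursors.
sql = "select count(*) from user_session_login_dtl where branch_id = ?"
print(cur.execute(sql, (1,)).fetchone()[0])  # rows for branch 1
print(cur.execute(sql, (2,)).fetchone()[0])  # rows for branch 2
```

A schema change or a new index is then applied once, not 580 times.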

This is very easy to demonstrate.

## This pl/sql will create 300 Users with names as F1, F2, F3 and so on....

declare
 l_statement	varchar2(100);
begin
 for i in 1..300
 loop
   begin
     execute immediate 'drop user F'||i||' cascade';
   exception
     when others then null;  -- ignore "user does not exist" on a first run
   end;
   l_statement:='create user F'||i||' identified by F'||i;
   execute immediate l_statement;
   l_statement:='grant dba to F'||i;
   execute immediate l_statement;
 end loop;
end;
/
## Next we create a Table EMP in each of these schemas. In this case, I am inserting same data into these tables.

declare
l_statement	varchar2(100);
begin
 for i in 1..300
 loop
   execute immediate 'alter session set current_schema=F'||i;
   l_statement:='create table emp as select * from scott.emp';
   execute immediate l_statement;
 end loop;
end;
/
## Next, let us execute a query against randomly chosen schemas. These will be 30000 executions with varying bind variable values.

declare
l_emp		number;
l_num		number;
l_sch		number;
l_statement	varchar2(1000);
l_ename		varchar2(32);
begin
 for i in 1..30000
 loop
   l_num:=round(dbms_random.value(1,14),0);
   l_sch:=round(dbms_random.value(1,300),0);
   execute immediate 'alter session set current_schema=F'||l_sch;
   select empno into l_emp from (
     select empno, row_number() over(order by empno) rno from scott.emp)
   where rno=l_num;
   l_statement := 'select /*+ vivek1 */ ename from emp where empno=:b1';
   execute immediate l_statement into l_ename using l_emp;
 end loop;
end;
/

Once the queries are executed, we can query v$sql_shared_cursor to check for the number of child cursors.

SQL> @sql_text
Enter value for 1: select /*+ vivek1
old   2: from v$sqlarea where sql_text like '&1%'
new   2: from v$sqlarea where sql_text like 'select /*+ vivek1%'

SQL_ID        SQL_TEXT                                           BUFFER_GETS EXECUTIONS VERSION_COUNT PLAN_HASH_VALUE
------------- -------------------------------------------------- ----------- ---------- ------------- ---------------
1c9w6s580jtpd select /*+ vivek1 */ ename from emp where empno=:b        5403       1801            30      3956160932
              1

SQL> select count(*) from v$sql_shared_cursor where sql_id='1c9w6s580jtpd';

  COUNT(*)
----------
      8084

SQL> select sql_id, child_number, AUTH_CHECK_MISMATCH, reason from v$sql_shared_cursor where sql_id='1c9w6s580jtpd' and rownum<=5;

SQL_ID        CHILD_NUMBER A REASON
------------- ------------ - --------------------------------------------------------------------------------
1c9w6s580jtpd           94 Y 9437Authorization Check f
1c9w6s580jtpd           95 Y 9537Authorization Check f
1c9w6s580jtpd           96 Y 9637Authorization Check f
1c9w6s580jtpd           97 Y 9737Authorization Check f
1c9w6s580jtpd           98 Y 9837Authorization Check f

For 30000 executions, we got over 8000 child cursors. This block was executed from a single session, with no one else working on the system. Imagine the impact of such an implementation in a highly concurrent environment.

Issue with Autotrace ! Criticality of Using a Right tool…..

As mentioned in my previous blog, my visit to Oracle Open World was very exciting and enriching. In every session I attended I learned something new, and I therefore thought of sharing this with my readers. I had actually wanted to demonstrate this during my Sangam 2013 presentation; however, due to some official commitments, I had to pull out of the event.

In many of my presentations, especially for developers, I have covered some concepts (along with demonstrations) on bind peeking and the issues with “explain plan” and Autotrace, as these are not bind-aware. This means that if a query contains a bind variable, and the execution plan of that query depends on the value provided for the bind, then the plan displayed (or generated) by “explain plan” or “autotrace” is not guaranteed to be the same as the run-time optimizer's plan. Even today, I see many developers (and DBAs as well) use either “explain plan for” or “autotrace” to check the execution plans of the queries they are working on.

I attended a very good session jointly presented by Maria Colgan and Jonathan Lewis. Maria mentioned an issue with “set autot trace explain”: it has the potential to create a (sub-optimal) plan that can then be shared by other users, causing a performance bottleneck.

Randolf Geist has already published a very good note on this issue, explained with good examples, which should help my readers understand it. However, since I was already working on the demonstration to be presented during Sangam 2013, I thought of sharing it here.

The demonstration is as under :


## Table Creation

drop table t1;

create table t1 as
select a.* from all_objects a, all_objects b
where rownum<=1e6;
create index t1_idx on t1(temporary);

select temporary, count(*) from t1 group by temporary;

exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for all columns size 1, for columns temporary size 100');

We created a table T1 with 1 million rows and an index on the TEMPORARY column. This index has been created to demonstrate the difference in the plan when we query on the TEMPORARY column. Further, in order for the optimizer to generate different plans, we gather a histogram on this column. This additional statistic ensures that the optimizer computes the cardinality, and hence the plan, based on the input value.

Next, we execute two queries: one with the literal value 'Y' and one with a bind variable, with 'Y' passed as the bind value. In both cases, the optimizer computes a nearly accurate cardinality and an index scan access path. For the execution with the bind, the optimizer peeked at the bind value to arrive at the cardinality and the index access path.

select /*+ vivek_y */ OWNER, OBJECT_NAME from t1 where temporary='Y';

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  7zfa7mstt63gv, child number 0
-------------------------------------
select /*+ vivek_y */ OWNER, OBJECT_NAME from t1 where temporary='Y'

Plan hash value: 546753835

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |    47 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |  2680 | 88440 |    47   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IDX |  2680 |       |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TEMPORARY"='Y')

variable b1 varchar2(32);
exec :b1:='Y';

select /*+ vivek_bind_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1;

SQL_ID  9c5pp1gt64s7q, child number 0
-------------------------------------
select /*+ vivek_bind_y */ OWNER, OBJECT_NAME from t1 where
temporary=:b1

Plan hash value: 546753835

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |    47 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |  2680 | 88440 |    47   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IDX |  2680 |       |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :B1 (VARCHAR2(30), CSID=178): 'Y'

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TEMPORARY"=:B1)


Let us check the behaviour of the “explain plan for” and “autotrace” utilities. Since these are not bind-aware, the plan generated and displayed is a full table scan. The computed cardinality is #Rows/NDV, which is 1000000/2 = 500000. Also note the I/Os (consistent gets) reported by autotrace.
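As a quick cross-check, the default estimate is simple arithmetic; this sketch just restates the 1/NDV uniform-distribution fallback described above, with the numbers from our T1 table:

```python
# Without bind peeking, the optimizer assumes a uniform distribution for an
# equality predicate: selectivity = 1/NDV, so estimated rows = num_rows / NDV.
num_rows = 1_000_000   # rows in T1
ndv = 2                # TEMPORARY holds only 'Y' and 'N'
default_estimate = num_rows // ndv
print(default_estimate)  # 500000
```

This is exactly the 500K row estimate that appears in the full-table-scan plans that follow, instead of the histogram-driven 2680 seen with peeking.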

explain plan for
select /*+ vivek_bind_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1;

SQL> @utlxpls

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500K|    15M|  3968   (2)| 00:00:48 |
|*  1 |  TABLE ACCESS FULL| T1   |   500K|    15M|  3968   (2)| 00:00:48 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"=:B1)

set autot trace
select /*+ vivek_bind_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1;

2327 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   500K|    15M|  3968   (2)| 00:00:48 |
|*  1 |  TABLE ACCESS FULL| T1   |   500K|    15M|  3968   (2)| 00:00:48 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"=:B1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        841  consistent gets <--- Note the I/O
          0  physical reads
          0  redo size


Lastly, we execute the query with “set autot trace explain”, and then the same query with “set autot off”. In this case, although the value passed to the bind is 'Y', the optimizer re-uses the full-table-scan plan generated under “set autot trace explain”, and this is where the problem starts. With “set autot trace explain”, the generated plan is stored in the shared pool and is shared by the same query text when run from the application, and this can be a problem: if a developer runs exactly the same text as an application query using “set autot trace explain”, the plan is generated without bind peeking and will be sub-optimal.

set autot trace explain
select /*+ with_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1;


SQL> @ap

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  cbf51h7v1276h, child number 0
-------------------------------------
select /*+ with_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1

Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |  3968 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |   500K|    15M|  3968   (2)| 00:00:48 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"=:B1)


18 rows selected.

set autot off
select /*+ with_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1;

select plan_table_output from table(dbms_xplan.display_cursor(format=>'typical +peeked_binds'));

SQL_ID  cbf51h7v1276h, child number 0
-------------------------------------
select /*+ with_y */ OWNER, OBJECT_NAME from t1 where temporary=:b1

Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |  3967 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |   500K|    15M|  3967   (2)| 00:00:48 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TEMPORARY"=:B1)

new   1: select sql_id, sql_text, executions, buffer_gets, elapsed_time, rows_processed from v$sqlarea where sql_id='cbf51h7v1276h'

SQL_ID        SQL_TEXT                                           EXECUTIONS BUFFER_GETS ELAPSED_TIME ROWS_PROCESSED
------------- -------------------------------------------------- ---------- ----------- ------------ --------------
cbf51h7v1276h select /*+ with_y */ OWNER, OBJECT_NAME from t1 wh          1       14393      3563761           2327
              ere temporary=:b1

The last query is exactly the same as the one executed with “set autot trace explain”, and it shared the full-table-scan plan generated there, which is not the plan that should have been used. Interestingly, check the I/Os of this last full-table-scan execution: the query did 14393 logical reads, whereas in the earlier execution with autotrace the I/O shown was 841 (and that plan also says full table scan). Why is there a difference between the I/Os of the two full table scans? For an explanation, read the blog by Randolf Geist.

Oracle Open World 2013 : An Enriching Experience….

First of all, let me apologise for the long delay in posting. This was due to some critical escalations and work pressure. Work comes first, and therefore it was important for me to finish all the pending customer tasks in hand.

This year, I had the opportunity to attend Oracle Open World 2013. This being my first visit to the event in the US, I was excited, as I was sure it would be an opportunity to meet some of the great experts from different parts of the world and also to catch up with old friends. The trip met all my expectations. During the event, besides attending many of the technical sessions (more on these later), I met some old friends, like K. Gopalakrishnan (co-author of Oracle Wait Interface).

On the technical front, I attended many sessions, several of them on the optimizer and Oracle 12c. The sessions that kept me glued, for both technical content and presentation style, were “Top 12 Features of Oracle Database 12c” by Thomas Kyte and the “Oracle Optimizer Bootcamp” presented jointly by Maria Colgan (Oracle Optimizer group) and Jonathan Lewis. A session by Connor McDonald on “Optimizer Statistics: A Fresh Approach” was also well presented, and the attendees enjoyed it a lot.

While I do not want to write about everything in the sessions, the takeaways I want my readers to know are:

1. Multitenant feature of Oracle 12c – one of the best features introduced, and one that will facilitate database consolidation for many organizations.
2. Adaptive optimization – ensures that the right (and optimal) execution plan is chosen (or switched to) at run time, based on the number of rows actually fetched.
3. Identity data type – an Oracle sequence is attached to such a column to generate sequential numbers automatically.

All these years, during my presentations, I have discussed using the right tool for optimization. Developers still use “explain plan for” or “autotrace” to check the execution plans of their queries. I have demonstrated some of the issues around “explain plan”, and this was revisited by Maria Colgan in the “Oracle Optimizer Bootcamp”. In every session you attend, you are bound to learn something new, and sure enough, I learned something interesting in this one. I am sure not many of my friends and readers know about it, and therefore I will include it in my Sangam 2013 presentation, with thanks to Maria.

I will be presenting a session during this year's annual user group event, “Sangam 2013”. The topic of my presentation is “Real Life Optimization Scenarios and Solutions”. Looking forward to this great event.

“Library Cache : Mutex X” ! An Interesting Optimization

One of my customers is migrating from a non-Oracle database to Oracle, and this being a critical application, the migration is planned in a phased manner. A migration happens every fortnight (a city, or a set of cities, at a time) and is enough to make the load jump as a vertical step if plotted on a graph; in simple terms, the increase in load is not linear. With such a dramatic increase, some issue or another can be expected, and this is what happened after one such migration. The setup is a 2-node RAC on database version 11.2.0.2.8, with connection pooling used at the application layer.

During one of the migrations, the customer increased the CPUs from 80 to 148 on each node, based on a recommendation from the application team. Immediately after the migration, the top wait event was “library cache: mutex x”. While this was not yet impacting performance, it was a cause for concern, as only 40% of the load had been migrated and the next migration would add roughly another 20%.

We had two dramatic changes: (1) the CPUs increased from 80 to 148, and (2) the load increased many-fold because of the migration of new branches. It was critical to investigate the issue, and therefore we started monitoring the system during peak load. An interesting observation was a function that stayed at the top most of the time: BEGIN :1 := PA_MC_ENTITY_CONTEXT_FN(:2); END;. The number of executions of this function was around 600+ per second. To understand this function better, we scheduled a discussion with the application team, and the outcome of that discussion was very interesting.

During the discussion, it came out that the application uses Virtual Private Database (VPD), and this function sets the application context, which each application query then uses to filter out rows that do not match the context. Usually, VPD is used for row-level security, implemented by way of a function that generates a WHERE predicate depending on the privileges or profile of the application user; the predicates can differ for different sets of users, and all this is generated at run time. Interestingly, in this case the application context is not generated dynamically based on privileges, but is set to a fixed value, ‘4000’, for every database session. Technically, there was nothing wrong with the implementation itself, so it was important to drill down further to find the link between this function and the mutex waits. The VPD implementation and the function code all looked fine.

The interesting fact we then learned was the way this function was called. Further investigation and discussion with the application team revealed that the function is called to set the application context on every database call. This means that if a session executes 5 queries as part of a transaction, the context-setting function is executed 5 times. As per the team, the reason is that since the application layer uses a connection pool, there is no way to know whether the connection picked up by the application user is an existing connection from the pool or a newly spawned one (once all the initial connections are exhausted). Clearly, executing the function on every database call, and setting the context on every call, was the problem, since each execution of this function requests the mutex in exclusive mode to write to a memory location.

As an immediate optimization, in order to reduce the number of executions and exclusive gets, it was suggested to check for the existence of the application context before setting it. This was achieved by way of a query that first checks whether the context is already set to the desired value. If it is, this is an existing connection and there is no need to set the application context. If not, this is a new connection, and the application context is set to ‘4000’. With this change, the number of executions of the function reduced drastically, thereby reducing the waits on the library cache mutex.
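The check-before-set pattern can be sketched as follows. This is a toy illustration (the class, the counter, and the helper name are hypothetical stand-ins, not the customer's code): a cheap read of the current context replaces the expensive exclusive write on all but the first call.

```python
class PooledSession:
    """Toy stand-in for a pooled database session with an application context."""
    def __init__(self):
        self.context = None
        self.context_sets = 0  # counts the expensive "set context" operations

    def set_context(self, value):
        # In the real system this is the call that takes the mutex exclusively.
        self.context = value
        self.context_sets += 1

def ensure_context(session, value="4000"):
    # Cheap read first; only pay for the exclusive write when needed.
    if session.context != value:
        session.set_context(value)

s = PooledSession()
for _ in range(5):        # five database calls on one pooled connection
    ensure_context(s)
print(s.context_sets)     # the context is written once, not five times
```

The same idea applied server-side is what cut the function's executions, and with them the exclusive mutex gets.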

Initially, everyone suspected an Oracle Database bug. But in order to eliminate or reduce contention, it is imperative to optimize the amount of work done: unwanted work is always an overhead, as was the case in this optimization.

Query Optimization ! An example of a better rewrite…

In January 2012, I wrote about a query performance issue caused by NVL(:b1,column_name) predicates in the WHERE clause. Due to these predicates, the Optimizer computed incorrect cardinality and came out with a sub-optimal plan. At Sangam 2012, I demonstrated an issue caused by these predicates. This blog is again on an issue that a customer reported, which was solved by way of a workaround. Since this query is a seeded query, the permanent fix (suggested in this blog) is expected by way of an application patch.

I will demonstrate this on my SCOTT schema with a query on the EMP table, which will also be easier to explain. Based on the EMP table, the requirement is to write a report that takes 2 inputs: EMPNO and DEPTNO. The users can run this report with any of the following conditions:

  1. EMPNO and DEPTNO are NOT NULL
  2. EMPNO IS NULL and DEPTNO IS NOT NULL
  3. EMPNO IS NOT NULL AND DEPTNO IS NULL
  4. EMPNO and DEPTNO are both NULL

The way the original query is written, I assume the developers had the above 4 requirements in mind. However, the customer confirmed that only the first two conditions are applicable. Out of the total executions, the 1st condition contributes around 70% and the 2nd around 30%.

Back to our example on the EMP table. With the 4 conditions in mind, many developers would write the query as mentioned below.

select empno, ename, deptno, sal
from	emp
where	deptno = nvl(:deptno,deptno)
and	empno = nvl(:empno,empno)
order by empno;

## Execution of this query for each of the combination
## Both are NOT NULL (For a Deptno & for an Employee)
exec :deptno:=30; :empno:=7499; 

     EMPNO ENAME          DEPTNO        SAL
---------- ---------- ---------- ----------
      7499 ALLEN              30       1600

## EMPNO IS NULL (For a Deptno and all Employees)
exec :deptno:=30; :empno:=null;

     EMPNO ENAME          DEPTNO        SAL
---------- ---------- ---------- ----------
      7499 ALLEN              30       1600
      7521 WARD               30       1250
      7654 MARTIN             30       1250
      7698 BLAKE              30       2850
      7844 TURNER             30       1500
      7900 JAMES              30        950

## DEPTNO IS NULL (For an Employee)
exec :deptno:=null; :empno:=7499;

     EMPNO ENAME          DEPTNO        SAL
---------- ---------- ---------- ----------
      7499 ALLEN              30       1600

## Both are NULL (for all Deptno and all Employees)
exec :deptno:=null; :empno:=null;

     EMPNO ENAME          DEPTNO        SAL
---------- ---------- ---------- ----------
      7369 SMITH              20        800
      7499 ALLEN              30       1600
      7521 WARD               30       1250
      7566 JONES              20       2975
      7654 MARTIN             30       1250
      7698 BLAKE              30       2850
      7782 CLARK              10       2450
      7788 SCOTT              20       3000
      7839 KING               10       5000
      7844 TURNER             30       1500
      7876 ADAMS              20       1100
      7900 JAMES              30        950
      7902 FORD               20       3000
      7934 MILLER             10       1300

A single query meets the requirement for all 4 combinations. The developer, in this case, has done his job. However, they have not considered that 2 of the 4 combinations end up doing a Full Table Scan of the EMP table. In the customer's case, since only the first two combinations are applicable and combination 2 accounts for 30% of the executions, 30% of the time the Optimizer opts for a Full Table Scan. Before we get into the original case, let us check the runtime execution plan for the query on the EMP table.

select empno, ename, deptno, sal from emp where deptno =
nvl(:deptno,deptno) and empno = nvl(:empno,empno) order by empno

Plan hash value: 3923409422

-----------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |       |       |     6 (100)|          |
|   1 |  SORT ORDER BY                 |        |     7 |   322 |     6  (17)| 00:00:01 |
|   2 |   CONCATENATION                |        |       |       |            |          |
|*  3 |    FILTER                      |        |       |       |            |          |
|*  4 |     TABLE ACCESS FULL          | EMP    |     6 |   276 |     3   (0)| 00:00:01 |
|*  5 |    FILTER                      |        |       |       |            |          |
|*  6 |     TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    46 |     2   (0)| 00:00:01 |
|*  7 |      INDEX UNIQUE SCAN         | PK_EMP |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(:EMPNO IS NULL)
   4 - filter(("EMPNO" IS NOT NULL AND "DEPTNO"=NVL(:DEPTNO,"DEPTNO")))
   5 - filter(:EMPNO IS NOT NULL)
   6 - filter("DEPTNO"=NVL(:DEPTNO,"DEPTNO"))
   7 - access("EMPNO"=:EMPNO)

From the predicate information, the Optimizer's choice becomes very clear: if :EMPNO is NULL, then a Full Table Scan of EMP; if :EMPNO is NOT NULL, then Table Access by Index Rowid with a Unique Scan of PK_EMP. This means that if a user inputs a value for EMPNO (irrespective of what is passed for DEPTNO), the Optimizer will always do an Index Scan.

Now the Original Case
The original query and its runtime plan are pasted below, before we discuss them.

SELECT TRANSACTION_ID, TRANSACTION_TYPE, TRANSACTION_DATE, TAX_TRANSACTION_ID, PARENT_TRANSACTION_TYPE, 
	CURRENCY_CONVERSION_RATE, QUANTITY, 
	DECODE(TRANSACTION_TYPE, 'RECEIVE', 1, 'RETURN TO VENDOR', -1, 'CORRECT', 
	DECODE(PARENT_TRANSACTION_TYPE, 'RECEIVE', SIGN(QUANTITY), 'RETURN TO VENDOR', 
	SIGN(QUANTITY)*-1)) QUANTITY_MULTIPLIER 
FROM 	JAI_RCV_TRANSACTIONS 
WHERE 	SHIPMENT_LINE_ID = NVL(:B2 , SHIPMENT_LINE_ID) 
AND 	TRANSACTION_ID = NVL(:B1 , TRANSACTION_ID) 
AND 	( TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') OR ( TRANSACTION_TYPE = 'CORRECT' AND PARENT_TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') ) ) 
ORDER BY TRANSACTION_ID;

Plan hash value: 4081860556

----------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                         |       |       |   621K(100)|          |
|   1 |  SORT ORDER BY                 |                         |     2 |    94 |   621K (11)| 00:02:24 |
|   2 |   CONCATENATION                |                         |       |       |            |          |
|*  3 |    FILTER                      |                         |       |       |            |          |
|*  4 |     TABLE ACCESS FULL          | JAI_RCV_TRANSACTIONS    |     1 |    47 |   621K (11)| 00:02:24 |
|*  5 |    FILTER                      |                         |       |       |            |          |
|*  6 |     TABLE ACCESS BY INDEX ROWID| JAI_RCV_TRANSACTIONS    |     1 |    47 |     3   (0)| 00:00:01 |
|*  7 |      INDEX UNIQUE SCAN         | JAI_RCV_TRANSACTIONS_PK |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(:B1 IS NULL)
   4 - filter(((INTERNAL_FUNCTION("TRANSACTION_TYPE") OR ("TRANSACTION_TYPE"='CORRECT' AND
              INTERNAL_FUNCTION("PARENT_TRANSACTION_TYPE"))) AND "SHIPMENT_LINE_ID"=NVL(:B2,"SHIPMENT_LINE_ID")
              AND "TRANSACTION_ID" IS NOT NULL))
   5 - filter(:B1 IS NOT NULL)
   6 - filter(((INTERNAL_FUNCTION("TRANSACTION_TYPE") OR ("TRANSACTION_TYPE"='CORRECT' AND
              INTERNAL_FUNCTION("PARENT_TRANSACTION_TYPE"))) AND
              "SHIPMENT_LINE_ID"=NVL(:B2,"SHIPMENT_LINE_ID")))
   7 - access("TRANSACTION_ID"=:B1)

As mentioned earlier, in the customer's case, only the first two combinations are applicable. These are (using the columns of the original query):

  1. TRANSACTION_ID and SHIPMENT_LINE_ID are NOT NULL
  2. TRANSACTION_ID IS NULL and SHIPMENT_LINE_ID IS NOT NULL

There is a Unique Index on TRANSACTION_ID and another Unique Index on (SHIPMENT_LINE_ID, TRANSACTION_ID). This query is part of a transaction, and for the cases where the value of :B1 (TRANSACTION_ID) is null, the transaction takes around 20 minutes to complete. The table has around 9 million rows and is fully scanned in such cases.

Since the customer mentioned that in their case only the TRANSACTION_ID bind can be passed as NULL, changing “:B1 is NULL” to “:B2 is NULL” (see PREDICATE INFORMATION) seemed a better choice. Remember, :B1 is for TRANSACTION_ID and :B2 is for SHIPMENT_LINE_ID. With this change: if :B2 is NULL, then a Full Table Scan; if :B2 is NOT NULL, then an Index Scan on SHIPMENT_LINE_ID. TRANSACTION_ID has more distinct values than SHIPMENT_LINE_ID, and therefore the optimizer considers the index on TRANSACTION_ID cheaper and opts for an Index Scan on that column. Below are the column statistics for the two columns.

COLUMN_NAME                    NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
------------------------------ ------------ ---------- ---------- ---------------
SHIPMENT_LINE_ID                    4351257          0 2.2982E-07 NONE
TRANSACTION_ID                      9092033          0 1.0999E-07 NONE

As a temporary workaround, the column statistics for the two columns were swapped. Before doing that, it was critically important to check the number of queries with the TRANSACTION_ID column in their WHERE predicates, to ensure that this temporary fix would not impact other queries. Once it was confirmed that the change was safe, the column stats were swapped. The swap was intended to produce the following change in the plan:

Original – if :B1 is NULL then FTS. If :B1 is NOT NULL then Unique Index Scan on TRANSACTION_ID
Changed – if :B2 is NULL then FTS. If :B2 is NOT NULL then Range Scan on SHIPMENT_LINE_ID

Since the value of :B2 is never passed as NULL, this change also makes sure this 9-million-row table is never fully scanned. The swap of statistics worked and the plan changed for the better. Some transactions that took around 20 minutes started completing in a second.

exec dbms_stats.set_column_stats(ownname=>'JA',tabname=>'JAI_RCV_TRANSACTIONS',colname=>'TRANSACTION_ID',DISTCNT=>4351257);
exec dbms_stats.set_column_stats(ownname=>'JA',tabname=>'JAI_RCV_TRANSACTIONS',colname=>'SHIPMENT_LINE_ID',DISTCNT=>9092033);

Plan hash value: 3580471929

------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                           |       |       |   621K(100)|          |
|   1 |  SORT ORDER BY                 |                           |     2 |    94 |   621K (11)| 00:02:24 |
|   2 |   CONCATENATION                |                           |       |       |            |          |
|*  3 |    FILTER                      |                           |       |       |            |          |
|*  4 |     TABLE ACCESS FULL          | JAI_RCV_TRANSACTIONS      |     1 |    47 |   621K (11)| 00:02:24 |
|*  5 |    FILTER                      |                           |       |       |            |          |
|*  6 |     TABLE ACCESS BY INDEX ROWID| JAI_RCV_TRANSACTIONS      |     1 |    47 |     4   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | JA_IN_RCV_TRANSACTIONS_N5 |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(:B2 IS NULL)
   4 - filter(((INTERNAL_FUNCTION("TRANSACTION_TYPE") OR ("TRANSACTION_TYPE"='CORRECT' AND
              INTERNAL_FUNCTION("PARENT_TRANSACTION_TYPE"))) AND "TRANSACTION_ID"=NVL(:B1,"TRANSACTION_ID")
              AND "SHIPMENT_LINE_ID" IS NOT NULL))
   5 - filter(:B2 IS NOT NULL)
   6 - filter(((INTERNAL_FUNCTION("TRANSACTION_TYPE") OR ("TRANSACTION_TYPE"='CORRECT' AND
              INTERNAL_FUNCTION("PARENT_TRANSACTION_TYPE"))) AND "TRANSACTION_ID"=NVL(:B1,"TRANSACTION_ID")))
   7 - access("SHIPMENT_LINE_ID"=:B2)

While this was a temporary workaround to fix the problem, a better strategy would have been to write a query that takes advantage of the available indexes and works optimally for any of the 4 combinations mentioned above. A better query is as under:

select * from (
SELECT /*+ WITH BOTH BINDS AS NOT NULL */ TRANSACTION_ID, TRANSACTION_TYPE, TRANSACTION_DATE, TAX_TRANSACTION_ID, PARENT_TRANSACTION_TYPE, 
	CURRENCY_CONVERSION_RATE, QUANTITY, 
	DECODE(TRANSACTION_TYPE, 'RECEIVE', 1, 'RETURN TO VENDOR', -1, 'CORRECT', 
	DECODE(PARENT_TRANSACTION_TYPE, 'RECEIVE', SIGN(QUANTITY), 'RETURN TO VENDOR', 
	SIGN(QUANTITY)*-1)) QUANTITY_MULTIPLIER 
FROM 	JAI_RCV_TRANSACTIONS 
WHERE 	SHIPMENT_LINE_ID = :B2
AND 	TRANSACTION_ID = :B1
AND 	( TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') OR ( TRANSACTION_TYPE = 'CORRECT' AND PARENT_TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') ) ) 
union all
SELECT /*+ WITH B1 AS NULL */ TRANSACTION_ID, TRANSACTION_TYPE, TRANSACTION_DATE, TAX_TRANSACTION_ID, PARENT_TRANSACTION_TYPE, 
	CURRENCY_CONVERSION_RATE, QUANTITY, 
	DECODE(TRANSACTION_TYPE, 'RECEIVE', 1, 'RETURN TO VENDOR', -1, 'CORRECT', 
	DECODE(PARENT_TRANSACTION_TYPE, 'RECEIVE', SIGN(QUANTITY), 'RETURN TO VENDOR', 
	SIGN(QUANTITY)*-1)) QUANTITY_MULTIPLIER 
FROM 	JAI_RCV_TRANSACTIONS 
WHERE 	SHIPMENT_LINE_ID = :B2 AND :B1 is NULL
AND 	( TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') OR ( TRANSACTION_TYPE = 'CORRECT' AND PARENT_TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') ) ) 
union all
SELECT /*+ WITH B2 AS NULL */ TRANSACTION_ID, TRANSACTION_TYPE, TRANSACTION_DATE, TAX_TRANSACTION_ID, PARENT_TRANSACTION_TYPE, 
	CURRENCY_CONVERSION_RATE, QUANTITY, 
	DECODE(TRANSACTION_TYPE, 'RECEIVE', 1, 'RETURN TO VENDOR', -1, 'CORRECT', 
	DECODE(PARENT_TRANSACTION_TYPE, 'RECEIVE', SIGN(QUANTITY), 'RETURN TO VENDOR', 
	SIGN(QUANTITY)*-1)) QUANTITY_MULTIPLIER 
FROM 	JAI_RCV_TRANSACTIONS 
WHERE 	TRANSACTION_ID = :B1 AND :B2 IS NULL
AND 	( TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') OR ( TRANSACTION_TYPE = 'CORRECT' AND PARENT_TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') ) )
UNION ALL
SELECT /*+ WITH BOTH BINDS AS NULL */ TRANSACTION_ID, TRANSACTION_TYPE, TRANSACTION_DATE, TAX_TRANSACTION_ID, PARENT_TRANSACTION_TYPE, 
	CURRENCY_CONVERSION_RATE, QUANTITY, 
	DECODE(TRANSACTION_TYPE, 'RECEIVE', 1, 'RETURN TO VENDOR', -1, 'CORRECT', 
	DECODE(PARENT_TRANSACTION_TYPE, 'RECEIVE', SIGN(QUANTITY), 'RETURN TO VENDOR', 
	SIGN(QUANTITY)*-1)) QUANTITY_MULTIPLIER 
FROM 	JAI_RCV_TRANSACTIONS 
WHERE 	:B1 IS NULL AND :B2 IS NULL
AND 	( TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') OR ( TRANSACTION_TYPE = 'CORRECT' AND PARENT_TRANSACTION_TYPE IN ('RECEIVE', 'RETURN TO VENDOR') ) ) )
ORDER BY TRANSACTION_ID;

The benefits of this modified query are:

  1. If both BINDS are NOT NULL, use a Composite Unique Index on SHIPMENT_LINE_ID, TRANSACTION_ID
  2. If BIND :B1 on TRANSACTION_ID is NULL then a Range Scan on SHIPMENT_LINE_ID
  3. If Bind :B2 on SHIPMENT_LINE_ID is NULL then a Unique Scan on TRANSACTION_ID
  4. If Both Binds are NULL, then a Full Table Scan (this is unavoidable).
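Applied to the EMP example from earlier, the same rewrite pattern looks like this. Each UNION ALL branch handles exactly one bind combination, so the optimizer can choose the best access path per branch; the equality predicates on the binds guarantee that only one branch returns rows for any given input.

```sql
-- UNION ALL rewrite of the NVL query on EMP: one branch per bind combination.
select * from (
  select empno, ename, deptno, sal from emp
  where  deptno = :deptno and empno = :empno     -- both binds NOT NULL: PK_EMP unique scan
  union all
  select empno, ename, deptno, sal from emp
  where  deptno = :deptno and :empno is null     -- EMPNO bind NULL: index on DEPTNO usable
  union all
  select empno, ename, deptno, sal from emp
  where  empno = :empno and :deptno is null      -- DEPTNO bind NULL: PK_EMP unique scan
  union all
  select empno, ename, deptno, sal from emp
  where  :deptno is null and :empno is null      -- both binds NULL: FTS (unavoidable)
)
order by empno;
```

The branches are mutually exclusive: for example, when both binds are NULL, the equality predicates in the first three branches return no rows and only the last branch contributes.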

dbms_parallel_execute ! A still-underused feature of Oracle Database 11g

I still find many developers unaware of dbms_parallel_execute, a better and more efficient way of distributing data across multiple chunks. Recently, I came across one such implementation, where the developers introduced their own way of creating multiple chunks using the ROWNUM pseudo-column and then distributing these records into the required streams. In this case, the developers intended to spawn 64 streams. Further, they created an index on some of the columns in order to update the records as they were processed. The drawbacks of this implementation were:

  1. Block-level contention on the table, as each stream can request and process blocks belonging to other streams
  2. Index block contention, as multiple streams update the same index block. A better way is to select the ROWID and UPDATE based on ROWID.
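The ROWID-based update mentioned in point 2 can be sketched as follows (BIG_TABLE and its STATUS column are illustrative names, not from the original system):

```sql
-- Select the ROWID along with the data, then update by ROWID instead of
-- re-probing an index: each stream touches only the rows it selected.
DECLARE
  CURSOR c IS
    SELECT rowid AS rid, status
    FROM   big_table
    WHERE  status = 'PENDING';
BEGIN
  FOR r IN c LOOP
    -- business processing of row r goes here
    UPDATE big_table SET status = 'DONE' WHERE rowid = r.rid;
  END LOOP;
  COMMIT;
END;
/
```

Because the update is addressed by ROWID, no index range scan is needed to relocate the row, which removes the hot index blocks from the picture.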

This implementation required around 50 million rows to be processed in 64 parallel streams. Due to the issues mentioned above, the process took an abnormally long time and was kept on hold for production movement. Our advice was sought on the optimization, and we could see that the developers were still unaware of a fantastic, ready-made approach: dbms_parallel_execute. I pointed them to one of my previous blog posts on the same topic for examples, and we also explained the benefits of this approach over their current implementation. They readily accepted it and decided to give it a try before moving to production. I was expecting some queries from the developers, as this was an entirely new concept for them, and yes, the first concern raised was over the CHUNK_SIZE=>64 value in the dbms_parallel_execute.create_chunks_by_rowid procedure. The developers expected this procedure to create 64 chunks of the entire table, which could then be processed by 64 parallel jobs. In this case, since the table was huge, it created around 5000+ chunks, which was a bit surprising for them. Further, even with more chunks, the challenge was building the additional logic to run only 64 streams at any given point in time, while also taking care of finished jobs and spawning new ones automatically.

First, let us address the CHUNK_SIZE argument of the create_chunks_by_rowid procedure. This value works together with the BY_ROW argument. If BY_ROW is TRUE, the chunks are created by rows: with CHUNK_SIZE=>64 and BY_ROW=>TRUE, each chunk will have approximately 64 or fewer rows. With CHUNK_SIZE=>64 and BY_ROW=>FALSE, the division is based on blocks, with each chunk containing approximately 64 or fewer blocks. Further, a chunk does not span multiple extents; therefore, if a table has small extents, you may see a high count of chunks created (as with the 5000+ chunks in the example above).
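To see how many chunks a given CHUNK_SIZE actually produces, the chunks can be counted straight after creation (the task, owner and table names here are illustrative):

```sql
-- Create a task, chunk a table by rows (~100 rows per chunk), then count chunks.
exec dbms_parallel_execute.create_task('DEMO_TASK');
exec dbms_parallel_execute.create_chunks_by_rowid('DEMO_TASK','SCOTT','EMP',true,100);

select count(*) as chunk_count
from   user_parallel_execute_chunks
where  task_name = 'DEMO_TASK';
```

On a table with small extents, running the same count with BY_ROW=>FALSE will typically show far more chunks than expected, for the extent-boundary reason explained above.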

The start and end ROWID of each chunk are stored in the USER_PARALLEL_EXECUTE_CHUNKS view. Since this is a dictionary view over which we do not have much control, I usually create my own table and insert the data from USER_PARALLEL_EXECUTE_CHUNKS into it. Once the data is in our own table, it is easier to update it with the additional information required for analysis. Let us walk through a simple example.

## Create our own table to track the status of each job.

create table process_status (
process_name            varchar2(30),
job_id                  number,
s_rowid                 rowid,
e_rowid                 rowid,
start_time              date,
end_time                date,
processed_rows          number,
completed_yn            varchar2(1),
status                  varchar2(1000));

## Create a Parallel Job Task
exec  dbms_parallel_execute.create_task('VIVEK_JOB');

## Create Chunks, with the approximate Blocks / Rows (BY_ROW=>FALSE distributes on Blocks, TRUE on Rows)
exec  dbms_parallel_execute.create_chunks_by_rowid('VIVEK_JOB','FLEX','CH_OD_AGREEMENT',false, 100);

## Insert the Chunks in our process_status table
insert into process_status(process_name, job_id, s_rowid, e_rowid)
select 'VIVEK_JOB', chunk_id, start_rowid, end_rowid from user_parallel_execute_chunks
where task_name='VIVEK_JOB' order by 1;
commit;

## Our Procedure that will run in parallel. We need to pass start_rowid and end_rowid as the Input parameters

create or replace procedure FLEX.vivek_proc (st_rowid in rowid, ed_rowid in rowid) as
  l_cnt number;
begin
  -- mark this chunk as running
  update FLEX.process_status
     set start_time = sysdate, status = 'RUNNING'
   where s_rowid = st_rowid;
  commit;
  -- main business logic goes here; counting the rows is just a placeholder
  select count(*) into l_cnt
    from FLEX.CH_OD_AGREEMENT
   where rowid between st_rowid and ed_rowid;
  -- mark this chunk as completed, recording the row count
  update FLEX.process_status
     set status = 'COMPLETED', end_time = sysdate, processed_rows = l_cnt
   where s_rowid = st_rowid;
  commit;
end;
/

VIVEK_PROC is the procedure to be run in parallel streams. The input parameters are the start and end ROWID. I have modified the procedure to update the PROCESS_STATUS table with the start_time and status and, at the completion of the main business logic, the end_time, processed_rows and status. This data can further be maintained as history (by inserting into a history table before the next run). It is for this flexibility that I chose to create the PROCESS_STATUS table.

The total number of chunks created by dbms_parallel_execute for my test table was around 3500, and I wanted to spawn only 64 streams. Many developers would restrict job_queue_processes to 64 and schedule all 3500 jobs together; restricting job_queue_processes to 64 does ensure only 64 jobs run at a time. This is achievable, but it is a crude way of doing so, and it only works if no other jobs are scheduled during this window, else they too will be queued. The best option is again dbms_parallel_execute: the RUN_TASK procedure has a PARALLEL_LEVEL argument that limits the number of parallel jobs running at any given point in time. No additional logic is required to check for the completion of a job and schedule a new one; Oracle does it for us.

## Run the Task, with number of parallel job queue processes. This is controlled using PARALLEL_LEVEL

begin
      dbms_parallel_execute.run_task
      ( task_name      => 'VIVEK_JOB',
        sql_stmt       => 'begin FLEX.vivek_proc( :start_id, :end_id ); end;',
        language_flag  => DBMS_SQL.NATIVE,
        parallel_level => 64 );
      commit;
end;
/
## Once the Jobs are done, drop the Job Task
exec dbms_parallel_execute.drop_task('VIVEK_JOB');

DROP_TASK removes the task's entries from the USER_PARALLEL_EXECUTE_CHUNKS view.
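Once the run completes, the PROCESS_STATUS table populated above makes it easy to review per-chunk timings before archiving the data into a history table:

```sql
-- Summarise each chunk's elapsed time and row count from our tracking table.
select job_id,
       processed_rows,
       round((end_time - start_time) * 24 * 60 * 60) as elapsed_seconds
from   process_status
where  process_name = 'VIVEK_JOB'
and    status = 'COMPLETED'
order  by elapsed_seconds desc;
```

Chunks with disproportionately high elapsed times are a quick pointer to skewed data distribution or contention within the business logic.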

OTN Yathra ! A Six City Tour

OTN Yathra starts on 16th February 2013. This is a collaborative effort between OTN & AIOUG and a 6-city tour, the first destination being Delhi. I am presenting in Delhi, Mumbai and Hyderabad, and am excited to deliver my first ever presentation in Delhi. For details and registration, please visit:

http://aioug.org

Regards
Vivek

OEM 12c ! A must have book.

My friend and former ACE Director Porus Homi Havewala's new book on
Oracle Enterprise Manager 12c Cloud Control is now available. This is
the first published EM 12c Cloud Control book in the world.

OEM 12c simplifies many administrative tasks and also helps in monitoring the critical aspects of a production system. The book should help you understand and learn the capabilities of Enterprise Manager 12c. Electronic copies are available too.

http://www.packtpub.com/oracle-enterprise-manager-12c-cloud-control/book

Oracle Enterprise Manager 12c Cloud Control: Managing Data Center Chaos
