Library Cache Latch Contention due to Multiple Version Count – Day 2 of AIOUG

Day 2 of “All India Oracle User Group” was continuation of where I left on Day 1 – Library Cache and Multiple Version Count. On Day 2, I had a real life example on the issue that I presented. On Day 1, the conceptual knowledge of Library Cache, Parent & Child Cursors and some of the fundamental, but not to be ignored, causes of Multiple Child Cursors was shared.

One of the major cause of Multiple Version for an SQL is CURSOR_SHARING. When cursor_sharing is set to SIMILAR, optimizer checks for either of these :

Is the Operator NEITHER of ‘=’ or ‘!=’

OR

Are there Column Histograms present on the Columns

When either of these are true, then, irrespective of whether the execution plan of the new child cursor is already stored, a new Child cursor is created. The only exception is, if the runtime value of the bind variable matches that of an already stored child cursor. We will see this with an example.

Before we begin, as a prerequisite, we will create our test case that will be used for both the cases mentioned above. The data in this table is populated in a skewed manner and we create an Index on the skewed column.

PREREQUISITE

exec dbms_random.seed(0);

drop table test_hist;

create table test_hist as 
with n as
(select /*+ materialize */ round(dbms_random.normal,0) n1 from all_objects where rownum<=10000)
select decode(sign(n1),-1,n1+9,n1) empid,
       round(dbms_random.value(1000,2000),0) sal,
       case when mod(round(dbms_random.value(101,110),0),2)=0 then 'M' else 'F' end gender
from	n
where rownum<=10000;

create index test_hist_idx on test_hist(empid);

select empid, cnt, round(cnt/final_total*100,2) percentage from (
select empid, cnt, sum(cnt) over() final_total
from (
select empid, count(*) over(partition by empid) cnt, 
row_number() over (partition by empid order by empid) rno 
from test_hist)
where rno=1);

     EMPID        CNT PERCENTAGE
---------- ---------- ----------
         0       3937      39.37
         1       2289      22.89
         2        630        6.3
         3         55        .55
         4          4        .04
         5          4        .04
         6         64        .64
         7        646       6.46
         8       2371      23.71

9 rows selected.

Once the test case is ready, we will walk through the two issues, mentioned above, introduced by Cursor_Sharing set to SIMILAR. For this purpose, we will generate statistics in two different ways: With & Without Histograms.

TEST CASE 1 : With Histogram and Equality Predicates

Gather Stats on this table with method_opt as ‘FOR COLUMNS EMPID SIZE 74’. With this, we generate Frequency Histogram on Empid column. Since the column has only 9 distinct values, creating histogram will generate different execution plans for the values with less data.

Combination of Cursor_Sharing to SIMILAR and unwanted Histograms can cause a major performance bottleneck, with contention on Library Cache related waits. Therefore, it is always recommended to generate Histograms only when required. Oracle 10g changed the default value of method_opt from ‘FOR ALL COLUMNS SIZE 1’ to ‘FOR ALL COLUMNS SIZE AUTO’ and with this change, any database with CURSOR_SHARING set to SIMILAR, would face a severe latch contention issue after upgrading to 10g, as unwanted histograms are generated on columns with large distinct values. As you will see in this test case, these unwanted histograms introduce more child cursors, further causing a hash chain to grow longer. We all know that latches are to be held for brief duration, but, if number of child cursors grow, inspecting a hash chain for a matching cursor can take significant amount of time preventing others to get the latch.

alter system set cursor_sharing=similar scope=memory;
select /*+ VIVEK */ sal from test_hist where empid=0;
select /*+ VIVEK */ sal from test_hist where empid=2;
select /*+ VIVEK */ sal from test_hist where empid=4;
select /*+ VIVEK */ sal from test_hist where empid=7;
select /*+ VIVEK */ sal from test_hist where empid=8;

Set CURSOR_SHARING to SIMILAR and execute 5 queries with different values passed to empid. Once these are run, check against v$sqlarea, v$sql and finally, v$sql_shared_cursor to get the cause of multiple versions. The output would be similar to that of mine, except for the plan_hash_value. In my case, for all the cases above, the optimizer opted for Full Table Scan. In your case, the plans would differ. Even though the execution plans may or may not match, it is irrelevant for this discussion. After execution of the 5 queries, the output would look like:

SQL> select sql_id, sql_text, executions, child_latch, version_count, plan_hash_value, hash_value fr
om v$sqlarea where sql_text like 'select /*+ VIVEK%';

SQL_ID        SQL_TEXT                       EXECUTIONS CHILD_LATCH VERSION_COUNT PLAN_HASH_VALUE HASH_VALUE
------------- ------------------------------ ---------- ----------- ------------- --------------- --
baz378z2m5q42 select /*+ VIVEK */ sal from t          5           2             5      3529380458 3308443778
              est_hist where empid=:"SYS_B_0
              "


Elapsed: 00:00:00.10
SQL>select sql_id, sql_text, executions, child_latch, plan_hash_value, child_number, child_address from v$sql where sql_text like 'select /*+ VIVEK%';

SQL_ID        SQL_TEXT                       EXECUTIONS CHILD_LATCH PLAN_HASH_VALUE CHILD_NUMBER CHILD_AD
------------- ------------------------------ ---------- ----------- --------------- ------------ ---
baz378z2m5q42 select /*+ VIVEK */ sal from t          1           2      3529380458            0 699F048C
              est_hist where empid=:"SYS_B_0
              "

baz378z2m5q42 select /*+ VIVEK */ sal from t          1           2      3529380458            1 68A65B38
              est_hist where empid=:"SYS_B_0
              "

baz378z2m5q42 select /*+ VIVEK */ sal from t          1           2      3529380458            2 68A47B00
              est_hist where empid=:"SYS_B_0
              "

baz378z2m5q42 select /*+ VIVEK */ sal from t          1           2      3529380458            3 69519264
              est_hist where empid=:"SYS_B_0
              "

baz378z2m5q42 select /*+ VIVEK */ sal from t          1           2      3529380458            4 6A3B1AE0
              est_hist where empid=:"SYS_B_0
              "

Elapsed: 00:00:00.25

SQL> select * from v$sql_shared_cursor where sql_id='baz378z2m5q42';

SQL_ID        ADDRESS  CHILD_AD CHILD_NUMBER U S O O S L S E B P I S T A B D L T R I I R L I O S M U 
------------- -------- -------- ------------ - - - - - - - - - - - - - - - - - - - - - - - - - - - -
M B M R O P M F L
- - - - - - - - -
baz378z2m5q42 684B2D44 699F048C            0 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
N N N N N N N N N

baz378z2m5q42 684B2D44 68A65B38            1 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
N N N N N N N N N

baz378z2m5q42 684B2D44 68A47B00            2 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
N N N N N N N N N

baz378z2m5q42 684B2D44 69519264            3 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
N N N N N N N N N

baz378z2m5q42 684B2D44 6A3B1AE0            4 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
N N N N N N N N N

Surprisingly, v$sql_shared_cursor does not provide a justification for the cause of multiple child cursors. If you again run a query with EMPID set to any of the values passed earlier (eg empid=0), the version_count will not be incremented. This means, a child cursor, in this case, will only be shared if a value passed to the FILTER Predicate matches any of the bind value previously executed.

TEST CASE 2 : Without Histogram and Range Predicates

For this test case, regather the statistics on the TEST_HIST table with method_opt changed to ‘FOR ALL COLUMNS SIZE 1’, which means, no histograms. Without Histograms, multiple child cursors will be created for any queries that does not contain EQUALITY (=) or NOT EQUAL (!=) predicates. For demonstration, make sure the histograms are removed and then execute following queries:

select /*+ VIVEK */ empid, sal from test_hist where sal between 1000 and 1100;

select /*+ VIVEK */ empid, sal from test_hist where sal between 1200 and 1500;

select /*+ VIVEK */ empid, sal from test_hist where sal between 1700 and 1800;

select /*+ VIVEK */ empid, sal from test_hist where sal between 1500 and 2000;

The output should look like:

SQL> select sql_id, sql_text, executions, child_latch, version_count, plan_hash_value, hash_value fr
om v$sqlarea where sql_text like 'select /*+ VIVEK%';

SQL_ID        SQL_TEXT                       EXECUTIONS CHILD_LATCH VERSION_COUNT PLAN_HASH_VALUE HASH_VALUE
------------- ------------------------------ ---------- ----------- ------------- --------------- --
2xk8mywyq0z69 select /*+ VIVEK */ empid, sal          4           2             4      1054764382 1029733577
               from test_hist where sal betw
              een :"SYS_B_0" and :"SYS_B_1"


Elapsed: 00:00:00.04
SQL> select sql_id, sql_text, executions, child_latch, plan_hash_value, child_number, child_address 
from v$sql where sql_text like 'select /*+ VIVEK%';

SQL_ID        SQL_TEXT                       EXECUTIONS CHILD_LATCH PLAN_HASH_VALUE CHILD_NUMBER CHILD_AD
------------- ------------------------------ ---------- ----------- --------------- ------------ ---
2xk8mywyq0z69 select /*+ VIVEK */ empid, sal          1           2      1054764382            0 69BCD9A0
               from test_hist where sal betw
              een :"SYS_B_0" and :"SYS_B_1"

2xk8mywyq0z69 select /*+ VIVEK */ empid, sal          1           2      1054764382            1 69FF769C
               from test_hist where sal betw
              een :"SYS_B_0" and :"SYS_B_1"

2xk8mywyq0z69 select /*+ VIVEK */ empid, sal          1           2      1054764382            2 693BED94
               from test_hist where sal betw
              een :"SYS_B_0" and :"SYS_B_1"

2xk8mywyq0z69 select /*+ VIVEK */ empid, sal          1           2      1054764382            3 68F50694
               from test_hist where sal betw
              een :"SYS_B_0" and :"SYS_B_1"

Again, in this case, if you re-execute a query with sal between 1200 and 1500, the version_count will not be incremented as a shared version of a query with same bind values, is already stored in the library cache. In both the cases, v$sql_shared_cursor does not provide the justification for the cause of multiple child cursors and therefore, it is very difficult to diagnose this.

These two test cases should help you understand the side effect of setting Cursor_Sharing parameter to SIMILAR. With this setting, even if you take utmost care about not generating unwanted histograms, any range predicate queries can cause Library Cache contention issue. I am sure that while Histograms are avoidable, Range Based Queries are unaviodable. 🙂

A Child Library Cache Latch protects multiple objects (cursors). This means, two or more entirely different queries can be hashed to a same child latch. Time Consuming match for a query with several child cursors can also cause other sessions to wait on Library Cache Latches for other queries, protected by this same latch, with 1 child cursor. Therefore, Library Cache Latch contention are also called as False Contention and utmost care has to be taken while diagnosing contention on these latches.

Real Life Example

During the AIOUG event and after the completion of these test cases, I presented on an Real Life Example. One of my customer complained for severe performance issues and one of the cause of this slow performance was identified as Contention on Library Cache Latches. These are some of the statistics, before and after eliminating Library Cache Latch Contention.

BEFORE

Optimizing Performance Part 21

ISSUE

The combination of Database Version 10.2.0.4 and Cursor_Sharing=SIMILAR was enough to predict the issue which was due to multiple version count. The table below shows the list of top 20 Queries, with Executions and their Version Counts.

select * from (
select sql_id, version_count, executions, child_latch from v$sqlarea
order by 2 desc)
where rownum<=20;

SQL_ID        VERSION_COUNT EXECUTIONS CHILD_LATCH
------------- ------------- ---------- -----------
5w93sms4xr70b         25775      59303          42
4upsrgs9r0z8h         19982      41372          10
cpk0xaggwc9q7         15316      41929          56
fx0662kfns5mj         12060      23058          17
c8brg2z6ytxjk          8991      19637          21
6t87d75m0hby0          8863      46765          37
cnsttwq7y79tc          8549      42953           7
76hw2b3ujar2z          8290        988          61
dh8yhs5pjh0dt          7670      10890          28
8xjwbmfsn0s2y          7057        996          53
dfp62t3qy92fm          6176     192357          26
2d1p0dmyaapv3          4718      16237          64
0a8bvn7zn8u26          4026       9192          39
a2qn47hh1ypqk          3982       7017           8
c1dq1hg17y3xg          3520       2255          62
08q94ja2kmqs3          3304       6143           2
4gm2wn99nk1mk          3190       6566          15
6s2cp8hk4bych          2907      48162          20
8z1b906qc3tnx          2752       6060          47
bg5gftj0r3s4m          2630      10283          52


20 rows selected.

The text of 1st Query from the above list is as under :

SQL_ID : 5w93sms4xr70b
Module : GI6_03
Sharable_mem : 337,369,920  

SELECT :"SYS_B_0" FROM GIT_LOP_COVER 
WHERE 	ROWNUM < :"SYS_B_1" 
AND 	LOP_SYS_ID = :"SYS_B_2" 
AND 	END_N_IDX = :"SYS_B_3"
AND 	END_S_NO  = :"SYS_B_4" 
AND 	ATTACH_L_NO = :"SYS_B_5" 
 
OWNER             PAR   NUM_ROWS     BLOCKS LAST_ANALYZED        GLO T
----------------- --- ---------- ---------- -------------------- --- -
LOP_PROD          NO    86222715    4334060 02-Sep-2008 10:05:26 YES N
 
 
COLUMN_NAME       NUM_DISTINCT  NUM_NULLS    DENSITY HISTOGRAM
----------------- ------------ ---------- ---------- ---------------
LOP_SYS_ID            12529993          0 1.8931E-07 HEIGHT BALANCED 
END_N_IDX                   23          0 5.8944E-09 FREQUENCY
END_S_NO                     1          0 5.8944E-09 FREQUENCY
ATTACH_L_NO                  3   11880179 6.8543E-09 FREQUENCY

The Query uses Equality Predicates and we have already seen the issue due to Histograms and Cursor_Sharing SIMILAR. To validate this further, check for the column statistics, as shown above, for the queried table and it uses Histograms on the columns used in this query.

SOLUTION

To resolve this latch contention issues, the default method_opt was changed to ‘FOR ALL COLUMNS SIZE 1’ and the statistics were recollected. This gave some relief, but the version count for some of the queries were still high due to Range Predicates. Therefore, CURSOR_SHARING was set to FORCE. After this change, the version_count dropped significantly and the benefit can be seen from the statistics pasted below.

Optimizing Performance Part 24

With Multiple Version Counts, Hard Parses were more than 100 per second. After changing from SIMILAR to FORCE, it came down to 60. Further implementation of Bind Variables, in place of Literals, brought this down to less than 10.

Production Statistics, post optimization


Optimizing Performance Part 25

In the next blog, I would write on another test case alongwith a real life example on the issue due to Improper Implementation of an Oracle feature that caused a severe Row Cache Latch Contention. This discussion was also a part of my Day 2 presentation at AIOUG.

Advertisements

About Vivek Sharma
I am an Oracle Professional from Mumbai (India). I blog on the issues that I feel is Interesting for my readers. Some of these are my real life examples, which I hope, you would find interesting. Comments are always a welcome. The Technical Observations & Views here are my own and not necessarily those of Oracle or its affiliates. These are purely based on my understandings, learnings and resolutions of various customer issues.

2 Responses to Library Cache Latch Contention due to Multiple Version Count – Day 2 of AIOUG

  1. sheshu says:

    I did similar testing on 9i for the case of range predicates and no histograms. Oracle does not generate many versions if u collect stats on all the indexes columns only.

    Like

  2. Anand says:

    Hi Vivek,

    I am having a 2-node RAC database.The database is on 10g (10.2.0.3) and stats are gathered with histogram( default method_opt.Cursor_sharing parameter is set to similar.Checking for the version_count, most of the select queries with equality predicate are showing very high version_count value (somewhere in 1000s).The database is having performance issue, though library cache latch contention is not much,neither is top 5 waits.During the peaks hours gc buffer busy is in top 5 events.

    Will testing the application with method_opt => ‘FOR ALL COLUMNS SIZE 1’ help in this case??

    Regards,
    Anand

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s