The Power of System Statistics / CPU Costing

An example from my blog on “DBA – Developer Relationship! A fine example of collaboration”

On 20th April 2009, I wrote a blog on the DBA-Developer relationship and the collaborative effort to optimize the performance of a business-critical batch process. In that blog, I mentioned the database version, which was 8.1.7.4. I was writing an article for the All India Oracle User Group (AIOUG) journal "Oracle Connect" and thought of sharing this experience with the user group. The article, titled "Efficient Indexing Strategy", concentrated mostly on the issues caused by the number of rows fetched by an index and finally discarded during the table fetch.

As an example, assume a simple query on a table EMP with an index on ENAME (and no index on DEPTNO). If I execute a query to check for ENAME='VIVEK SHARMA' in DEPTNO=10, the optimizer might opt for the index on ENAME. Further assume that, since VIVEK SHARMA is a very common name, the index scan fetches around 30 rows, but there is only 1 VIVEK SHARMA in DEPTNO 10. Therefore, when the additional predicate is applied during the table access, 29 rows from the index fetch are discarded, and these 29 rows are the throw-away rows. Efficient indexing eliminates these throw-aways, as they impact the performance of a production system.
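
To make the scenario concrete, here is a minimal sketch of the index and query described above, using the usual EMP sample-schema names:

create index emp_ename_idx on emp (ename);

select ename, sal, deptno
from   emp
where  ename  = 'VIVEK SHARMA'
and    deptno = 10;

-- EMP_ENAME_IDX can satisfy only the ENAME predicate; the DEPTNO filter is
-- applied only after the table rows have been fetched, so the ~29 rows that
-- fail it are the throw-away rows.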

While writing this article for AIOUG, I came across an interesting and amazing optimization that is not available in Oracle 8i, or in 9i without system statistics. With system statistics, the way indexes are costed and scanned has changed, and if you create a test case and run it across each of these versions, including Oracle 10g, you will notice this change.
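
For reference, workload system statistics are collected with DBMS_STATS; a minimal sketch of the collection would be:

-- Capture workload system statistics over a representative period
exec dbms_stats.gather_system_stats('START');
-- ... let a representative workload run for a while ...
exec dbms_stats.gather_system_stats('STOP');

-- The collected values (CPU speed, single and multi block read times) can be seen in
select sname, pname, pval1 from sys.aux_stats$;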

Again, since my previous blog is inaccessible, let me post the content of that blog here, and then we will discuss the change implemented after the introduction of system statistics.

The content of my Blog Posted on 20th April 2009 starts here.

Thomas Kyte wrote a very good paragraph on the DBA-Developer relationship in his book "Expert One-on-One". He rightly says that this relationship should be healthy. During my AIOUG presentation, I had a slide on this topic and went on to explain the importance of this relationship. I have come across many performance issues that arise because developers work in isolation, without bothering much about the performance of their application queries, which indirectly means leaving the job of optimization to the DBAs. This impacts the scalability of the entire production system. It also means that the application queries have to be revisited for optimization, based on the DBAs' recommendations.

Let me present a real-life example where a discussion with the developer helped me optimize the performance of a critical process. Recently, I was working on a Severity 1 performance issue. The problem summary was overall slowness at the database level, with the main contributors being hard parses and huge, resource-intensive queries. Surprisingly, the database version was 8.1.7.4.

While optimizing the performance of a critical process, we came across a top query. The entire process used to take almost 11 hours, of which this problematic step took 8 hours, or sometimes failed with ORA-1555. Therefore, it was evident that optimizing this step would bring down the completion time of the entire process. This particular step had three different queries and, based on inputs from the developers and 10046 trace files, their contributions were 7 hours, 25 minutes and 25 minutes respectively. Clearly, it is the first query that takes a significant amount of time.
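
For readers who have not used it, the extended SQL trace (event 10046) referred to above can be enabled at level 12 (binds and waits) roughly as follows; the trace file name and tkprof sort options shown are only examples:

alter session set events '10046 trace name context forever, level 12';
-- ... run the problematic step ...
alter session set events '10046 trace name context off';

-- The trace file written to user_dump_dest is then formatted with tkprof, e.g.
-- tkprof ora_12345.trc ora_12345.prf sort=exeela,fchela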

Query Text

SELECT opnamt_gl, entdate, argdate
FROM ORDER_ALL
WHERE customer_id = :b1
AND ((invtype = 5 AND status IN('IN','CM'))
OR (status IN('FC','CO')))
AND opnamt_gl != 0;   -- this last predicate is opnamt_gl not equal to 0

This query executes in a loop and processes almost 1 million customers. There is a composite index on (customer_id, status, invtype) and the query takes almost 2 seconds per customer. Based on this data, the next two queries are executed. The table ORDER_ALL is 16 GB in size. Since the last two columns of the index are used in OR / IN predicates, the index scan is effectively done on customer_id alone. The query fetches only 1 row for each customer.
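
For clarity, the existing composite index would have been defined along these lines (the index name is an assumption):

create index order_all_cust_idx on order_all (customer_id, status, invtype);

Because STATUS and INVTYPE appear only inside OR / IN branches, the range scan can use CUSTOMER_ID alone as its access predicate; the other two columns act, at best, as filters.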

A Casual Discussion that Helped Optimize this Code

While we were discussing this with the developer of this process, he mentioned that the query takes 2 seconds on this database, whereas on another database it takes 31 milliseconds. I casually asked him to select all the columns used in the query, comment out all the other predicates in the WHERE clause, and execute it on both databases. The query executed was:

SELECT opnamt_gl, entdate, argdate, invtype, status, opnamt_gl
FROM ORDER_ALL
WHERE customer_id = :b1;

Though the complete query fetches only 1 row on both databases, when this modified query was executed on each of them it revealed that, on the problematic database, it fetches approximately 115 rows per customer against 25 rows on the other. It is this difference that consumes the resources and the additional time.

This 115-row fetch was taking 2 seconds per customer, and that was not acceptable. The challenge was to reduce the processing time for each customer. As an optimization exercise, we executed the query with only the indexed column, i.e. customer_id, in the WHERE clause and got 115 rows. I concentrated on the values in the other columns used in the WHERE clause of the original query and was surprised to see that only 2 rows had a non-zero opnamt_gl, while the original query has the predicate opnamt_gl != 0. When asked, the developer replied that for any customer there will be only 1 or 2 rows with a non-zero value, and this input helped me recommend an interesting optimization technique.

As mentioned earlier, the query fetches only 1 row, which means that out of the (approx.) 115 rows for each customer, 114 rows are discarded. TKPROF also showed this behaviour. Therefore, instead of discarding 114 rows, why not get 2 rows from an index and discard just 1 row? By now, the developer understood that we were planning to recommend a new composite index on (customer_id, opnamt_gl), and before we recommended anything he himself mentioned that they had tried creating this index but the response time did not improve. The reason they did not see any performance gain was the condition opnamt_gl != 0. The recommendation provided here gave a major performance relief. The steps involved in optimizing this piece of code were:

  1. A Function based Index.
  2. Create a View. This is an additional step since the db version is 8174.
  3. Modify the query so that it queries the view.

We primarily targeted the conditions

WHERE customer_id = :b1 AND opnamt_gl != 0;

We created a function-based index on (customer_id, case when opnamt_gl=0 then null else 'Y' end). Since the db version is 8174, CASE cannot be used in a PL/SQL block, therefore we created a view as:

create view ORDER_ALL_VW as
SELECT opnamt_gl, entdate, argdate, customer_id
FROM ORDER_ALL
WHERE ((invtype = 5
AND status IN ('IN','CM'))
OR (status IN ('FC','CO')))
AND (case when opnamt_gl=0 then null else 'Y' end) = 'Y';

Then the Query was modified as

SELECT opnamt_gl, entdate, argdate FROM ORDER_ALL_VW WHERE customer_id = :b1;
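
For reference, the function-based index behind this view would have been created roughly as follows (the index name is an assumption; in 8i the QUERY REWRITE privilege and QUERY_REWRITE_ENABLED=TRUE are also needed for function-based indexes to be created and used):

create index order_all_cust_fbi on order_all
  (customer_id, case when opnamt_gl = 0 then null else 'Y' end);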

With this optimization, the query response time came down to 3 msec per customer, and the completion time of this query for 1 million customers came down from 7 hours to 7 minutes. This optimization also ensured that the process would not fail with ORA-1555.

Definitely, a discussion on the optimization strategy with the developer helped us optimize the performance of a critical query. Developers know their data well and, in my view, this recommendation should have come from the developers. For that, developers need to be well conversant with database features and explore them while writing application queries.

The content of my Blog Posted on 20th April 2009 ends here.

Starting with Oracle 9i, with system statistics collected, and in Oracle 10g (where CPU costing is mandatory), a normal composite index on (customer_id, opnamt_gl) helped optimize the performance of this query. The issue without system statistics is that the optimizer ceases to use an index for a search on a NOT EQUAL TO predicate, and therefore, in 8174, even creating a normal index yielded no performance benefit.

While writing for the AIOUG journal, I created a test case to check the performance of this query across each version. In Oracle 8i, the query was doing huge numbers of logical reads; in Oracle 9i (with system statistics) and in 10g, the I/Os dropped drastically. In Oracle 9i without system statistics, the behaviour was the same as in Oracle 8i.
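
A rough sketch of such a test case (the table, data distribution and volumes here are made up purely for illustration) would be:

create table t as
select mod(rownum, 100000) + 1                                customer_id,
       case when mod(rownum, 100) = 0 then rownum else 0 end  opnamt_gl,
       rpad('x', 100, 'x')                                    padding
from   all_objects a, all_objects b
where  rownum <= 1000000;

create index t_idx on t (customer_id, opnamt_gl);
exec dbms_stats.gather_table_stats(user, 'T', cascade => true);

set autotrace traceonly statistics
select * from t where customer_id = 1234 and opnamt_gl != 0;

-- Compare the consistent gets on 8i, on 9i before and after
-- dbms_stats.gather_system_stats, and on 10g.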


Rowid based Parallelism (Manual) v/s Oracle Parallelism !

This blog was one of my most appreciated write-ups, with a good number of hits. Since my previous blog URL is inaccessible and I want my readers to have access to this post, I thought of re-posting it. Another reason to re-post it is to make DBAs, and more importantly developers, aware of a very effective and failure-free strategy for accessing a huge table.

A background on a recent Purging Activity (the reason behind re-posting my previous blog)
On 17th August, I wrote on "Efficient Purging", which discussed purging data from a 60 GB table. That purging was achieved in a very efficient and timely manner during my visit to a customer in July 2009. Recently, I came across a similar purging activity on a partitioned table of 75 GB in size. The strategy I used in this case was Approach 1, as mentioned in my previous blog, the reason being that rows from all the partitions had to be accessed and therefore CTAS seemed to be the better approach. Since I worked on two purging activities, blogs on these two topics were worth posting.

The recent purging activity was for one of my customers. August 15th is a National Holiday in India, and this year it coincided with a long weekend, so the customer took a downtime of 24 hours to purge the data from a set of tables. At the very last moment, changing the strategy seemed non-feasible, therefore the customer went ahead with the pre-decided strategy of a NORMAL delete. While 90% of the tables were small in size, 8-9 tables were huge, especially one table that was 75 GB in size. This was a partitioned table. The biggest challenge in this activity was maintaining data consistency. This means that once the activity starts and data from some of the tables is purged and committed, the activity either has to be completed 100% or rolled back. Therefore, before the activity, a backup of the entire database was taken.

Once you know the strategy used by the application team to purge the data, you will see the link between my previous blog and this one.

Strategy used and planned by the Application Team
The strategy planned by the Application Team for the huge tables was:

  1. Populate 5 Tables with the ROWID of the rows to be Purged.
  2. Drop all the Indexes from the to-be-purged table.
  3. Delete from the Table from 5 different sessions.
  4. Re-Org the Table and Create all required Indexes.

Each of the 5 tables, as mentioned in Step 1 above, was populated with "Total Rows to be Purged" / 5 rows, with the motive of achieving parallelism. These tables were populated before the downtime. Once they were populated, the DELETE statement was executed from 5 different sessions. Clearly, this logic is very resource intensive, time consuming and an inefficient way of achieving the goal, the reasons being:

  1. The to-be-purged table, a huge table, is scanned five times.
  2. Each of these scans were Full Table Scans.
  3. Delete generates heavy redo and is not an efficient way to delete large number of rows.
  4. Table re-org and Index Creation will be required after the activity.

The most inefficient step was the full table scan of a huge table, and this is where the activity reached a stage where it was on the verge of being rolled back. The problem occurred for the 75 GB table, and at this stage Approach 1, i.e. creating a new table with the data to be retained, was adopted. The DELETE statements from 5 different sessions, which had already run for almost 4-5 hours, completed in less than 2 hours using the CTAS approach.

Once the activity was over, the customer requested a so-called learnings meeting to discuss why and where the activity went wrong. While the conceptual issues with a NORMAL delete, as mentioned in my previous blog and in this one, were explained, the step that took most of the time, and seems to be one of the reasons for the failure, was the way parallelism was adopted by the application team. It was during this review discussion that ROWID parallelism was explained to the application team as an efficient way of spawning multiple sessions. The benefit of ROWID parallelism is that it splits a huge table into smaller chunks, and each session works on its own chunk. ROWID parallelism would also have been a better choice here, but it needs to be adopted based on the resources available: if I am short of resources, then instead of using Oracle parallelism, manual parallelism is the better choice, as it eliminates the chances of failure. This is well explained in my ROWID-based parallelism blog posted on my previous blog URL.

For the benefit of my readers, I am posting it again on this site; it explains the way I efficiently used this technique on a huge table and on a low-capacity server.

The Extract from my BLOG

Recently, I had come across an interesting challenge of purging data from a 500 GB table. This was not a plain purge, but was based on a join and filtering on two other tables. The reason I term it a challenge is that the database was on a low-end server with 12 CPUs and 52 GB RAM. The original query used for purging is as under:

create table wia_values_new_1 nologging parallel 24
tablespace A_TS_IDX as
select a.*
from   wia_values a,
       (select a.it_type, a.it_key, b.name
        from   wi a, wia b
        where  a.begin_date > '01-NOV-2007'
        and    a.it_type = b.it_type) vv
where  a.it_type = vv.it_type
and    a.it_key  = vv.it_key
and    a.name    = vv.name;

To accomplish this task, it was obvious that the queries had to be run with parallel processes. On a 12 CPU machine, the nested loop join took more than 10-12 hours and had to be aborted, as this much time was unacceptable. With a hash join, the query used to fail with "unable to extend temp segment" (ORA-1652) after 3-4 hours of execution, which meant restarting the entire process.

From Oracle 9i onwards, one can use resumable space allocation (DBMS_RESUMABLE) to take care of these errors. Once the space issue is fixed, the statement continues from the point where it was suspended.
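
As a sketch, resumable space allocation can be enabled for such a long-running statement like this (the timeout and name are arbitrary examples):

alter session enable resumable timeout 7200 name 'purge of wia_values';
-- If the statement now hits a space error, it suspends (visible in dba_resumable)
-- instead of failing, and resumes once the space is added.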

Finally, I recollected a very good methodology called ROWID parallelism, introduced by Thomas Kyte. I had read about it long back in his (and one of my favourite) books, "Effective Oracle by Design", but this was the first time I practically used and implemented it. Believe me, with this parallelism I got a tremendous performance benefit and could accomplish this challenging task in less than 2 hours with great ease.

Unfortunately, I could not use this on production because, given the challenge involved, the customer provided us with additional CPUs and memory, and a nested loop query did the purge in 3 hours. But, based on the comparison, rowid parallelism could have achieved the task in less than an hour and a half.

The steps that I performed were as under :

  1. Split the table into multiple chunks, based on the degree of rowid parallelism to be used. In my case, the extents of the 500 GB table were scattered across 208 datafiles, hence I split the table into 200 chunks.
  2. Create a Job_Table to store the low and high rowids of each chunk. This table also stores some additional data that can be used later.
  3. Spawn and schedule parallel jobs using dbms_job. Each job processes its own chunk.
  4. Merge the data into one single table.

The advantages of using this parallelism are:

  1. It is easy to split the table into multiple chunks.
  2. It is easier to manage, based on the CPU utilization.
  3. If any one of the scripts or jobs fails, it is easier to restart that single job rather than restarting the entire process all over again.

These are the queries that were used to accomplish this purging task using ROWID Parallelism.

Job_Table

create table job_table(
job_id		int primary key,
lo_rid		rowid,
hi_rid		rowid,
completed_yn	varchar2(1),
rows_processed	number(20),
start_time	date,
end_time	date);

Splitting the table into multiple chunks and storing their rowid ranges in job_table

insert into job_table(job_id, lo_rid, hi_rid)
select  grp, dbms_rowid.rowid_create(1,data_object_id, lo_fno, lo_block,0) min_rid, 
	dbms_rowid.rowid_create(1,data_object_id, hi_fno, hi_block,0) max_rid
from	(select  distinct grp, first_value(relative_fno) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) lo_fno, 
	first_value(block_id) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) lo_block, 
	last_value(relative_fno) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) hi_fno, 
	last_value(block_id) over (partition by grp order by relative_fno, 
	block_id rows  between unbounded preceding and unbounded following) hi_block, 
	sum(blocks) over (partition by grp) sum_blocks 
	from	(select  relative_fno, block_id, blocks, sum(blocks) over (order by relative_fno, block_id) cum_blocks, 
		sum(blocks) over () tot_blocks, trunc((sum(blocks) over (order by relative_fno, block_id)-0.01) / (sum(blocks) over ()/200)) grp
		from	dba_extents
		where	segment_name='WIA_VALUES'
		and	owner=user
		order by relative_fno, block_id)),
		(select data_object_id from dba_objects
		where	owner=user
		and	object_name='WIA_VALUES');
commit;

Now I have 200 rows in this table with low and high rowids that take care of the entire table.

GRP MIN_RID            MAX_RID
0   AABDksAALAAAAAJAAA AABDksAB2AAB9f5AAA         
1   AABDksAB2AAB9gZAAA AABDksACkAACSFJAAA         
2   AABDksACkAACSFZAAA AABDksADSAAAsxJAAA         
3   AABDksADSAAAsx5AAA AABDksADzAACT6pAAA         
4   AABDksADzAACT7JAAA AABDksAEkAAAADJAAA         
5   AABDksAEkAAAADpAAA AABDksAFMAAAKFZAAA         
6   AABDksAFMAAAKFpAAA AABDksAGmAAAYVJAAA         
7   AABDksAGmAAAYV5AAA AABDksAHTAAAnsJAAA         
8   AABDksAHTAAAnsZAAA AABDksAH6AACS0pAAA         
9   AABDksAH6AACS1JAAA AABDksAIeAABo65AAA        
10  AABDksAIeAABo+5AAA AABDksAJRAAAwmZAAA        
11  AABDksAJRAAAwnZAAA AABDksAJ1AACDppAAA

I created 200 tables, into which the parallel jobs will insert the required rows.

declare
l_statement varchar2(200);
begin
  for i in 0..199
  loop
    l_statement := 'create table wia_new_'||i||' tablespace A_TS_IDX as select * from wia_values where 1=2';
    execute immediate l_statement;
  end loop;
end;

Then I created a database procedure that is executed in parallel, with the job_id as its input. Based on this input job_id, the insert statement picks up the low and high rowids and does a rowid range scan of the table. Based on the status (completion or failure), job_table is updated.

create or replace procedure p_job(p_job in number) as
l_job_table  job_table%rowtype;
l_statement  varchar2(2000);
l_rows       number(20) := 0;
l_date       date;
begin
  -- pick up the rowid range for this chunk; skip it if it is already
  -- in progress ('P') or completed ('Y')
  begin
    select * into l_job_table from job_table
    where  job_id = p_job
    and    nvl(completed_yn,'N') != 'P'
    and    (completed_yn != 'Y' or completed_yn is null);
  exception when no_data_found then
    return;
  end;
  execute immediate 'alter session set db_file_multiblock_read_count=128';
  select sysdate into l_date from dual;
  update job_table set completed_yn = 'P', start_time = l_date where job_id = p_job;
  commit;
  -- rowid range scan : pick up only the rows that fall in this chunk
  l_statement := 'insert /*+ append */ into wia_new_'||p_job||
                 ' select a.* from wia_values a,'||
                 ' (select a.it_type, a.it_key, b.name from wi a, wia b'||
                 '  where a.begin_date > ''01-NOV-2007'' and a.it_type = b.it_type) vv'||
                 ' where a.it_type = vv.it_type'||
                 ' and   a.it_key  = vv.it_key'||
                 ' and   a.name    = vv.name'||
                 ' and   a.rowid between '''||l_job_table.lo_rid||''' and '''||l_job_table.hi_rid||'''';
  begin
    execute immediate l_statement;
    l_rows := sql%rowcount;
    select sysdate into l_date from dual;
    update job_table set completed_yn = 'Y', rows_processed = l_rows, end_time = l_date
    where  job_id = p_job;
    commit;
  exception when others then
    update job_table set completed_yn = 'F', start_time = null where job_id = p_job;
    commit;
  end;
end;

DBMS_JOB is used to schedule the jobs at regular intervals, based on the CPU utilization.

declare
l_jobno number;
BEGIN
  for i in 0..30
  loop
    DBMS_JOB.SUBMIT(l_jobno, 'p_job('||i||');', trunc(sysdate)+16/24+00/(24*60));
  end loop;
  COMMIT;
END;

Based on the utilization, I spawned 30 parallel jobs. Each job took 5-7 minutes to complete a rowid range scan of wia_values, as I observed from v$session_longops. Once the rowid range scan was complete, the hash join and insertion took another 10 minutes, so one job took almost 20-25 minutes. Hence, I scheduled the next set of 30 jobs at a 10-minute interval. The overall start and end times, as an example, were:
10:00 – 10:25 30 Jobs,
10:10 – 10:35 30 Jobs,
10:35 – 11:00 30 Jobs (After 100% Completion of Previous 60 Jobs).
10:45 – 11:10 30 Jobs,
11:10 – 11:35 30 Jobs (After 100% Completion of Previous 60 Jobs)
11:20 – 11:45 30 Jobs
11:45 – 12:10 20 Jobs (Remaining)
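
Progress was easy to track from job_table and v$session_longops; queries along these lines (a sketch) can be used:

select nvl(completed_yn,'N') status, count(*) chunks, sum(rows_processed) total_rows
from   job_table
group  by nvl(completed_yn,'N');

select sid, message, time_remaining
from   v$session_longops
where  time_remaining > 0;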

A Sample output after completion of all the scripts…

JOB_ID	LO_RID             HI_RID             C	PROCESSED	START_TIME		END_TIME
0	AABDksAALAAAAAJAAA AABDksAB2AAB9f5AAA Y	29429239	02-03-2008 16:00:06	02-03-2008 16:19:26         
1	AABDksAB2AAB9gZAAA AABDksACkAACSFJAAA Y	208119		02-03-2008 16:00:06	02-03-2008 16:15:51         
2	AABDksACkAACSFZAAA AABDksADSAAAsxJAAA Y	199466		02-03-2008 16:00:06	02-03-2008 16:13:23         
3	AABDksADSAAAsx5AAA AABDksADzAACT6pAAA Y	10776162	02-03-2008 16:00:06	02-03-2008 16:18:48         
4	AABDksADzAACT7JAAA AABDksAEkAAAADJAAA Y	18897287	02-03-2008 16:00:06	02-03-2008 16:19:03

Conclusion : Oracle parallelism uses the same rowid-range mechanism to accomplish a task, but manual parallelism works well when you are short of resources, as I was in the case of CPUs and temporary space. Though it requires some manual work, it makes the task easier and faster.

END OF THE BLOG

How do I Purge Data Efficiently ?

Purging data from a table or a set of tables is a common practice and, in most cases, is done for performance reasons. "How do I efficiently purge the data?" is the question normally asked, the aim being to minimize the time the purge takes. While there are various ways to purge data from a table, the time it takes depends on various factors, such as:

  1. Strategy Selected
  2. Amount of Data to be Purged

In cases where purging is a frequent, regular activity, partitioning is an efficient solution, but then performance needs to be evaluated. In most cases, implementing partitioning for maintenance needs some compromise on performance, as both cannot always be achieved together. Moreover, in my opinion, partitioning in an OLTP system will not yield much performance benefit, as OLTP queries are very lightweight and fetch very little data.
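
Where partitioning is already in place and aligned with the purge criterion, the purge itself reduces to partition DDL; a sketch (the table and partition names are assumptions):

alter table sales_hist drop partition p_2004_12 update global indexes;
-- or, to keep the partition definition but remove its rows:
alter table sales_hist truncate partition p_2004_12;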

One-time purging, for performance, means getting rid of data that is not accessed and can therefore be safely removed. For such a one-time activity, partitioning purely for purging is obviously not a solution. In this case, one of the following strategies is commonly used, and each has its own characteristics:

  1. Normal DELETE
  2. CTAS or INSERT INTO

While a normal DELETE should be used only for a very small percentage of the data, creating a new table with the data to be retained seems to be the best strategy for larger volumes. Again, the size of the table should also be taken into consideration before opting for any strategy. I will refrain from specifying any percentage (of data to be purged) at which to switch strategy, on the basis that there is no rule of thumb for it. The deciding factors have to be:

  1. Your own knowledge about the object
  2. The time and resources it takes to delete the data
  3. Whether a normal delete requires any object (table and indexes) re-org
  4. The time and resources it takes to create a new table and its corresponding indexes

Recently, one of my customers called upon me to seek my opinion on a scheduled purging activity and to check the feasibility of optimizing the planned strategy.

SCENARIO
Data was to be purged from a 50 GB partitioned table. The table is range partitioned, with each partition holding one month's data. The partitions run from December 2004 to July 2009, and the data is to be purged, based on a WHERE condition, up to March 2008. All the data from April 2008 onwards is to be retained. So, of the total 68 million rows till March 2008, 54 million rows have to be deleted. The total number of rows in the table, as of the current date, is 138 million.

Assuming this to be a simple purge, the application team came up with the strategy of a NORMAL delete, followed by a re-org. The database size was not huge, and the server was a 4 CPU machine with 16 GB RAM. The delete statement to be used and the planned strategy for this activity were:

delete /*+ parallel(a,16) */ from itxn a
where	a.trdt <= to_date('31-Mar-2008 23:59:59','DD-Mon-YYYY HH24:MI:SS')
and	a.chid = 'TM';

  1. Drop all the Indexes : To Speed up the Deletion
  2. Delete from the Table : Expected to take 90 Minutes
  3. Table Re-Org : 30 Minutes
  4. Index Creation : 60 Minutes

The total downtime expected from the above timelines is approximately 3 hours. Looking at the strategy, I could see a waste of resources (and time) and duplication of some steps. Eliminating this duplication, as explained in this blog, would optimize the purging process and accomplish the task in minimal downtime. For me, the 90 minutes for the deletion is redundant and needs to be eliminated. Another step that seems redundant is the index creation but, to speed up the purging, I thought of retaining this step and still performing the activity in less than 1 hour.

OPTIMIZATION
The critical statistics for optimizing this purging process, as mentioned in the Scenario section above, are:

  1. We have a table with Monthly Partitions
  2. Total Partitions : 56 (Dec 2004 to Jul 2009)
  3. Purging from 40 Partitions (Dec 2004 to March 2008)
  4. 16 Partitions to be untouched (April 2008 to Jul 2009)
  5. Total Rows in the Table 138 Million and to Purge 54 Million (39%)
  6. Out of 138 Million Rows, 68 Million rows are from Not-to-be-touched Partitions (49%)

Based on these inputs, I had the following two options, and the one that scores over the other is the second approach.

APPROACH 1

  1. Create a New Partitioned Table with the data to be retained.
  2. Rename the OLD Table and all its Indexes
  3. Rename the New Table to ORIGINAL
  4. Create all necessary Indexes on the new Table.
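
In outline, Approach 1 would look roughly like this (the names, partition bounds and retained-rows predicate are assumptions based on the scenario above):

create table itxn_keep
partition by range (trdt)
(partition itxn_part_dec2004 values less than (to_date('01-Jan-2005','DD-Mon-YYYY')),
 -- ... one partition per month ...
 partition itxn_part_jul2009 values less than (maxvalue))
as
select * from itxn a
where  a.trdt > to_date('31-Mar-2008 23:59:59','DD-Mon-YYYY HH24:MI:SS')
or     a.chid != 'TM';

alter table itxn      rename to itxn_old;
alter table itxn_keep rename to itxn;
-- then recreate the required indexes on the new ITXN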

APPROACH 2

  1. Create a New Non-Partitioned Table.
  2. Drop all the indexes from the to-be-purged table. (optional but was done)
  3. Insert the data to be retained from one partition at a time into the New Non-Partitioned Table.
  4. Exchange Partition.
  5. Create all necessary Indexes on the to-be-purged table.

The advantage of Approach 2 over Approach 1 is that the 49% of rows in the not-to-be-touched partitions are not accessed at all; therefore this second approach takes roughly half the time of Approach 1, which reads all the rows. Once this was finalized, it was just a matter of automating steps 3 and 4 of Approach 2. Automating them eliminated any errors due to manual intervention. The automated PL/SQL block is posted below:

create or replace procedure purge_process
as
cursor main is
select * from purge_job
where completed_yn is null
order by 1;
k_statement varchar2(30);
l_statement varchar2(1000);
p_count     number;
begin
/* Get the partition position of the last partition to be purged (March 2008) */
    select partition_position into p_count from dba_tab_partitions
    where table_owner='GPI'
    and table_name='ITXN'
    and partition_name='ITXN_PART_MAR2008';
    execute immediate 'truncate table purge_job';
    insert into ipgora.purge_job select partition_name, null from dba_tab_partitions
    where table_owner='GPI'
    and table_name='ITXN'
    and partition_position<=p_count;

 commit;

 dbms_application_info.set_module(module_name=>'Purging of ATM',action_name=>'Job Table');
    execute immediate 'alter session set db_file_multiblock_read_count=128';
    for i in main
      loop
       begin
         k_statement:=i.partition_name;
         dbms_application_info.set_module(module_name=>'Purging of TM',action_name=>k_statement);
         execute immediate 'truncate table ITXN_NEW';
         /* Insert the rows to be retained into the temp table. Work on one partition at a time. */
         /* Rows retained : trdt > 31-Mar-2008, or trdt <= 31-Mar-2008 with chid != 'TM' */
         l_statement:='insert /*+ append */ into ITXN_NEW select /*+ parallel(a,8) */ * from ITXN partition ('||i.partition_name||') a '||
                      'where (a.trdt <= to_date(''31-Mar-2008 23:59:59'',''DD-Mon-YYYY HH24:MI:SS'') and a.chid != ''TM'') '||
                      'or    (a.trdt >  to_date(''31-Mar-2008 23:59:59'',''DD-Mon-YYYY HH24:MI:SS''))';
         execute immediate l_statement;
         commit;
         l_statement:='alter table itxn exchange partition '||i.partition_name||' with table ITXN_NEW';
         execute immediate l_statement;
         update purge_job set completed_yn='Y' where partition_name=i.partition_name;
         commit;
       exception when others then
         update purge_job set completed_yn='F' where partition_name=i.partition_name;
         commit;
       end;
  end loop;
end;

On production, the purging took 18 minutes and the index creation took 45 minutes; therefore, the total downtime for the activity was approximately 1 hour. The steps involved, as per the procedure created, were:

  1. Create a purge_job table for monitoring the progress.
  2. Get the partition_position from dba_tab_partitions for the last partition to be considered; in this case, March 2008.
  3. Insert partition_name and NULL into the purge_job table.
  4. Truncate the temporary table created for this purpose.
  5. In a loop, pick one partition at a time and insert the rows to be retained into the temporary table.
  6. Exchange the current partition (from which the rows were fetched) with the temporary table. At this stage, the temporary table contains all the rows of the current partition, and the current partition contains only the rows to be retained.
  7. Update the purge_job table with a completed_yn status of 'Y'. This marks the current partition as completed.
  8. Proceed with the next partition and continue till the March 2008 partition, as fetched into the P_COUNT variable.

With this approach, the goal (purging) was accomplished in an efficient manner.

For unknown reasons, I am not able to access my previous blogspot URL http://viveklsharma.blogspot.com. There I have a blog post from March 2008 in which I used ROWID parallelism to accomplish a challenging purge from a 400 GB table in less than 3-4 hours. That was a non-partitioned table and, based on the available resources, rowid parallelism (as introduced by Thomas Kyte) was effectively used.

This is my new Blog URL

Hi,

I have moved from blogspot to wordpress.com. For unknown reasons, my previous blogspot site http://viveklsharma.blogspot.com is not accessible. I have tried contacting support to get the issue resolved, but have not yet received any response. I cannot keep away from writing, so I thought of switching over to this site.

While I will try to get my previous blogspot reactivated, if for any reason that does not happen then, for the benefit of my readers, I will repost some of the important and well-appreciated write-ups on this site.
