Wednesday, July 30, 2008

 

Scripts for loading up DBGEN TPC-H data

If you're interested in creating a TPC-H schema for testing purposes, then the following scripts may be of use to you:

Unix Scripts:
Multi Load TPCH
Load TPCH Stream

SQL*Loader Control files:
REGION Table
NATION Table
SUPPLIER Table
CUSTOMER Table
PART Table
PARTSUPP Table
ORDERS Table
LINEITEM Table

You may wish to read and check them before you use them - they're not exactly rocket science but they seem to do the job.

I have all the files in the same directory for simplicity sake.

I then use them to create a scale factor 1 target TPC-H schema using the following calls:

# "l" loads the REGION and NATION tables
./multi_load_tpch.sh 1 l "tpch/tpch@test" 1

# "s" loads the SUPPLIER table
./multi_load_tpch.sh 1 s "tpch/tpch@test" 10

# "c" loads the CUSTOMER table
./multi_load_tpch.sh 1 c "tpch/tpch@test" 10

# "p" loads the PART and PARTSUPP tables
./multi_load_tpch.sh 1 p "tpch/tpch@test" 10

# "o" loads the ORDERS and LINEITEM tables
./multi_load_tpch.sh 1 o "tpch/tpch@test" 10

Obviously, you need to change the connection string to match your environment.

Caveats:

  1. Obviously, they are supplied as is - use at your own discretion and risk.
  2. You need to have created the target schema tables and made sure they are empty as the SQL*Loader control files use APPEND.
  3. Bear in mind that choosing too high a number of parallel streams (the last parameter in the calls) will overload your machine so try and balance it against the available system resources.

Bugs, issues or questions, please get in touch...enjoy.

Labels: , , , ,


Tuesday, April 15, 2008

 

Oracle Optimized Warehouse Initiative (OWI)

I enjoyed a trip out of the office today with my manager. We went down to the Oracle Enterprise Technology Centre, in Reading, to hear about the Oracle Optimized Warehouse Initiative. It was basically a half day pitch from Oracle and Sun today, although there are other dates with different hardware vendors (IBM, HP and Dell).

It was an interesting day, although I didn't really hear anything new, per se. I think the main things I took away from the session were:

  • The availability of a set of reference configurations providing matrices which cover various permutations of user count, storage cost focus, warehouse size and hardware vendor.
  • The possibility of using the reference configurations either minimally, to simply cross check a proposed hardware specification for a given workload, to ensure it seems appropriate, or going the whole hog and using an "out of the box" reference configuration, delivered to your office, fully configured with all software installed, in a box, ready to run in your data.
  • Oracle are pushing RAC and ASM heavily in the DW space - no surprise there.
  • HammerOra and ORION are used by Oracle and the hardware vendors to assess the reference configurations...and there is nothing stopping you using them for your own benchmarking efforts

It was interesting to hear about the Proof Of Concept facility that Sun has, in Linlithgow, Scotland. The facility allows Sun (and an Oracle customer) to take a server off the production line and, working with the customer, test their workload on that machine to see if it's going to work or not. Neat, and since we're going to be using some Sun kit for a DW shortly, it sounds like an opportunity.

Funniest thing of the day for me, was the last slide in the pitch given by Mike Leigh of Sun which had the title "The Power of 2" and was illustrating the benefits to customers of Oracle and Sun as a united force. I didn't really take much notice, as I was too busy smiling, as I looked at the title and it made me think of Doug and his Parallel Execution and the 'Magic of 2' paper (the Magic of 2 bit actually being from this paper by Cary).

If you're building a warehouse, or just want to get an idea of whether your hardware is appropriate for the job, it's probably worth reading up on the OWI.

Labels: , ,


Tuesday, May 29, 2007

 

Is ASSM evil in a DW?

I saw an interesting article on ASSM from Howard the other day. I wanted to comment on it but unless I'm being a bit thick, it doesn't seem to be a blogpost and therefore I couldn't and instead, I thought I'd do it here.

As I said, I found the article interesting and informative but I did have a few "But what about...?" issues that popped into my head as I read through it.

It probably won't surprise anyone who reads my blog that my queries relate to the use of ASSM on a datawarehouse environment given my interest in that area.

Firstly, Howard asserted that ASSM wouldn't be helpful in scenarios where you recieved sorted data from a warehouse source which you wanted to load up into a target table in order that you could gain a performance benefit from creating an index on that data whilst taking advantage of the NOSORT option - thereby speeding up the index creation considerably. Nothing Howard said was untrue although there are a number of things that could be discussed in relation to this point...

Firstly, in my experience, most warehouse processing uses Oracle parallel execution for performance - but if one were to load up data into a target table using parallel processing, then even if it were sorted in the source, the use of parallel processing would result in the target being unsorted and hence this advantage would be lost. Of course you could resort to serial processing but would that be more efficient than doing the work in parallel and then sorting the data for the index build again using parallel processing techniques - each process is probably different but I guess what I'm saying is that it's possible that the NOSORT issue is not necessarily relevant in all cases especially those on a DW using parallel execution.

There are a number of caveats with the use of the NOSORT option such as:

You cannot specify REVERSE with this clause.
You cannot use this clause to create a cluster index partitioned or bitmap index.
You cannot specify this clause for a secondary index on an index-organized table.

Read more here

The use of Direct Path operations on a DW might also mean that the space wastage doesn't occur given Oracle constructs full blocks from clean ones rather than via free list management or the ASSM equivalent.

Howard suggested that tests he has undertaken showed a 1.5% excess space utilisation when ASSM was involved on a 1Tb table. A long time ago I would have been horified to lose 15Gb but these days I probably wouldn't be as concerned, particularly as I often find ways to save that kind of storage just as easily in other areas, e.g. Ensuring we use compression appropriately for tables and indexes and ensuring we only have indexes we need and which are of the right type and structure. If the solution to save that 1.5% were simple then I'd consider that too - not using ASSM sounds like a simple solution but I guess it depends on what the DBA standards are for the organisation amongst other factors.

Howard mentioned that with ASSM "wasting" space and using more blocks to store the same number of rows, that would mean a full scan would need to read more blocks for a given table and that the numerous, repeatedly accessed and therefore "hot", ASSM bitmap blocks could lead to other, "warm" data being aged out sooner than would otherwise be the case - It's a reasonable hypothesis and if we were full scanning a 1Tb table that had 1.5% or 15Gb of "waste" - I'm making a big assumption due to my lack of research here, in guessing that the 15Gb is largely made up of these ASSM bitmap blocks - then yes, 15Gb of blocks could have a considerable impact on a buffer cache...but that's assuming you'd be reading the whole 1Tb table via a full scan - one would imagine the table is partitioned by some reasonable key (hopefully including time) and that therefore you'd only be reading a much smaller subset with the consequent reduced effect on the buffer cache. The effect is there but it's impact would be system/query dependent.

Since the read (full scan) of the table itself would result in blocks going on the LRU end of the LRU list for the buffer cache then that wouldn't affect the caching of hot/warm data at all - it's just a matter of whether the ASSM bitmap blocks would be aging out other "warm" data to the detriment of overall system performance.

Whilst it raised a number of queries, it's definitely the case that the article Howard posted has raised quite a debate within the DBA team on the warehouse I'm currently working on...should be fun this week on the warehouse as we contemplate peforming a number of benchmarks to try and ascertain which way we should jump - if any!

On another note, even though he's moved on to pastures new, I think Doug Burns must have had something to do with the latest Sky TV adverts as the phone number they wanted you to call was 08702 42 42 42 - now that's what I call leaving a lasting impression!

Labels: ,


Thursday, February 01, 2007

 

Good to be back in the 21st Century...and identifying PEL mismatches

Well, I can honestly say, that moving house over the last few months has been an experience I don't wish to repeat for a good few years - if not decades! I've only just got my broadband enabled for the new place - hence my first post in ages - and that was a miracle given the state of what they very loosely call "customer service" at BT.

A few things I've learned in the process of moving house:

  • Never choose BT to assist you with any communications infrastructure
  • - phone line, VOIP, Broadband etc...the list of problems I've encountered whilst getting the simple task of telephone line and broadband installed has been literally staggering.
  • Never move in with your in laws when you are "between houses"...unless they are a nice bunch of people in which case for a short period of time (we spent 5 weeks there) it can be quite workable, even if there were some opinions at polar extremes to negotiate around at times - how exactly do you get a pro hunting person and a vegetarian to live happily under one roof?! Thankfully the Christmas spirit got us all through it and we're happily esconced in or new place.
  • Don't even think about using BT to get a phone line installed - it will take at least 3 weeks and they will probably lose your order at least twice.
  • Never trust Estate Agents - ours misinformed us about one particular aspect of our purchase and very nearly stopped it happening because of it - miscommunication is a highly dangerous thing.
  • Never ask BT to install your broadband - I won't go into massive details but basically they lost my first order and then handled my second one in an inept, dismissive and unprofessional manner...and believe me, if you've ever tried to use their IVR system to get through to the right person you'll understand just how frustrating it can be to be passed from pillar to post, explain your situation for five minutes only to be told that you have been talking to the wrong department (Exactly how I managed to get through to a Residential Sales person when I called a Business Sales line is quite beyond me!)
  • Take more care when you pack stuff away...otherwise you'll end up turning the new house upside down looking for things you can't find only to find they are tucked away in an incorrectly marked box...it would have helped if we'd been more involved with the packing people from the removals company in hindsight.

Now, I'm not going to just post a social thing as I quite understand the points being made recently on other blogs about "Sorry I've not posted in ages but...", so I thought I'd say something about problems when you're partition exchange loading. I built this routine a while back to help diagnose why we were unable to exchange partitions during a Partition Exchange Load process - it doesn't check every scenario but does cover a number of common ones such as:

  • IOT's with different overflow characteristics can't be exchanged
  • Hakan factors must be the same
  • Columns must be the same - in my view, that includes column names - this script will find issues like this one - even though it won't actually stop the PEL.
  • Constraints must be the same
  • Indexes must be the same - at least if they do if you are INCLUDING INDEXES in the PEL operation.

I'm sure it's not perfect but if anyone spots any issues with it then feel free to let me know and I'll check and fix the code.

We've found it quite useful - your mileage, as they say, may vary.

There are a number of links on this blog site that were pointing at my old web hosting site (www.oramoss.demon.co.uk) - that's dead now - I made the mistake of cancelling that and moving to BT - I mentioned that right? Anyway, I've fixed some of the posts and the presentation links - I'll fix anything else as I come across it.

Labels: ,


Wednesday, September 27, 2006

 

Partition exchange loading and column transposing issue

I came across an interesting issue yesterday whereby a partition exchange load routine wasn't working for my colleague Jon. The ensuing investigation seems to point at the possibility of a bug, whereby after exchanging a partition Oracle may have inadvertently transposed columns in the target table. I've created a test script to illustrate the problem and logged an SR with Metalink to see what they make of it.

In light of my recent post on DBMS_APPLICATION_INFO I'd better state that we found the problem on 10.2.0.2 on HP-UX and then tested it against 9.2.0.6 on HP-UX and found the same results so it seems the issue has been there a while in terms of Oracle releases.

The investigation started down a different avenue - as they often do - with the inability to complete a partition exchange due to the Oracle error:

ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION

Reissuing the exchange partition command with the EXCLUDING INDEXES clause allowed the exchange to proceed so we figured there was definitely a problem with the indexes.

Next we tried dropping all the indexes on both sides of the exchange and recreating them...but when we retried the exchange partition it failed again with the same error.

Right, drastic measures time...we dropped all the indexes on both sides and ran the exchange which, with no indexes around, worked fine (with INCLUDING INDEXES clause in place as it would normally be on this process).

Next we proceeded to add the indexes one by one, starting with the b trees for the primary key and unique keys. This worked without a problem so we moved on to the bitmap indexes which seemed to be fine for the first few but then after adding a few of them it suddenly stopped working. Through a process of trial and error we were able to determine that if two specific bitmap indexes were on the tables then the exchange would fail with ORA-14098.

So, what was so special about those two bitmaps and the columns they were on. We selected the details from dba_tab_columns for the two columns involved in these bitmap indexes and realised that the only difference was that they were at a different column position (COLUMN_ID) on the tables - which begged the question "If these two tables are not quite the same then how can we be allowed to exchange them?"

Well, I guess, in our case, we were able to say that the two objects being exchanged had the same columns but just not necessarily in the same order (those of you who love classic comedy will I'm sure be recalling Morecambe and Wise with Andre Previn right about now)...should this mean we can swap the tables or not ?

To try and illustrate the scenario I built a test script...the output from a run follows...sorry it's a bit long...but it does prove some useful conclusions (I think)



> @test_pel_bug_mismatch_columns
> REM Drop the tables...
> DROP TABLE jeff_test
/

Table dropped.

> DROP TABLE jeff_test_ptn
/

Table dropped.

> REM Create the tables...
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 NUMBER NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table...
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

> REM Count the rows in COLUMN2 and COLUMN3 for both tables...
>
> REM Should see 3 rows...all NUMBERS
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 12000
2 12000
0 12000

3 rows selected.

>
> REM Should see 4 rows...all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 5 rows...all NUMBERS
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 7200
2 7200
4 7200
3 7200
0 7200

5 rows selected.

>
> REM Should see 6 rows...all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Now lets try and swap the partition and the table...
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

>
> REM Surprisingly, it lets us do the above operation without complaining.
> REM Even worse...it transposes the values in COLUMN2 and COLUMN3...
>
> REM Should see 5 rows...but we see 6 rows
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Should see 6 rows...but we see 5 rows
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 7200
2 7200
4 7200
3 7200
0 7200

5 rows selected.

>
> REM Should see 3 rows...but we see 4 rows
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 4 rows...but we see 3 rows
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 12000
2 12000
0 12000

3 rows selected.

>
> REM Now, lets try again but with COLUMN2 and COLUMN3
> REM being of different datatypes...
>
> REM Drop the tables...
> DROP TABLE jeff_test
2 /

Table dropped.

> DROP TABLE jeff_test_ptn
2 /

Table dropped.

> REM Create the tables...
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 DATE NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 DATE NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table...
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , SYSDATE + MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , SYSDATE + MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

> REM Count the rows in COLUMN2 and COLUMN3 for both tables...
>
> REM Should see 3 rows...all DATES
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
-------------------- ----------
27-SEP-2006 11:49:03 12000
28-SEP-2006 11:49:03 12000
26-SEP-2006 11:49:03 12000

3 rows selected.

>
> REM Should see 4 rows...all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 5 rows...all DATES
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
-------------------- ----------
30-SEP-2006 11:49:03 7200
27-SEP-2006 11:49:03 7200
28-SEP-2006 11:49:03 7200
29-SEP-2006 11:49:03 7200
26-SEP-2006 11:49:03 7200

5 rows selected.

>
> REM Should see 6 rows...all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Now lets try and swap the partition and the table...
> REM It will fail with error...
> REM ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
>
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /
ALTER TABLE jeff_test_ptn
*
ERROR at line 1:
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION


>
> REM So, it only fails when the columns have the same datatypes.
>
> REM Now, lets say they are the same datatype and look at how bitmap indexes
> REM are affected in the PARTITION EXCHANGE process...
>
> REM First lets recreate the tables with COLUMN2 and COLUMN3
> REM having the same datatype...
> REM Drop the tables...
> DROP TABLE jeff_test
2 /

Table dropped.

> DROP TABLE jeff_test_ptn
2 /

Table dropped.

> REM Create the tables...
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 NUMBER NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table...
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <> COMMIT
2 /

Commit complete.

>
> REM Now lets create a bitmap index on COLUMN1 on both tables...
> CREATE BITMAP INDEX jtp1 ON jeff_test_ptn(column1) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt1 ON jeff_test(column1)
2 /

Index created.

>
> REM ...and now try PARTITION EXCHANGE...
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

> REM It works fine.
>
> REM Now lets create a bitmap index on COLUMN4 on both tables...
> CREATE BITMAP INDEX jtp4 ON jeff_test_ptn(column4) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt4 ON jeff_test(column4)
2 /

Index created.

>
> REM ...and now try PARTITION EXCHANGE...
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

>
> REM It works fine.
>
> REM Now lets create a bitmap index on COLUMN2 on both tables...
> CREATE BITMAP INDEX jtp2 ON jeff_test_ptn(column2) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt2 ON jeff_test(column2)
2 /

Index created.

>
> REM ...and now try PARTITION EXCHANGE...
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /
WITH TABLE jeff_test
*
ERROR at line 3:
ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION





Right, now what can we conclude from the above ?

1. If you don't have indexes on your table then you can do an exchange even if the columns are in a different order - but you will silently transpose the columns. No errors are given and I think this is a bug.
2. You will only hit the problem in 1 if the columns which are transposed are of the same datatype - I've not checked for scale/precision/length - it may be, for example, that a VARCHAR2(20) and VARCHAR2(10) being transposed would raise an ORA-14097 error.
3. If you have indexes on columns which are transposed then the exchange will fail with ORA-14098. I don't know whether this is a bitmap index specific thing as I've not tested it any further.
4. If you only have indexes on columns which are not transposed then you can do the exchange and there will be no errors - but your data is obviously transposed silently.

Labels:


Friday, July 14, 2006

 

TRUNCATE command marks previously UNUSABLE indexes as USABLE

Yes, I'm back...but on an adhoc basis as I still don't have enough time for this really! Yeah - I know, you're all in the same boat ;-)

I came across this problem on the warehouse the other day and was slightly surprised by what was going on...

We have an OWB mapping which does a TRUNCATE and then INSERT APPEND into a target table. The target table has some bitmap indexes on it. We run a little preprocess before calling the mapping to mark the bitmaps as unusable so it doesn't maintain them and then we (hope to) get better performance during the INSERT APPEND...we then rebuild the indexes afterwards more efficiently than maintaining them on the fly...sounds simple enough...except that performance was dragging...so I had a quick shufty and observed that the indexes were still marked as USABLE. I asked the guy running the stuff to check the run script and he said "No, we marked them as UNUSABLE before we ran the mapping - I'm sure of it".

That's strange I thought...time for some debug...which proved that we were setting the indexes UNUSABLE but somehow by the time the mapping was running they had miraculously been marked as USABLE.

I decided to create a simple test script to run in SQL*Plus which would be doing the same things as the OWB mapping - to try and repeat it at the SQL level and also to prove/disprove whether OWB was having some affect.

It's a useful thing to do this in any case as it's helpful if you're going to end up:

A) Testing it on other platforms/versions
B) Sending it to Oracle Support via an SR.

(Excuse formatting - either Blogger is crap or I don't understand how to get this to look nice -take your pick)


drop table j4134_test_bmi;

create table j4134_test_bmi(col1 number not null
,col2 number not null);

create unique index jtb_pk_i on j4134_test_bmi(col1);

alter table j4134_test_bmi add constraint jtb_pk primary key(col1) using index;

create bitmap index jtb_bmi on j4134_test_bmi(col2);

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

alter index jtb_bmi unusable;

alter table j4134_test_bmi disable constraint jtb_pk;

alter index jtb_pk_i unusable;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

truncate table j4134_test_bmi;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

insert /*+ append */ into j4134_test_bmi(col1,col2)
select rownum
, 100
from (select level l from dual connect by level <>

commit;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';


A section of the output of the script follows:

Index IndexName Type STATUS PAR
------------------------------ ------ -------- ---
JTB_PK_I NORMAL UNUSABLE NO
JTB_BMI BITMAP UNUSABLE NO

2 rows selected.

Elapsed: 00:00:00.06

Table truncated.

Elapsed: 00:00:00.11

Index IndexName Type STATUS PAR
------------------------------ ------ -------- ---
JTB_PK_I NORMAL VALID NO
JTB_BMI BITMAP VALID NO

Which shows that immediately after the TRUNCATE command was issued, the previously UNUSABLE indexes were suddenly marked as USABLE!

Why ? It's not documented anywhere that this would happen...so I created an SR with Oracle Support who initially created a Bug for it...but Development came back to say that it is "Expected behaviour". The reasoning Support give for this being expected is that when the Table is truncated, the associated Index segments are also truncated...and if they have no data in them then they can't possibly be UNUSABLE so they get marked as USABLE as a side effect of the process.

It's actually a reasonable assumption for them (Oracle development) to make to be fair - but it would have helped if it was documented - they're going to create a Metalink support note to document the issue.

Thanks to Doug for pushing me to blog this - I'll have to buy you a drink at the OUG Scotland Conference 2006!

Labels: , ,


Monday, February 20, 2006

 

RAC and data warehouses - Are they compatible ?

I watched one of the presentations of the Audio CD from UKOUG Conference 2005 – Getting the most of of RAC on Linux by James Anthony and was interested to see his comments that reducing the block size / reducing the rows per block can help in tuning the interconnect between the instances of a RAC environment.

Basically, James was saying how reducing the number of rows per block, perhaps by reducing the block size or using a higher PCTFREE value, would help to reduce block contention at the cost of using more space. This would in turn make full scans slightly worse but he suggested that full table scans can be hard on a RAC environment anyway and are therefore best avoided….so I’m wondering how a RAC approach fits with a data warehouse, where full scans are reasonably common (albeit with some degree of partition pruning hopefully!) and also where we might use compression, PCTFREE 0 and large block sizes to maximise the rows per block.

It sounds like there might be elements of RAC that don’t fit well with warehousing – but I’m guessing it’s not necessarily black and white and that things can be designed/planned/managed appropriately.

We don’t use RAC on the system I’m working on currently and I’m not particularly experienced with RAC, so if anyone has any opinions I’d be interested to hear them.

Labels: ,


Tuesday, February 14, 2006

 

Five Tuning Tips For Your Data Warehouse - A Presentation

I'd just realised that I hadn't put the final draft of this presentation that I gave at the UKOUG BIRT SIG in January on the web - it's now available here.

If you don't know Mark Rittman the "humour" at the beginning might go over your head - I guess you just had to be there!

Comments/discussion welcome.

Labels: ,


Monday, October 03, 2005

 

SCD2's and their affect on the CBO

We've got lots of SCD2 type tables in our warehouse and I've been wondering about how much affect it has on the CBO. Essentially, my concern is that when you query an SCD2 you generally look for records as they were on a specific date - the analysis date as we call it. You end up writing a predicate such as:

and [analysis_date] BETWEEN from_date and to_date

Now, how many rows will the optimiser think are going to be returned from the table ?

I figured that topic might have already been investigated by somebody so I did a search on google which was interesting:






What a pleasant surprise to find a reference from my own blog (via orafaq) showing as the number 1 hit...and Mark Rittman also shortly after! Unfortunately my own post was on a different matter and Mark's was too so I was still a bit stumped...

After a bit more research I found something from Wolfgang Breitling on this subject which confirmed my thoughts and discussed it very eloquently along with other fallacies of the CBO.

Now - Wolfgang tells us here what the problem is and that there isn't really a remedy other than using hints or stored outlines to guide the CBO...and who am I to argue!

I did think that maybe we could create some interface tables to hold all the possible range to date permutations and then when a user queries for a given analysis date they could use an equality predicate on the interface tables which would convert to pairs of from/to dates which then get equality matched to the target SCD2 - it kinda works but it means a lot of work to crunch through the interface tables just to avoid the problem of the CBO not being able to work out the selectivity/cardinality and potentially making a bad plan choice. The more the possible permutations the more work it becomes and in reality the number of permutations seems to be prohibitive so I've binned that idea for our environment. I might try to catch up with Wolfgang at the UKOUG to discuss this one further if I can grab his attention.

Addendum - One of Wolfgang's suggestions in his paper is to artificially set the stats on the table to some large number so that even when it factors the number down for the probability calculation it does then the number is still large and it will consequently choose hash/merge joins over a nested loop index lookups approach. I tried this by setting the table stats on the tables in my query to be large values using:

exec dbms_stats.set_table_stats(ownname => 'THE_SCHEMA',tabname => 'THE_TABLE',numrows => 3000000000,numblks => 24000000);

This seemed to work but I still wasn't overly keen on it since that means the optimizer is going to be coerced for any access to such tables - even if there isn't a join required.

My colleague Tank then came up with the idea that given most of our processes run off an "analysis date" which we store in a table, we could just create a materialized view of the contents of this table and set its stats to an artificially high value for numrows/blks and then given this table is used as the driver of most queries it would propagate through the plan and the optimiser, even applying heavy reductions for the probability would still realise that there were a lot of rows to process and choose plans accordingly....it worked a treat.

Labels: , ,


This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]