Running OBIEE Oracle By Example Tutorials against a database not called ORCL

I’ve been working with OBIEE for a while now, but I’ve not actually gone through the Oracle By Example tutorials, so I figured it would be a good idea to do that.

I started looking at the first Oracle By Example OBIEE tutorial yesterday and came across an issue with a simple solution to share with you.

The tutorials have a few caveats about what they expect from your environment, when you’re going to run through them – one of them being that you have access to a 10g database. What the prerequisites don’t specifically say is that unless your database is called ORCL, you’ll need to jump through a few more hoops – as I did, with my database called TEST.

I followed the instructions from the tutorial such that I had:

  1. Created an SH schema in the 10g database, with the standard tables and data supplied by Oracle.
  2. Created an ODBC Data Source pointing to a database called TEST, and checked it functioned correctly.
  3. Restored the presentation catalog and updated the configuration files accordingly.

I then proceeded to log in to the BI Dashboards, which brought up half the display, but it was full of TNS errors indicating that a connection could not be made between the BI Server and the database where the SH schema resides:

BI Server TNS Errors

After some investigation in the Administration tool, I discovered that the Connection Pool setting was using a Data Source Name of “ORCL”, which didn’t match my TNS alias/database name of TEST, hence it couldn’t make the connection to the database:

ORCL Connection Pool

Now, the RPD was read only at the time, so I first shut down the services so it could be opened read/write:

BI services down

…logged into the Administration tool as the Administrator user (password: Administrator), opened the SH.RPD file read/write and changed the Data Source Name in the Connection Pool from ORCL to TEST, whilst ensuring the password for the SH user matched that of my TEST database:

Change ORCL to TEST
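For the Data Source Name change to work, the name must also resolve via your tnsnames.ora. A minimal entry for a database called TEST might look like the following sketch (the host, port and service name are assumptions for your environment):

```
TEST =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = your-db-host)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = TEST)
    )
  )
```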

…next I restarted the services:

BI services up

…and then logged on again, to find it all now worked:

TEST all working

No pruning for MIN/MAX of partition key column

Recently, I wanted to work out the maximum value of a column on a partitioned table. The column I wanted the maximum value for happened to be the single partition key column. The table in question was range partitioned on this key column into monthly partitions for 2009, with data in all the partitions up to the current date, i.e. January through to mid June were populated. There were no indexes on the table.

NOTE – I tried this on 10.2.0.4 (AIX) and 11.1.0 (Fedora 11) – the example below is from 11.1.0.

I’ll recreate the scenario here:

CREATE TABLESPACE tsp1
datafile '/u01/app/oracle/oradata/T111/tsp1.dbf' size 100M 
autoextend off extent management local  uniform size 1m segment space management auto online
/
CREATE TABLESPACE tsp2
datafile '/u01/app/oracle/oradata/T111/tsp2.dbf' size 100M 
autoextend off extent management local  uniform size 1m segment space management auto online
/

DROP TABLE test PURGE
/
CREATE TABLE test(col_date_part_key DATE NOT NULL
,col2 VARCHAR2(2000) NOT NULL
)
PARTITION BY RANGE(col_date_part_key)
(PARTITION month_01 VALUES LESS THAN (TO_DATE('01-FEB-2009','DD-MON-YYYY')) TABLESPACE tsp1
,PARTITION month_02 VALUES LESS THAN (TO_DATE('01-MAR-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_03 VALUES LESS THAN (TO_DATE('01-APR-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_04 VALUES LESS THAN (TO_DATE('01-MAY-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_05 VALUES LESS THAN (TO_DATE('01-JUN-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_06 VALUES LESS THAN (TO_DATE('01-JUL-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_07 VALUES LESS THAN (TO_DATE('01-AUG-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_08 VALUES LESS THAN (TO_DATE('01-SEP-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_09 VALUES LESS THAN (TO_DATE('01-OCT-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_10 VALUES LESS THAN (TO_DATE('01-NOV-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_11 VALUES LESS THAN (TO_DATE('01-DEC-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_12 VALUES LESS THAN (TO_DATE('01-JAN-2010','DD-MON-YYYY')) TABLESPACE tsp2
)
/
REM Insert rows, but only up to 14-JUN-2009
INSERT INTO test(col_date_part_key,col2)
SELECT TO_DATE('31-DEC-2008','DD-MON-YYYY') + l
, LPAD('X',2000,'X')
FROM (SELECT level l FROM dual CONNECT BY level < 166)
/
COMMIT
/
SELECT COUNT(*)
FROM test
/
SELECT MIN(col_date_part_key) min_date
, MAX(col_date_part_key) max_date
FROM test
/

This runs and gives the following output:

DROP TABLE test PURGE                                               
           *                                                        
ERROR at line 1:                                                    
ORA-00942: table or view does not exist

DROP TABLESPACE tsp1 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace 'TSP1' does not exist

DROP TABLESPACE tsp2 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace 'TSP2' does not exist

Tablespace created.

Tablespace created.

Table created.

165 rows created.

Commit complete.

COUNT(*)
----------
       165

MIN_DATE  MAX_DATE
--------- ---------
01-JAN-09 14-JUN-09

Now, let’s see what the plan looks like from AUTOTRACE when we run the following query to get the maximum value of COL_DATE_PART_KEY:

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
  2  FROM   test                           
  3  /

MIN_DATE
---------
14-JUN-09

Execution Plan
----------------------------------------------------------
Plan hash value: 784602781

---------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |    1 |     9 |    99   (0)| 00:00:02 |       |       |
|   1 |  SORT AGGREGATE      |      |    1 |     9 |            |          |       |       |
|   2 |   PARTITION RANGE ALL|      |  132 |  1188 |    99   (0)| 00:00:02 |     1 |    12 |
|   3 |    TABLE ACCESS FULL | TEST |  132 |  1188 |    99   (0)| 00:00:02 |     1 |    12 |
---------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
320 consistent gets
51 physical reads
0 redo size
527 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> SET AUTOTRACE OFF

It shows a full scan of all twelve partitions. I had expected the plan for such a query to show a full table scan of all partitions, because, in theory, if all but the first partition were empty, the whole table would have to be scanned to answer the query, and Oracle can’t know at plan creation time whether that is the case, so it has to plan for a full scan to ensure a correct result.

What I thought might happen, though, is that in executing the query it would be able to short-circuit things by working through the partitions in order, from latest to earliest, until it found the first non-null value. Once it found that value, it would know not to look in the earlier partitions, since COL_DATE_PART_KEY couldn’t possibly hold a value greater than the one already identified.
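The sort of short-circuit evaluation I have in mind could be emulated by hand in PL/SQL, probing the partitions from latest to earliest and stopping at the first non-null result – a sketch only, against the example table above, and emphatically not something the optimizer does for you:

```sql
-- Sketch: emulate a short-circuiting MAX by probing partitions in
-- descending partition_position order, stopping at the first non-NULL MAX.
SET SERVEROUTPUT ON
DECLARE
  l_max DATE;
BEGIN
  FOR p IN (SELECT partition_name
            FROM   user_tab_partitions
            WHERE  table_name = 'TEST'
            ORDER  BY partition_position DESC)
  LOOP
    EXECUTE IMMEDIATE
      'SELECT MAX(col_date_part_key) FROM test PARTITION (' ||
      p.partition_name || ')'
      INTO l_max;
    -- no earlier partition can hold a greater value than this one
    EXIT WHEN l_max IS NOT NULL;
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('MAX = ' || TO_CHAR(l_max, 'DD-MON-YYYY'));
END;
/
```

On the example data this would touch only the empty July to December partitions plus MONTH_06 before stopping.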

It doesn’t appear to have this capability, which we can demonstrate by taking the tablespace holding the first partition offline and rerunning the query, whereupon it complains that not all the data can be read…

SQL> ALTER TABLESPACE tsp1 OFFLINE;

Tablespace altered.

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
2 FROM test
3 /
SELECT MAX(col_date_part_key) min_date
*
ERROR at line 1:
ORA-00376: file 6 cannot be read at this time
ORA-01110: data file 6: '/u01/app/oracle/oradata/T111/tsp1.dbf'

SQL> SET AUTOTRACE OFF

So, even though we know the question could be answered accurately without it, Oracle can’t do so, because it insists on scanning the whole table unnecessarily.

I did find a thread on OTN where somebody had asked about this, but all the responses were about workarounds, rather than explaining why this happens (bug or feature?) or how it could be made to work the way I, or the poster of that thread, think it perhaps should.

Can anyone else shed any light on this? If it’s intended behaviour, then it seems like something that could easily be coded more efficiently by Oracle. The same issue affects both MIN and MAX, since both could be approached in the same manner.

TPC-H Query 20 and optimizer_dynamic_sampling

I was working with Jason Garforth today on creating a TPC-H benchmark script which we can run on our warehouse to initially get a baseline of performance, and then, from time to time, rerun it to ensure things are still running with a comparable performance level.

This activity was on our new warehouse platform of an IBM Power 6 p570 with 8 dual core 4.7GHz processors, 128GB RAM and a 1.6GB/Sec SAN.

Jason created a script to run the QGEN utility to generate the twenty-two queries that make up the TPC-H benchmark, and also a “run script” to run those queries against the target schema I had created using some load scripts I talked about previously.

The whole process seemed to be running smoothly, with queries completing in a matter of seconds, until query twenty went off the scale and started taking ages. Excluding query twenty, everything went through in about three to four minutes, but query twenty ran on for hours with no sign of completing.

We grabbed the actual execution plan and noticed that all the tables involved had no stats gathered. In such circumstances, Oracle (10.2.0.4 in this instance) uses dynamic sampling to take a quick sample of the table in order to come up with an optimal plan for each query executed.

The database was running with the default value of 2 for optimizer_dynamic_sampling.

The TPC-H specification doesn’t say whether stats should or should not be gathered, but obviously gathering them has a cost and, depending on the method of gathering and the volume of the database, that cost could be considerable. It would be interesting to hear from someone who runs audited TPC-H benchmarks whether they gather table stats or rely on dynamic sampling…

We decided to gather the stats, just to see if the plan changed and the query executed any faster…it did, on both counts, with the query finishing very quickly, in line with the other twenty-one queries in the suite.

So, our options then appeared to include, amongst other things:

  1. Gather the table stats. We’d proved this worked.
  2. Change the optimizer_dynamic_sampling level to a higher value and see if it made a difference.
  3. Manually work out why the plan for the query was wrong, by analysing the individual plan steps in further detail, and then use hints or profiles to force the optimizer to “do the right thing”.

We decided to read a Full Disclosure Report (FDR) of a TPC-H benchmark for a similar system to see what they did. The FDR included a full listing of the init.ora of the database in that test, which showed that the system in question had set optimizer_dynamic_sampling to 3 instead of the default 2…we tried that approach and it worked perfectly.
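In case it’s useful, the two approaches we tried boil down to something like the following (the schema name is a placeholder, and level 3 was simply the value we saw in the FDR):

```sql
-- Option 1: gather optimizer stats for the TPC-H schema
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'TPCH')

-- Option 2: raise dynamic sampling from the default of 2 to 3
-- (could equally be set instance-wide in the init.ora/spfile)
ALTER SESSION SET optimizer_dynamic_sampling = 3;
```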

In the end, given we’re not producing actual audited benchmarks, we’re free to wait for the gathering of optimizer stats, so we’ll go with that method. It was interesting to see that option 2 above worked as well, though, and it illustrates the point that there is a lot of useful information to be gleaned from reading the FDRs of audited benchmarks – whilst, of course, taking them with a pinch of salt, since they are not trying to run your system.

Another thing of interest was that in order to get the DBGEN utility to work on AIX 6.1 using the gcc compiler, we had to set an environment variable as follows, otherwise we got an error when running DBGEN (the same applies to QGEN):

Set this:

export LDR_CNTRL=MAXDATA=0x80000000@LARGE_PAGE_DATA=Y

otherwise you may get this:

exec(): 0509-036 Cannot load program dbgen because of the following errors:
0509-026 System error: There is not enough memory available now.

Scripts for loading up DBGEN TPC-H data

If you’re interested in creating a TPC-H schema for testing purposes, then the following scripts may be of use to you:

Unix Scripts:
Multi Load TPCH
Load TPCH Stream

SQL*Loader Control files:
REGION Table
NATION Table
SUPPLIER Table
CUSTOMER Table
PART Table
PARTSUPP Table
ORDERS Table
LINEITEM Table

You may wish to read and check them before you use them – they’re not exactly rocket science but they seem to do the job.

I have all the files in the same directory for simplicity’s sake.

I then use them to create a scale factor 1 target TPC-H schema using the following calls:

# "l" loads the REGION and NATION tables
./multi_load_tpch.sh 1 l "tpch/tpch@test" 1

# "s" loads the SUPPLIER table
./multi_load_tpch.sh 1 s "tpch/tpch@test" 10

# "c" loads the CUSTOMER table
./multi_load_tpch.sh 1 c "tpch/tpch@test" 10

# "p" loads the PART and PARTSUPP tables
./multi_load_tpch.sh 1 p "tpch/tpch@test" 10

# "o" loads the ORDERS and LINEITEM tables
./multi_load_tpch.sh 1 o "tpch/tpch@test" 10

Obviously, you need to change the connection string to match your environment.

Caveats:

  1. Obviously, they are supplied as is – use at your own discretion and risk.
  2. You need to have created the target schema tables and made sure they are empty as the SQL*Loader control files use APPEND.
  3. Bear in mind that choosing too high a number of parallel streams (the last parameter in the calls) could overload your machine, so try to balance it against the available system resources.

Bugs, issues or questions, please get in touch…enjoy.

Oracle Optimized Warehouse Initiative (OWI)

I enjoyed a trip out of the office today with my manager. We went down to the Oracle Enterprise Technology Centre, in Reading, to hear about the Oracle Optimized Warehouse Initiative. It was basically a half day pitch from Oracle and Sun today, although there are other dates with different hardware vendors (IBM, HP and Dell).

It was an interesting day, although I didn’t really hear anything new, per se. I think the main things I took away from the session were:

  • The availability of a set of reference configurations providing matrices which cover various permutations of user count, storage cost focus, warehouse size and hardware vendor.
  • The possibility of using the reference configurations either minimally, to simply cross-check a proposed hardware specification for a given workload and ensure it seems appropriate, or going the whole hog and using an “out of the box” reference configuration, delivered to your office, fully configured with all software installed, in a box, ready to run in your data centre.
  • Oracle are pushing RAC and ASM heavily in the DW space – no surprise there.
  • HammerOra and ORION are used by Oracle and the hardware vendors to assess the reference configurations…and there is nothing stopping you using them for your own benchmarking efforts

It was interesting to hear about the Proof Of Concept facility that Sun has, in Linlithgow, Scotland. The facility allows Sun (and an Oracle customer) to take a server off the production line and, working with the customer, test their workload on that machine to see if it’s going to work or not. Neat, and since we’re going to be using some Sun kit for a DW shortly, it sounds like an opportunity.

Funniest thing of the day for me, was the last slide in the pitch given by Mike Leigh of Sun which had the title “The Power of 2” and was illustrating the benefits to customers of Oracle and Sun as a united force. I didn’t really take much notice, as I was too busy smiling, as I looked at the title and it made me think of Doug and his Parallel Execution and the ‘Magic of 2’ paper (the Magic of 2 bit actually being from this paper by Cary).

If you’re building a warehouse, or just want to get an idea of whether your hardware is appropriate for the job, it’s probably worth reading up on the OWI.

Is ASSM evil in a DW?

I saw an interesting article on ASSM from Howard the other day. I wanted to comment on it, but unless I’m being a bit thick it doesn’t seem to be a blog post, so there was nowhere to leave a comment; instead, I thought I’d do it here.

As I said, I found the article interesting and informative but I did have a few “But what about…?” issues that popped into my head as I read through it.

It probably won’t surprise anyone who reads my blog that my queries relate to the use of ASSM in a data warehouse environment, given my interest in that area.

Firstly, Howard asserted that ASSM wouldn’t be helpful in scenarios where you received sorted data from a warehouse source and wanted to load it into a target table in order to gain a performance benefit from creating an index on that data with the NOSORT option, thereby speeding up the index creation considerably. Nothing Howard said was untrue, although there are a number of things that could be discussed in relation to this point…

In my experience, most warehouse processing uses Oracle parallel execution for performance, but if one were to load data into a target table using parallel processing, then even if the data were sorted in the source, the parallelism would leave the target unsorted and this advantage would be lost. Of course, you could resort to serial processing, but would that be more efficient than doing the load in parallel and then sorting the data for the index build, again in parallel? Each process is probably different, but I guess what I’m saying is that the NOSORT issue is not necessarily relevant in all cases, especially on a DW using parallel execution.

There are a number of caveats with the use of the NOSORT option such as:

You cannot specify REVERSE with this clause.
You cannot use this clause to create a cluster index, a partitioned index or a bitmap index.
You cannot specify this clause for a secondary index on an index-organized table.

Read more here
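As a reminder of the syntax, NOSORT is just a clause on CREATE INDEX, along the lines of the following sketch (the table and column names are illustrative); it fails with ORA-01409 if the rows are not already in key order:

```sql
-- Illustrative only: requires the table data to already be in key order,
-- which lets Oracle skip the sort phase of the index build
CREATE INDEX my_table_idx ON my_table (my_sorted_col) NOSORT;
```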

The use of Direct Path operations on a DW might also mean that the space wastage doesn’t occur given Oracle constructs full blocks from clean ones rather than via free list management or the ASSM equivalent.

Howard suggested that tests he has undertaken showed a 1.5% excess space utilisation when ASSM was involved on a 1TB table. A long time ago I would have been horrified to lose 15GB, but these days I probably wouldn’t be as concerned, particularly as I often find ways to save that kind of storage just as easily in other areas, e.g. ensuring we use compression appropriately for tables and indexes, and ensuring we only have the indexes we need, of the right type and structure. If the solution to save that 1.5% were simple then I’d consider it too – not using ASSM sounds like a simple solution, but I guess it depends on what the DBA standards are for the organisation, amongst other factors.

Howard mentioned that with ASSM “wasting” space and using more blocks to store the same number of rows, a full scan would need to read more blocks for a given table, and that the numerous, repeatedly accessed and therefore “hot” ASSM bitmap blocks could lead to other, “warm” data being aged out sooner than would otherwise be the case. It’s a reasonable hypothesis. If we were full scanning a 1TB table with 1.5%, or 15GB, of “waste” (I’m making a big assumption, through lack of research, in guessing that the 15GB is largely made up of these ASSM bitmap blocks), then yes, 15GB of blocks could have a considerable impact on a buffer cache. But that assumes you’d be reading the whole 1TB table via a full scan – one would imagine the table is partitioned by some reasonable key (hopefully including time), so you’d only be reading a much smaller subset, with a correspondingly reduced effect on the buffer cache. The effect is there, but its impact would be system and query dependent.

Since the blocks read by the full scan of the table itself go on the LRU end of the buffer cache’s LRU list, the scan itself wouldn’t affect the caching of hot/warm data at all – it’s just a matter of whether the ASSM bitmap blocks would age out other “warm” data to the detriment of overall system performance.

Whilst it raised a number of queries for me, it’s definitely the case that the article Howard posted has raised quite a debate within the DBA team on the warehouse I’m currently working on…it should be fun this week as we contemplate performing a number of benchmarks to try and ascertain which way we should jump – if any!

On another note, even though he’s moved on to pastures new, I think Doug Burns must have had something to do with the latest Sky TV adverts as the phone number they wanted you to call was 08702 42 42 42 – now that’s what I call leaving a lasting impression!

Good to be back in the 21st Century…and identifying PEL mismatches

Well, I can honestly say, that moving house over the last few months has been an experience I don’t wish to repeat for a good few years – if not decades! I’ve only just got my broadband enabled for the new place – hence my first post in ages – and that was a miracle given the state of what they very loosely call “customer service” at BT.

A few things I’ve learned in the process of moving house:

  • Never choose BT to assist you with any communications infrastructure – phone line, VOIP, broadband etc…the list of problems I’ve encountered whilst getting the simple task of a telephone line and broadband installed has been literally staggering.
  • Never move in with your in-laws when you are “between houses”…unless they are a nice bunch of people, in which case, for a short period of time (we spent 5 weeks there), it can be quite workable, even if there were some opinions at polar extremes to negotiate around at times – how exactly do you get a pro-hunting person and a vegetarian to live happily under one roof?! Thankfully the Christmas spirit got us all through it and we’re now happily ensconced in our new place.
  • Don’t even think about using BT to get a phone line installed – it will take at least 3 weeks and they will probably lose your order at least twice.
  • Never trust Estate Agents – ours misinformed us about one particular aspect of our purchase and very nearly stopped it happening because of it – miscommunication is a highly dangerous thing.
  • Never ask BT to install your broadband – I won’t go into massive details but basically they lost my first order and then handled my second one in an inept, dismissive and unprofessional manner…and believe me, if you’ve ever tried to use their IVR system to get through to the right person you’ll understand just how frustrating it can be to be passed from pillar to post, explain your situation for five minutes only to be told that you have been talking to the wrong department (Exactly how I managed to get through to a Residential Sales person when I called a Business Sales line is quite beyond me!)
  • Take more care when you pack stuff away…otherwise you’ll end up turning the new house upside down looking for things you can’t find only to find they are tucked away in an incorrectly marked box…it would have helped if we’d been more involved with the packing people from the removals company in hindsight.

Now, I’m not going to post just a social update, as I quite understand the points being made recently on other blogs about “Sorry I’ve not posted in ages but…” posts, so I thought I’d say something about problems you can hit when you’re partition exchange loading. I built this routine a while back to help diagnose why we were unable to exchange partitions during a Partition Exchange Load (PEL) process – it doesn’t check every scenario, but it does cover a number of common ones, such as:

  • IOTs with different overflow characteristics can’t be exchanged
  • Hakan factors must be the same
  • Columns must be the same – in my view, that includes column names – this script will find issues like this one, even though mismatched names won’t actually stop the PEL.
  • Constraints must be the same
  • Indexes must be the same – at least if you are INCLUDING INDEXES in the PEL operation.
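To give a flavour of the column check, a comparison along the following lines (the owner and table names are placeholders) will list any column that differs in name, position, datatype, precision or nullability between the two sides of the exchange:

```sql
-- Columns present, or defined differently, on one side but not the other;
-- run it the other way round too for a complete picture.
SELECT column_name, column_id, data_type, data_length,
       data_precision, data_scale, nullable
FROM   all_tab_columns
WHERE  owner = 'MY_OWNER' AND table_name = 'MY_PARTITIONED_TABLE'
MINUS
SELECT column_name, column_id, data_type, data_length,
       data_precision, data_scale, nullable
FROM   all_tab_columns
WHERE  owner = 'MY_OWNER' AND table_name = 'MY_EXCHANGE_TABLE';
```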

I’m sure it’s not perfect but if anyone spots any issues with it then feel free to let me know and I’ll check and fix the code.

We’ve found it quite useful – your mileage, as they say, may vary.

There are a number of links on this blog site that were pointing at my old web hosting site (www.oramoss.demon.co.uk) – that’s dead now – I made the mistake of cancelling that and moving to BT – I mentioned that right? Anyway, I’ve fixed some of the posts and the presentation links – I’ll fix anything else as I come across it.

Partition exchange loading and column transposing issue

I came across an interesting issue yesterday whereby a partition exchange load routine wasn’t working for my colleague Jon. The ensuing investigation points at a possible bug whereby, after exchanging a partition, Oracle may inadvertently transpose columns in the target table. I’ve created a test script to illustrate the problem and logged an SR on Metalink to see what they make of it.

In light of my recent post on DBMS_APPLICATION_INFO, I’d better state that we found the problem on 10.2.0.2 on HP-UX and then tested it against 9.2.0.6 on HP-UX with the same results, so the issue seems to have been around for a while in terms of Oracle releases.

The investigation started down a different avenue – as they often do – with the inability to complete a partition exchange due to the Oracle error:

ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION

Reissuing the exchange partition command with the EXCLUDING INDEXES clause allowed the exchange to proceed so we figured there was definitely a problem with the indexes.

Next we tried dropping all the indexes on both sides of the exchange and recreating them…but when we retried the exchange partition it failed again with the same error.

Right, drastic measures time…we dropped all the indexes on both sides and ran the exchange which, with no indexes around, worked fine (with INCLUDING INDEXES clause in place as it would normally be on this process).

Next we proceeded to add the indexes back one by one, starting with the B-tree indexes for the primary and unique keys. These worked without a problem, so we moved on to the bitmap indexes, which seemed fine for the first few, but after adding a few more it suddenly stopped working. Through a process of trial and error we were able to determine that if two specific bitmap indexes were present on the tables then the exchange would fail with ORA-14098.

So, what was so special about those two bitmaps and the columns they were on? We selected the details from dba_tab_columns for the two columns involved and realised that the only difference was that they were at different column positions (COLUMN_ID) in the two tables – which begged the question: if these two tables are not quite the same, then how can we be allowed to exchange them?

Well, I guess, in our case, we could say that the two objects being exchanged had the same columns, but just not necessarily in the same order (those of you who love classic comedy will, I’m sure, be recalling Morecambe and Wise with Andre Previn right about now)…so should this mean we can swap the tables or not?
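The dba_tab_columns check we did amounts to something like the following sketch (the owner name is a placeholder); it highlights columns that exist on both sides of the exchange but sit at different positions:

```sql
-- Columns common to both tables whose COLUMN_ID (position) differs
SELECT a.column_name,
       a.column_id AS ptn_column_id,
       b.column_id AS tab_column_id
FROM   dba_tab_columns a
JOIN   dba_tab_columns b
  ON   b.owner = a.owner
 AND   b.column_name = a.column_name
WHERE  a.owner = 'MY_OWNER'
AND    a.table_name = 'JEFF_TEST_PTN'
AND    b.table_name = 'JEFF_TEST'
AND    a.column_id <> b.column_id;
```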

To try and illustrate the scenario I built a test script…the output from a run follows…sorry it’s a bit long…but it does prove some useful conclusions (I think)

> @test_pel_bug_mismatch_columns
> REM Drop the tables…
> DROP TABLE jeff_test
/

Table dropped.

> DROP TABLE jeff_test_ptn
/

Table dropped.

> REM Create the tables…
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 NUMBER NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table…
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

> REM Count the rows in COLUMN2 and COLUMN3 for both tables…
>
> REM Should see 3 rows…all NUMBERS
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 12000
2 12000
0 12000

3 rows selected.

>
> REM Should see 4 rows…all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 5 rows…all NUMBERS
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 7200
2 7200
4 7200
3 7200
0 7200

5 rows selected.

>
> REM Should see 6 rows…all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Now lets try and swap the partition and the table…
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

>
> REM Surprisingly, it lets us do the above operation without complaining.
> REM Even worse…it transposes the values in COLUMN2 and COLUMN3…
>
> REM Should see 5 rows…but we see 6 rows
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Should see 6 rows…but we see 5 rows
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 7200
2 7200
4 7200
3 7200
0 7200

5 rows selected.

>
> REM Should see 3 rows…but we see 4 rows
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 4 rows…but we see 3 rows
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 12000
2 12000
0 12000

3 rows selected.

>
> REM Now, lets try again but with COLUMN2 and COLUMN3
> REM being of different datatypes…
>
> REM Drop the tables…
> DROP TABLE jeff_test
2 /

Table dropped.

> DROP TABLE jeff_test_ptn
2 /

Table dropped.

> REM Create the tables…
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 DATE NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 DATE NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table…
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , SYSDATE + MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , SYSDATE + MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

> REM Count the rows in COLUMN2 and COLUMN3 for both tables…
>
> REM Should see 3 rows…all DATES
> SELECT column2,COUNT(1) FROM jeff_test_ptn GROUP BY column2;

COLUMN2 COUNT(1)
-------------------- ----------
27-SEP-2006 11:49:03 12000
28-SEP-2006 11:49:03 12000
26-SEP-2006 11:49:03 12000

3 rows selected.

>
> REM Should see 4 rows…all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test_ptn GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 9000
2 9000
3 9000
0 9000

4 rows selected.

>
> REM Should see 5 rows…all DATES
> SELECT column2,COUNT(1) FROM jeff_test GROUP BY column2;

COLUMN2 COUNT(1)
-------------------- ----------
30-SEP-2006 11:49:03 7200
27-SEP-2006 11:49:03 7200
28-SEP-2006 11:49:03 7200
29-SEP-2006 11:49:03 7200
26-SEP-2006 11:49:03 7200

5 rows selected.

>
> REM Should see 6 rows…all NUMBERS
> SELECT column3,COUNT(1) FROM jeff_test GROUP BY column3;

COLUMN3 COUNT(1)
---------- ----------
1 6000
2 6000
4 6000
5 6000
3 6000
0 6000

6 rows selected.

>
> REM Now let's try to swap the partition and the table…
> REM It will fail with error…
> REM ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
>
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /
ALTER TABLE jeff_test_ptn
*
ERROR at line 1:
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION

>
> REM So, the exchange only fails when the transposed columns have different datatypes.
>
> REM Now, let's say they are the same datatype and look at how bitmap indexes
> REM are affected in the PARTITION EXCHANGE process…
>
> REM First let's recreate the tables with COLUMN2 and COLUMN3
> REM having the same datatype…
> REM Drop the tables…
> DROP TABLE jeff_test
2 /

Table dropped.

> DROP TABLE jeff_test_ptn
2 /

Table dropped.

> REM Create the tables…
> CREATE TABLE jeff_test_ptn(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column3 NUMBER NOT NULL
4 ,column2 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 PARTITION BY RANGE (partition_key_col)
8 (
9 PARTITION p001 VALUES LESS THAN (1000001)
10 ,PARTITION p002 VALUES LESS THAN (MAXVALUE)
11 )
12 /

Table created.

> CREATE TABLE jeff_test(partition_key_col NUMBER NOT NULL
2 ,column1 NUMBER NOT NULL
3 ,column2 NUMBER NOT NULL
4 ,column3 NUMBER NOT NULL
5 ,column4 NUMBER NOT NULL
6 )
7 /

Table created.

> REM Populate the table…
> INSERT /*+ APPEND */
2 INTO jeff_test_ptn(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,3)
11 , MOD(idx,4)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

> INSERT /*+ APPEND */
2 INTO jeff_test(partition_key_col
3 ,column1
4 ,column2
5 ,column3
6 ,column4
7 )
8 SELECT idx
9 , MOD(idx,2)
10 , MOD(idx,5)
11 , MOD(idx,6)
12 , MOD(idx,2)
13 FROM (SELECT LEVEL idx FROM dual CONNECT BY LEVEL <= 36000)
14 /

36000 rows created.

> COMMIT
2 /

Commit complete.

>
> REM Now let's create a bitmap index on COLUMN1 on both tables…
> CREATE BITMAP INDEX jtp1 ON jeff_test_ptn(column1) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt1 ON jeff_test(column1)
2 /

Index created.

>
> REM …and now try PARTITION EXCHANGE…
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

> REM It works fine.
>
> REM Now let's create a bitmap index on COLUMN4 on both tables…
> CREATE BITMAP INDEX jtp4 ON jeff_test_ptn(column4) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt4 ON jeff_test(column4)
2 /

Index created.

>
> REM …and now try PARTITION EXCHANGE…
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /

Table altered.

>
> REM It works fine.
>
> REM Now let's create a bitmap index on COLUMN2 on both tables…
> CREATE BITMAP INDEX jtp2 ON jeff_test_ptn(column2) LOCAL
2 /

Index created.

> CREATE BITMAP INDEX jt2 ON jeff_test(column2)
2 /

Index created.

>
> REM …and now try PARTITION EXCHANGE…
> ALTER TABLE jeff_test_ptn
2 EXCHANGE PARTITION p001
3 WITH TABLE jeff_test
4 INCLUDING INDEXES
5 WITHOUT VALIDATION
6 /
WITH TABLE jeff_test
*
ERROR at line 3:
ORA-14098: index mismatch for tables in ALTER TABLE EXCHANGE PARTITION

Right, so what can we conclude from the above?

1. If you don’t have indexes on your table then you can do an exchange even if the columns are in a different order – but the columns will be silently transposed. No errors are given, and I think this is a bug.
2. You will only hit the problem in point 1 if the transposed columns are of the same datatype. I’ve not checked scale/precision/length – it may be, for example, that transposing a VARCHAR2(20) and a VARCHAR2(10) would raise an ORA-14097 error.
3. If you have indexes on columns which are transposed then the exchange will fail with ORA-14098. I don’t know whether this is specific to bitmap indexes as I’ve not tested it any further.
4. If you only have indexes on columns which are not transposed then the exchange succeeds with no errors – but your data is, of course, still silently transposed.
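The positional nature of the matching can be sketched with a toy model (Python here, purely because it's easy to run; this mimics the behaviour observed above and is emphatically not Oracle internals – the `exchange` helper and its type check are hypothetical):

```python
# Toy model of the observed behaviour: EXCHANGE PARTITION appears to match
# columns by position, not by name. Hypothetical helper, not Oracle code.
def exchange(ptn_cols, table_cols):
    """Each arg is a list of (name, datatype) in physical column order."""
    for (p_name, p_type), (t_name, t_type) in zip(ptn_cols, table_cols):
        if p_type != t_type:
            raise ValueError("ORA-14097: column type or size mismatch")
    # All positional types line up, so the swap "succeeds": data loaded under
    # table_cols[i]'s name is read back under ptn_cols[i]'s name.
    return [(p, t) for (p, _), (t, _) in zip(ptn_cols, table_cols)]

ptn   = [("partition_key_col", "NUMBER"), ("column1", "NUMBER"),
         ("column3", "NUMBER"), ("column2", "NUMBER"), ("column4", "NUMBER")]
plain = [("partition_key_col", "NUMBER"), ("column1", "NUMBER"),
         ("column2", "NUMBER"), ("column3", "NUMBER"), ("column4", "NUMBER")]

mapping = exchange(ptn, plain)
print(mapping[2])   # ('column3', 'column2') -- silently transposed

# Make COLUMN2 a DATE in both tables: the positional types no longer match,
# so the exchange is rejected, as with the real ORA-14097 above.
ptn_d   = [(n, "DATE" if n == "column2" else t) for n, t in ptn]
plain_d = [(n, "DATE" if n == "column2" else t) for n, t in plain]
try:
    exchange(ptn_d, plain_d)
except ValueError as e:
    print(e)        # ORA-14097: column type or size mismatch
```

With identical positional datatypes the swap "succeeds" and COLUMN2/COLUMN3 quietly trade places; give COLUMN2 a different datatype and the positions no longer line up, reproducing the ORA-14097.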

TRUNCATE command marks previously UNUSABLE indexes as USABLE

Yes, I’m back…but on an ad hoc basis, as I still don’t have enough time for this really! Yeah – I know, you’re all in the same boat 😉

I came across this problem on the warehouse the other day and was slightly surprised by what was going on…

We have an OWB mapping which does a TRUNCATE and then an INSERT /*+ APPEND */ into a target table that has some bitmap indexes on it. We run a little preprocess before calling the mapping to mark the bitmaps as UNUSABLE so the load doesn’t maintain them, in the hope of better performance during the INSERT APPEND; we then rebuild the indexes afterwards, which is more efficient than maintaining them on the fly. Sounds simple enough…except that performance was dragging, so I had a quick shufty and observed that the indexes were still marked as USABLE. I asked the guy running the stuff to check the run script and he said “No, we marked them as UNUSABLE before we ran the mapping – I’m sure of it”.

That’s strange, I thought…time for some debugging…which proved that we were indeed setting the indexes UNUSABLE, but that somehow, by the time the mapping was running, they had miraculously been marked as USABLE again.

I decided to create a simple test script to run in SQL*Plus, doing the same things as the OWB mapping – to try to repeat the behaviour at the SQL level and to prove or disprove whether OWB was having some effect.

It’s a useful thing to do in any case, as it helps if you’re going to end up:

A) Testing it on other platforms/versions
B) Sending it to Oracle Support via an SR.

(Excuse the formatting – either Blogger is crap or I don’t understand how to make this look nice – take your pick.)


drop table j4134_test_bmi;

create table j4134_test_bmi(col1 number not null
                           ,col2 number not null);

create unique index jtb_pk_i on j4134_test_bmi(col1);

alter table j4134_test_bmi add constraint jtb_pk primary key(col1) using index;

create bitmap index jtb_bmi on j4134_test_bmi(col2);

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

alter index jtb_bmi unusable;

alter table j4134_test_bmi disable constraint jtb_pk;

alter index jtb_pk_i unusable;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

truncate table j4134_test_bmi;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

insert /*+ append */ into j4134_test_bmi(col1,col2)
select rownum
,      100
from  (select level l from dual connect by level <= 1000); -- row limit lost to the formatting; any value demonstrates the effect

commit;

select index_name,index_type,status,partitioned from user_indexes where table_name='J4134_TEST_BMI';

A section of the output of the script follows:

INDEX_NAME INDEX_TYPE STATUS PAR
------------------------------ ---------- -------- ---
JTB_PK_I NORMAL UNUSABLE NO
JTB_BMI BITMAP UNUSABLE NO

2 rows selected.

Elapsed: 00:00:00.06

Table truncated.

Elapsed: 00:00:00.11

INDEX_NAME INDEX_TYPE STATUS PAR
------------------------------ ---------- -------- ---
JTB_PK_I NORMAL VALID NO
JTB_BMI BITMAP VALID NO

Which shows that immediately after the TRUNCATE command was issued, the previously UNUSABLE indexes were suddenly marked as VALID (i.e. usable) again!

Why? It’s not documented anywhere that this would happen…so I created an SR with Oracle Support, who initially raised a bug for it…but Development came back to say that it is “expected behaviour”. The reasoning Support gave is that when the table is truncated, the associated index segments are truncated too – and since an empty index can’t possibly be unusable, the indexes are marked as USABLE as a side effect of the process.

To be fair, it’s a reasonable assumption for Oracle Development to make – but it would have helped if it were documented. They’re going to create a Metalink support note to cover it.
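Given that explanation, the fix is simply to reorder the preprocess: truncate first, then mark the indexes UNUSABLE, so the truncate can no longer flip them back. A sketch of that ordering (the `execute` callback and statement strings are illustrative, not our actual OWB preprocess):

```python
# Sketch of a load preprocess that avoids the surprise: TRUNCATE first, THEN
# mark the indexes UNUSABLE, so the truncate can't reset them to USABLE.
# `execute` is a hypothetical stand-in for a real DB call (e.g. cursor.execute).
def preprocess_and_load(execute, table, indexes):
    execute(f"TRUNCATE TABLE {table}")              # resets index status anyway
    for idx in indexes:
        execute(f"ALTER INDEX {idx} UNUSABLE")      # now stays UNUSABLE
    execute(f"INSERT /*+ APPEND */ INTO {table} SELECT ...")  # the bulk load
    for idx in indexes:
        execute(f"ALTER INDEX {idx} REBUILD")       # rebuild once, afterwards

issued = []
preprocess_and_load(issued.append, "j4134_test_bmi", ["jtb_pk_i", "jtb_bmi"])
print(issued[0])   # TRUNCATE TABLE j4134_test_bmi -- before any UNUSABLE
```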

Thanks to Doug for pushing me to blog this – I’ll have to buy you a drink at the OUG Scotland Conference 2006!

RAC and data warehouses – Are they compatible?

I listened to one of the presentations on the audio CD from UKOUG Conference 2005 – Getting the Most out of RAC on Linux by James Anthony – and was interested in his comment that reducing the block size (and hence the rows per block) can help in tuning the interconnect between the instances of a RAC environment.

Basically, James was saying that reducing the number of rows per block, perhaps by reducing the block size or using a higher PCTFREE value, helps to reduce block contention at the cost of using more space. This in turn makes full scans slightly worse, but he suggested that full table scans can be hard on a RAC environment anyway and are therefore best avoided. So I’m wondering how a RAC approach fits with a data warehouse, where full scans are reasonably common (albeit with some degree of partition pruning, hopefully!) and where we might use compression, PCTFREE 0 and large block sizes to maximise the rows per block.
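To put rough numbers on that trade-off, here's a back-of-envelope rows-per-block estimate (pure arithmetic: it ignores block header, ITL and row overheads, and the 100-byte average row length is just an assumption):

```python
# Rough rows-per-block estimate: usable space after PCTFREE, divided by the
# average row length. Ignores block header/ITL/per-row overheads entirely.
def rows_per_block(block_size, pctfree, avg_row_len=100):
    return int(block_size * (1 - pctfree / 100)) // avg_row_len

# Warehouse-style settings pack rows in; RAC-friendly settings spread them out.
print(rows_per_block(16384, 0))    # 163 - big blocks, PCTFREE 0
print(rows_per_block(8192, 10))    # 73
print(rows_per_block(4096, 40))    # 24 - small blocks, high PCTFREE
```

Roughly a 7x difference in rows per block between the two extremes, which is exactly the tension: fewer rows per block means less inter-instance contention but far more blocks to ship around on a full scan.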

It sounds like there might be elements of RAC that don’t fit well with warehousing – but I’m guessing it’s not necessarily black and white and that things can be designed/planned/managed appropriately.

We don’t use RAC on the system I’m working on currently and I’m not particularly experienced with RAC, so if anyone has any opinions I’d be interested to hear them.