$deleted$ tablespace names bug

This one turned out to be a an interesting bug the other day…

I did a simple select from DBA_TAB_PARTITIONS and noticed that some tablespace_names were of the form “_$deleted$n$m” where n and m are numbers. Slightly worrying, but at least the data was all present and correct, when I checked. I knew the DBA team had been doing some reorganisations the previous weekend, to recover some space, so I wondered if that was connected….it was, and after opening an SR, the DBA, Phil, found an explanation (from Oracle Note: 604648.1) and a resolution.

Reproducing the issue and the way to fix it, is simple, using this script…

DROP TABLESPACE old_tbs INCLUDING CONTENTS AND DATAFILES;

CREATE TABLESPACE new_tbs 
DATAFILE 'C:APPORACLEORADATAT111NEW_TBS.DBF'
SIZE 100M
ONLINE;

CREATE TABLESPACE old_tbs 
DATAFILE 'C:APPORACLEORADATAT111OLD_TBS.DBF'
SIZE 100M
ONLINE;

SELECT ts#,name FROM sys.ts$ WHERE name LIKE '%TBS';

CREATE TABLE jeff_test(col1 DATE NOT NULL
                      ,col2 NUMBER NOT NULL
                      ,col3 VARCHAR2(200) NOT NULL
                      )
TABLESPACE old_tbs
PARTITION BY RANGE(col1)
SUBPARTITION BY LIST(col2)
SUBPARTITION TEMPLATE(
 SUBPARTITION "S1" VALUES(1)
,SUBPARTITION "S2" VALUES(2)
)
(PARTITION p1 VALUES LESS THAN(TO_DATE('31-DEC-2009','DD-MON-YYYY'))
,PARTITION p2 VALUES LESS THAN(TO_DATE('31-DEC-2010','DD-MON-YYYY'))
)
/

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

ALTER TABLE jeff_test MOVE SUBPARTITION p1_s1 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p1_s2 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p2_s1 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p2_s2 TABLESPACE NEW_TBS;

DROP TABLESPACE old_tbs INCLUDING CONTENTS AND DATAFILES;
ALTER TABLESPACE new_tbs RENAME TO old_tbs;

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

ALTER TABLE jeff_test MODIFY DEFAULT ATTRIBUTES FOR PARTITION p1 TABLESPACE old_tbs;
ALTER TABLE jeff_test MODIFY DEFAULT ATTRIBUTES FOR PARTITION p2 TABLESPACE old_tbs;

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

Which, when run in 11.1.0.6 on Windows 2003 Server 64 bit, gives:

DROP TABLESPACE old_tbs succeeded.
CREATE TABLESPACE succeeded.
CREATE TABLESPACE succeeded.
TS#                    NAME                           
---------------------- ------------------------------ 
9                      NEW_TBS                        
10                     OLD_TBS                        

2 rows selected

CREATE TABLE succeeded.
PARTITION_NAME                 TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1                             OLD_TBS                        
P2                             OLD_TBS                        

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1_S2                          OLD_TBS                        
P1_S1                          OLD_TBS                        
P2_S2                          OLD_TBS                        
P2_S1                          OLD_TBS                        

4 rows selected

ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
DROP TABLESPACE old_tbs succeeded.
ALTER TABLESPACE new_tbs succeeded.
PARTITION_NAME                 TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1                             _$deleted$10$0                 
P2                             _$deleted$10$0                 

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1_S2                          OLD_TBS                        
P1_S1                          OLD_TBS                        
P2_S2                          OLD_TBS                        
P2_S1                          OLD_TBS                        

4 rows selected

ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
PARTITION_NAME                 TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1                             OLD_TBS                        
P2                             OLD_TBS                        

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME                
------------------------------ ------------------------------ 
P1_S2                          OLD_TBS                        
P1_S1                          OLD_TBS                        
P2_S2                          OLD_TBS                        
P2_S1                          OLD_TBS                        

4 rows selected

Notice that the $n in “_$deleted$n$m” is 10, which is the ts# of the OLD_TBS before the rename. The problem revolves around entries in TS$, when you rename tablespaces to names that have previously been used and then dropped, basically because the old entries are not removed from TS$.

Related references:
Bug Numbers:8291493, itself a duplicate of 5769963
Note: 604648.1

According to the SR and bug, it was noticed in 10.2.0.4 and is fixed in 10.2.0.5. We’ve reproduced it in 11.1.0.6 on various ports, (results above) and updated our SR, so I guess the fix might also find it’s way into 11.1.0.7, perhaps.

No pruning for MIN/MAX of partition key column

Recently, I wanted to work out the maximum value of a column on a partitioned table. The column I wanted the maximum value for, happened to be the (single and only) partition key column. The table in question was range partitioned on this single key column, into monthly partitions for 2009, with data in all the partitions behind the current date, i.e. January through mid June were populated. There were no indexes on the table.

NOTE – I tried this on 10.2.04 (AIX) and 11.1.0 (Fedora 11) – the example below is from 11.1.0.

I’ll recreate the scenario here:

CREATE TABLESPACE tsp1
datafile '/u01/app/oracle/oradata/T111/tsp1.dbf' size 100M 
autoextend off extent management local  uniform size 1m segment space management auto online
/
CREATE TABLESPACE tsp2
datafile '/u01/app/oracle/oradata/T111/tsp2.dbf' size 100M 
autoextend off extent management local  uniform size 1m segment space management auto online
/

DROP TABLE test PURGE
/
CREATE TABLE test(col_date_part_key DATE NOT NULL
,col2 VARCHAR2(2000) NOT NULL
)
PARTITION BY RANGE(col_date_part_key)
(PARTITION month_01 VALUES LESS THAN (TO_DATE(’01-FEB-2009′,’DD-MON-YYYY’)) TABLESPACE tsp1
,PARTITION month_02 VALUES LESS THAN (TO_DATE(’01-MAR-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_03 VALUES LESS THAN (TO_DATE(’01-APR-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_04 VALUES LESS THAN (TO_DATE(’01-MAY-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_05 VALUES LESS THAN (TO_DATE(’01-JUN-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_06 VALUES LESS THAN (TO_DATE(’01-JUL-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_07 VALUES LESS THAN (TO_DATE(’01-AUG-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_08 VALUES LESS THAN (TO_DATE(’01-SEP-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_09 VALUES LESS THAN (TO_DATE(’01-OCT-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_10 VALUES LESS THAN (TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_11 VALUES LESS THAN (TO_DATE(’01-DEC-2009′,’DD-MON-YYYY’)) TABLESPACE tsp2
,PARTITION month_12 VALUES LESS THAN (TO_DATE(’01-JAN-2010′,’DD-MON-YYYY’)) TABLESPACE tsp2
)
/
REM Insert rows, but only up to 14-JUN-2009
INSERT INTO test(col_date_part_key,col2)
SELECT TO_DATE(’31-DEC-2008′,’DD-MON-YYYY’) + l
, LPAD(‘X’,2000,’X’)
FROM (SELECT level l FROM dual CONNECT BY level < 166)
/
COMMIT
/
SELECT COUNT(*)
FROM test
/
SELECT MIN(col_date_part_key) min_date
, MAX(col_date_part_key) max_date
FROM test
/

This runs and gives the following output:

DROP TABLE test PURGE                                               
           *                                                        
ERROR at line 1:                                                    
ORA-00942: table or view does not exist

DROP TABLESPACE tsp1 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace ‘TSP1’ does not exist

DROP TABLESPACE tsp2 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace ‘TSP2’ does not exist

Tablespace created.

Tablespace created.

Table created.

165 rows created.

Commit complete.

COUNT(*)
———-
165

MIN_DATE MAX_DATE
——— ———
01-JAN-09 14-JUN-09

Now, lets see what the plan looks like from AUTOTRACE when we run the following query to get the maximum value of COL_DATE_PART_KEY:

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
  2  FROM   test                           
  3  /

MIN_DATE
———
14-JUN-09

Execution Plan
———————————————————-
Plan hash value: 784602781

———————————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
———————————————————————————————
| 0 | SELECT STATEMENT | | 1 | 9 | 99 (0)| 00:00:02 | | |
| 1 | SORT AGGREGATE | | 1 | 9 | | | | |
| 2 | PARTITION RANGE ALL| | 132 | 1188 | 99 (0)| 00:00:02 | 1 | 12 |
| 3 | TABLE ACCESS FULL | TEST | 132 | 1188 | 99 (0)| 00:00:02 | 1 | 12 |
———————————————————————————————

Note
—–
– dynamic sampling used for this statement

Statistics
———————————————————-
0 recursive calls
0 db block gets
320 consistent gets
51 physical reads
0 redo size
527 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> SET AUTOTRACE OFF

It shows a full scan of all twelve partitions. I figured that the the plan for such a query would show a full table scan, of all partitions for that table – because, in theory, if all but the first partition were empty, then the whole table would have to be scanned to answer the query – and Oracle wouldn’t know at plan creation time, whether the data met this case, so it would have to do the full table scan to ensure the correct result.

What I thought might happen though, is that in executing the query, it would be able to short circuit things, by working through the partitions in order, from latest to earliest, and finding the first, non null, value. Once it found the first, non null, value, it would know not to continue looking in the earlier partitions, since the value of COL_DATE_PART_KEY couldn’t possibly be greater than the non null value already identified.

It doesn’t appear to have this capability, which we can check by taking one of the partitions offline and then rerunning the query, whereupon it complains that not all the data is present…

SQL> ALTER TABLESPACE tsp1 OFFLINE;

Tablespace altered.

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
2 FROM test
3 /
SELECT MAX(col_date_part_key) min_date
*
ERROR at line 1:
ORA-00376: file 6 cannot be read at this time
ORA-01110: data file 6: ‘/u01/app/oracle/oradata/T111/tsp1.dbf’

SQL> SET AUTOTRACE OFF

So, even though we know we could actually answer this question accurately, Oracle can’t do it as it wants to scan, unnecessarily, the whole table.

I did find a thread which somebody had asked about this on OTN, but all the responses were about workarounds, rather than explaining why this happens (bug/feature) or how it can be made to work in the way I, or the poster of that thread, think it, perhaps, should.

Can anyone else shed any light on this? If it’s a feature, then it seems like something that could be easily coded more efficiently by Oracle. The same issue would affect both MIN and MAX since both could be
approached in the same manner.

Cursor keys not working in Virtual Server 2 VM

Posted as a reminder to myself about how to fix this issue…

I couldn’t get some of the cursor keys to work properly on my virtual machines running under VMWare Virtual Server 2 on Fedora 10 x86_64. Kept giving funny behaviour like bringing up the screen capture applet!

A bit of searching the net came up with this one, which although not referring to Virtual Server 2 specifically, seems to work all the same…

Essentially, adding the line below to the following file fixes the problem

File (create it, if not already present):

~/.vmware/config

Line:

xkeymap.nokeycodeMap = true

My thanks to “The Monkey Jungle”!

Creating a TPC-H schema with DBGEN on HP-UX

I wanted to try out this HammerOra product from Steve Shaw, both at work and on my box at home…but after playing with it at home, I realised that it takes quite some time to build even a small (scale factor 1) TPC-H schema…I know it runs serially, but I’m still not quite sure why it’s that slow (on my system that is), but Steve does say it can take a while and that you might wish to consider using the TPC utility DBGEN to generate and load the schema quicker…particularly if you use some manual parallelisation.

Given that I also need to use this tool to help with some benchmarking at work, I decided to try to get DBGEN to run on a HP-UX box today and had one or two problems which I managed to sort out. The machine in question is an RP8420 running HP-UX B.11.11.

DBGEN is a utility that allows you to create a series of flat files which contain the data for a TPC-H schema. You can then use SQL*Loader to load these into appropriately constructed tables in an Oracle database – any database actually…but I only care about Oracle of course 😉

The utility can be called with various parameters including making the target datasets in smaller “child” files which can be created in a manually parallelised fashion to speed the whole process up. You have to download the DBGEN reference data set from the TPC website (lower right hand side).

This reference data set contains the ANSI C source code which makes the DBGEN executable (and QGEN also…but more on that another day)…unfortunately it’s just the source code, so that means you need to compile it yourself…which of course leads to the first problem…that I know diddly squat about C…yeah I know, not very manly! Luckily I can sometimes follow instructions (which come with the reference data set)…

1. Copy makefile.suite to makefile
2. Edit makefile and make the following amendments (in red):

 

################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= ORACLE
MACHINE = HP
WORKLOAD = TPCH

That’s it for the makefile.

Now, as I mentioned, the ORACLE database is not a listed database variant in the DBGEN C code – it’s got all the other popular RDBMS which I find quite bizarre…I’m sure there’s a reason, but I can’t think of one. To get around this, as per Chapter 5 in “Pro Oracle Database 10g RAC on Linux” by Steve Shaw and Julian Dyke, I added a section to the tpcd.h for the ORACLE database:

 
#ifdef ORACLE
#define GEN_QUERY_PLAN “”
#define START_TRAN “”
#define END_TRAN “”
#define SET_OUTPUT “”
#define SET_ROWCOUNT “”
#define SET_DBASE “”
#endif

I thought that was it – but it still would’t compile, giving the error:

config.h:213:2: #error Support for a 64-bit datatype is required in this release

Looking at the config.h – and bearing in mind I’m no C programmer – it struck me as odd that all bar the HP machine section, had stuff about DSS_HUGE and 64 bits…so I took a punt and copied some lines (in red) from the IBM section into the HP one to see if it worked…and it did. The HP section now looks like this:

 

#ifdef HP
#define _INCLUDE_POSIX_SOURCE
#define STDLIB_HAS_GETOPT
#define DSS_HUGE long long
#define HUGE_FORMAT “%lld”
#define HUGE_DATE_FORMAT “%02lld”
#define RNG_A 6364136223846793005ull
#define RNG_C 1ull
#endif /* HP */

Typing make at the command prompt then compiles the code and produces the dbgen executable…which I then spent a few hours playing with to create a scale factor 1 TPC-H set of files.

 

My next problem was one of my own making really in that I copied the CREATE TABLE statements for the TPC-H target tables from HammerOra’ TCL script for TPC-H creation, but unfortunately, the column ordering is slightly different in those DDL statements as compared to the DBGEN output files…which meant that I created the tables OK, but since I’d copied the column ordering to make the SQL*Loader control files, I got errors when I tried to load some of the files as the column order is different in one or two cases.

 

I then downloaded the TPC-H specification document which has, amongst other things, the data model, from which I cross checked the column ordering of the data model against the columns in the output files and then managed to rerun the data in without any further issues.

 

Tomorrow I’ll try running HammerOra against the target TPC-H schema and make some shell scripts to try and automate most of the process so we can build different scale factor schemae and do so in a manually parallelised fashion – scripts the amiable Scotsman created for his parallel testing a while back should give me a good start with that.

 

Problem with _gby_hash_aggregation_enabled parameter

Here’s a tale about an Oracle initialisation parameter…and a lesson we should all take note of…

For three days, my colleagues in the support team on one of the warehouses I’m involved in, were struggling with a piece of code which was exhausting the available temp space and after trying everything they could think of, they asked me to take a look. I must admit I was a little baffled at first because the piece of code in question had been happily running for some time now and every time I’d run it, I’d never noticed that TEMP was anywhere near being exhausted so, whilst I could tell the process had some kind of problem, I was in the dark as to exactly what that problem was.

After trying to break down the large query into several smaller steps, I realised that it was an early step in the query that was exhibiting the problem of temp exhaustion – the step being a pivot of around 300 million measure rows into a pivot target table.

This process runs monthly and it had run without issue on all of the previous 20 occurrences and for roughly the same, if not more rows to pivot…so it didn’t appear to be the volume that was causing the problem.

There had been no changes in the code itself for many months and the current version of the code had been run successfully in previous months, so it didn’t appear to be a code fault.

I obtained the actual execution path for the statement whilst it was running and it looked reasonable, although looking at it, triggered a thought in my mind…what if something had changed in the database configuration?

Why would I get that thought from looking at the execution path?

Well, a while back, we had received some advice that a line in our init.ora as follows, should be changed to set the feature to TRUE instead of FALSE, so that the feature became active:

_gby_hash_aggregation_enabled = FALSE

This results in a line in the plan that reads:

HASH GROUP BY

instead of

SORT GROUP BY

The parameter was set to FALSE due to a known bug and the issues we’d seen with it, however the recent advice we’d received, indicated that the bug had been resolved at the version level we were on and that by enabling the feature – which enables GROUP BY and Aggregation using a hash scheme – we’d gain a performance boost for certain queries.

So, the DBA team researched the advice and it appeared to be the case, that the bug (4604970) which led to the disabling of the feature was fixed at our version level (10.2.0.3 on HP-UX). We duly turned on the feature in a pre production environment and ran it for a while without noticing any issues. We then enabled it in production and again, for a while, we’ve not noticed any issues…until now.

After a check back through the logs, it appeared that since the parameter was first enabled, the queries which were now failing, had not been run at all…they had only run prior to the parameter change…so with my suspicions aroused further, I disabled the feature at the session level and reran the process. It completed in a normal time frame and used a small amount of TEMP – hooray!

So, now we have to go back to support to try and understand if the original bug is not quite fixed or whether this is a different scenario…in any event, we’re going to disable the feature for now, even though we’re only getting problems with the feature on 2 processes out of perhaps thousands.

So, what’s the lesson to learn?

Well, quite simply, that you need to have a thorough audit of what configuration changes you’ve made together with a good audit of the processes you’ve run so that you can work out what has changed since the last time you successfully ran a process. This gives you a fighting chance of spotting things like the above.

DBA_SEGMENTS misleading PARTITION_NAME column

Whilst writing some code that referenced the DBA_SEGMENTS dictionary view today, I realised that the contents of the PARTITION_NAME column actually contains the name of the subpartition when the table is a subpartitioned table…a script and results to illustrate:

drop table jeff_unpartitioned purge
/
drop table jeff_partitioned purge
/
drop table jeff_subpartitioned purge
/
create table jeff_unpartitioned(col1 number,col2 number)
/
create table jeff_partitioned(col1 number,col2 number)
partition by range(col1)
(partition p1 values less than(100)
,partition p2 values less than(maxvalue)
)
/
create table jeff_subpartitioned(col1 number,col2 number)
partition by range(col1)
subpartition by list(col2)
subpartition template
(subpartition sub1 values(1)
,subpartition sub2 values(2)
)
(partition p1 values less than(100)
,partition p2 values less than(maxvalue)
)
/
select segment_name,partition_name,segment_type
from   user_segments
where  segment_name like 'JEFF%'
order by segment_name
/

 

…gives the following results…

Table dropped.

Table dropped.


Table dropped.


Table created.


Table created.


Table created.


                        Partition
SEGMENT_NAME             Name                 SEGMENT_TYPE
------------------------ ------------------
JEFF_PARTITIONED         P2                   TABLE PARTITION
JEFF_PARTITIONED         P1                   TABLE PARTITION
JEFF_SUBPARTITIONED      P1_SUB1              TABLE SUBPARTITION
JEFF_SUBPARTITIONED      P2_SUB2              TABLE SUBPARTITION
JEFF_SUBPARTITIONED      P1_SUB2              TABLE SUBPARTITION
JEFF_SUBPARTITIONED      P2_SUB1              TABLE SUBPARTITION
JEFF_UNPARTITIONED                            TABLE

7 rows selected.

 

As you can see, the subpartitioned table shows the subpartition name in the PARTITION_NAME column. It’s not a big deal and I only noticed because I assumed there would be a SUBPARTITION_NAME column whilst I was writing my code…the failure in compilation led me to track this slightly erroneous situation down.

 

Why does this occur?

 

Well, if you delve into the code behind DBA_SEGMENTS you’ll see it refers to another view in the SYS schema called SYS_DBA_SEGS. The SQL behind SYS_DBA_SEGS selects all levels (Table, Partition and subpartitions) from the SYS.OBJ$ view, but then “loses” the partitions when joining to SYS.SYS_OBJECTS (the OBJ# from SYS.OBJ$ does not map to any row in SYS.SYS_OBJECTS via the OBJECT_ID column). The SQL then “loses” the table when joining to SYS.SEG$ – exactly why it does this I don’t fully understand, but I’m guessing it’s because those components of the composite object don’t actually have their own segment to physically store anything in since there are lower levels – in this case the subpartitions.

 

In any event, it’s a little bit of a gotcha and the column could probably do with being renamed to SUB_SEGMENT_NAME perhaps.

Partition exchange loading and ORA-14097

Continuing the theme of this post by Howard, I came across a scenario today which was resulting in this error:

ORA-14097 - column type or size mismatch in ALTER TABLE EXCHANGE PARTITION

I ran my checker script to try and identify exactly what the mismatch was but it came back with nothing. My script, whilst useful, isn’t perfect – and indeed there was an error in it (fixed now) which led to it not identifying the problem for this scenario – but given that it couldn’t find the problem, I had to manually look at all the attributes across the tables to try and identify a difference.

For a long time I was left perplexed because my script was suggesting that everything was OK and my script checks quite a few things now – what I wasn’t taking into account was that the script was wrong and that one of the things it was supposedly checking for, it was in fact overlooking.

In the end I found that a number of columns on the source table were NULLable whilst they were NOT NULL on the target. My script was supposed to be checking for this – which is why I was struggling to fix the problem for so long. After matching the nullability on both tables in the partition, the exchange ran through fine…but I guess my point would be, that the error message above doesn’t really convey the message that the problem might be a mismatch in the optionality of a column or columns on the tables involved.

Being a fan of fictional detective, Sherlock Holmes, I should have considered his position that “It is an old maxim of mine that when you have excluded the impossible, whatever remains, however improbable, must be the truth”…even if that truth is that your own script is at fault!

In attempting to investigate the problem I knocked up a simple script to demonstrate the problem and then the fix:

conn alpha/alpha@j4134

drop table source purge
/

create table source(pk_col1 number not null
,other_col1 number
,other_col2 date
,other_col3 varchar2(20)
)
/

create unique index test_pki on source(pk_col1)
/

alter table source add constraint test_pk primary key(pk_col1) using index
/

grant select on source to beta
/

insert into source(pk_col1,other_col1,other_col2,other_col3)
select l
, l
, trunc(sysdate+l)
, 'XXX'to_char(l)
from (select level l from dual connect by level < 1000)
/
commit
/

conn beta/beta@j4134

drop table target purge
/

create table target(pk_col1 number not null
,other_col1 number not null
,other_col2 date
,other_col3 varchar2(20)
)
partition by range(pk_col1)
(partition p1 values less than(1000)
,partition p2 values less than(2000)
)
/

create unique index test_pki on target(pk_col1) local
/

alter table target add constraint test_pk primary key(pk_col1) using index
/

alter table target exchange partition p1 with table alpha.source
/

conn alpha/alpha@j4134

alter table source modify(other_col1 number not null)
/

conn beta/beta@j4134

alter table target exchange partition p1 with table alpha.source
/

…and the results…

Connected.

Table dropped.


Table created.


Index created.


Table altered.


Grant succeeded.


999 rows created.


Commit complete.

Connected.

Table dropped.


Table created.


Index created.


Table altered.

alter table target exchange partition p1 with table alpha.source
*
ERROR at line 1:
ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION


Connected.

Table altered.

Connected.

Table altered.

Whilst in the example Howard gave, I think the issue revolved around the use of the word “shape” in error ORA-42016 and whether shape includes “data type” or not, error ORA-14097 seems to revolve around whether nullability is included under the phrase “column type” – I think both errors could do with being either slightly reworded or perhaps split out into separate errors which are more indicative of the true problem or problems at hand.

10gR2: OBJECT_NAME missing from V$SQL_PLAN for INSERT statement

I’d written a query the other day to try and identify SQL statements that had been run which wanted a certain degree of parallelism (DOP) and had either run with less than the requested amount of PX slaves or had run serially – in order that we could identify whether the (poor) performance of a query was potentially due to it not getting the resources (PX) we expected it to.

I posted the query on this entry here but have updated it since and it’s on this link now…in case I change it again. It needs more work as it currently doesn’t fully understand how many PX slaves will be used for a given piece of SQL, e.g. Sorts and Set operations (UNION etc…) all add to the requirements…I’m trying to determine the rules for this so I can improve the query in this respect.

Anyway, it seemed to work reasonably well – particularly at finding things that were supposed to run parallel but actually ran serially.

I did, however, notice that even on certain (INSERT) statements that didn’t involve Sorts/Set operations it was showing records where the required DOP (what I thought would be the maximum DOP it was going to request) was actually less than the achieved DOP – this didn’t make sense until I delved deeper and found that the OBJECT_NAME column on the V$SQL_PLAN seems to be NULL on the line relating to the INSERT operation in the plan – running an EXPLAIN PLAN for the statement showed the OBJECT_NAME correctly but only when it’s executed and the V$SQL_PLAN is checked did I find that this column appears to be, unexpectedly, NULL.

With the OBJECT_NAME being NULL it meant that my script was not including the DOP on the target table when determining the object with the highest DOP thereby sometimes getting an incorrect value and resulting in rows where the Required DOP is less than the Achieved DOP.

I created the obligatory test case to illustrate the point before contacting support to raise an SR (5686332.993). It turns out that this is all down to bug 5204810 which is (allegedly) fixed in 11g. The bug is a little unclear in that it only talks about this issue when dealing with conventional inserts and not Direct Path inserts which were those where I’d experienced the problem.

The bug suggests that 11g will add a new line to the execution plans such that it will look like this:


-----------------------------------------
Id Operation Name
-----------------------------------------
0 INSERT STATEMENT
1 LOAD TABLE CONVENTIONAL TABLE_X
-----------------------------------------


The ID 1 line will be visible in 11g the bug suggests and it will then start to show the OBJECT_NAME (TABLE_X in this instance) correctly on this new line rather than a NULL.

Oracle support suggested that although it wasn’t directly mentioned in this bug, it was still the same issue causing the problem for my Direct Path inserts…although I’m not entirely convinced since Direct Path inserts already have an extra line (LOAD AS SELECT) under the INSERT STATEMENT line in their plans…perhaps the fix, whilst including the line for LOAD TABLE CONVENTIONAL for conventional inserts, has also ensured that the OBJECT_NAME is correctly displayed when it’s a Direct Path INSERT and the Operation is LOAD AS SELECT…I’ll try to remember to test it when 11g becomes available.

V$SQL_PLAN bug in 10gR1

I’ve been working on trying to determine where long running queries are at and came across a problem today with V$SQL_PLAN on HP-UX 10gR1
(10.1.0.4.0) that I thought I’d share with you…

Essentially, I was trying to go into V$SQL_PLAN for the problem query, joining up the PARENT_ID and ID columns using CONNECT BY when I discovered that it wasn’t joining things up correctly. Further investigation showed that some of the entries in the table had PARENT_ID values which did not relate to any ID values and could therefore not be CONNECTed BY any means…I’ve since raised a TAR – sorry, Service Request (SR) on the super new whizzy Metalink and the response from Oracle was that it was indeed a bug.

I did some searching on Metalink and the forums and found that there have been problems with V$SQL_PLAN before where the support recommendation was that the problem was fixed in later releases and that a workaround was to just use EXPLAIN PLAN or:

select * from table(dbms_xplan.display_cursor(‘&sql;_id’))
/

For the given SQL_ID – if the plan is still in memory.

It’s always wise before going to Support with a problem to come up with a test case in my view…they will end up asking for one anyway and it’s perfectly reasonable to do so. Here’s the one that I logged with them:

drop table test1;
drop table test2;
create table test1(col1 number, col2 number,col3 number,col4 varchar2(20)); insert into test1 values(1000,2000,3000,’AAA’); insert into test1 values(1001,2000,3001,’BBB’); insert into test1 values(1002,2001,3000,’CCC’); create table test2(col5 number, col6 varchar2(20)); insert into test2 values(2000,’DDD’); insert into test2 values(2001,’EEE’); commit;

WITH i1 AS
(
select /*+ no_merge materialize */ t1.col1
,t1.col2
,t1.col3
,t1.col4
from test1 t1
, test2 t2
where t1.col2 = t2.col5
and t1.col3 = 3000
)
, i2 AS
(
select /*+ no_merge materialize */ t1.col1
,t1.col2
,t1.col3
,t1.col4
from test1 t1
, test2 t2
where t1.col2 = t2.col5
and t1.col3 = 3001
)
select i1.col1
, i1.col2
, i1.col3
, i1.col4
, i2.col4
from i1
, i2
where i1.col2 = i2.col2
/

select operation,id,parent_id,depth,position from v$sql_plan where sql_id=’fj03rbvgpqrp6′ order by id /

On 10gR1 on HP-UX I’m getting duff entries in the output:


OPERATION ID PARENT_ID DEPTH POSITION

SELECT STATEMENT 0 0 127
TEMP TABLE TRANSFORMATION 1 0 1 1
HASH JOIN 10 1 2 1
VIEW 11 10 3 1
TABLE ACCESS 12 11 4 1
VIEW 13 10 3 2
TABLE ACCESS 14 13 4 1
LOAD AS SELECT 15 1 2 2
HASH JOIN 16 2 3 1
TABLE ACCESS 17 3 4 1
TABLE ACCESS 18 3 4 2
LOAD AS SELECT 19 1 2 3
HASH JOIN 20 6 3 1
TABLE ACCESS 21 7 4 1
TABLE ACCESS 22 7 4 2

Notice that there are no ID’s 2,3,6 or 7 and therefore the PARENT_ID’s with those values are plain wrong.

Interestingly, on my 10gR2 on XP at home this worked fine so they probably fixed it in going to 10gR2 me thinks:




OPERATION ID PARENT_ID DEPTH POSITION

SELECT STATEMENT 0 0 14
TEMP TABLE TRANSFORMATION 1 0 1 1
LOAD AS SELECT 2 1 2 1
HASH JOIN 3 2 3 1
TABLE ACCESS 4 3 4 1
TABLE ACCESS 5 3 4 2
LOAD AS SELECT 6 1 2 2
HASH JOIN 7 6 3 1
TABLE ACCESS 8 7 4 1
TABLE ACCESS 9 7 4 2
HASH JOIN 10 1 2 3
VIEW 11 10 3 1
TABLE ACCESS 12 11 4 1
VIEW 13 10 3 2
TABLE ACCESS 14 13 4 1

The PARENT_ID and ID’s CONNECT up no problem now.

This is a bit of a pain cos I need it to work really for my “where is my query at” efforts….grrrr! Given it only fails in this way infrequently and for certain types of query (in my example where there are numerous TEMP TABLE TRANSFORMATIONs) I think I’ll just live with it as a feature in my script.