Wednesday, July 30, 2008
Scripts for loading up DBGEN TPC-H data
Unix Scripts:
Multi Load TPCH
Load TPCH Stream
SQL*Loader Control files:
REGION Table
NATION Table
SUPPLIER Table
CUSTOMER Table
PART Table
PARTSUPP Table
ORDERS Table
LINEITEM Table
You may wish to read and check them before you use them - they're not exactly rocket science but they seem to do the job.
I have all the files in the same directory for simplicity sake.
I then use them to create a scale factor 1 target TPC-H schema using the following calls:
# "l" loads the REGION and NATION tables
./multi_load_tpch.sh 1 l "tpch/tpch@test" 1
# "s" loads the SUPPLIER table
./multi_load_tpch.sh 1 s "tpch/tpch@test" 10
# "c" loads the CUSTOMER table
./multi_load_tpch.sh 1 c "tpch/tpch@test" 10
# "p" loads the PART and PARTSUPP tables
./multi_load_tpch.sh 1 p "tpch/tpch@test" 10
# "o" loads the ORDERS and LINEITEM tables
./multi_load_tpch.sh 1 o "tpch/tpch@test" 10
Obviously, you need to change the connection string to match your environment.
Caveats:
- Obviously, they are supplied as is - use at your own discretion and risk.
- You need to have created the target schema tables and made sure they are empty as the SQL*Loader control files use APPEND.
- Bear in mind that choosing too high a number of parallel streams (the last parameter in the calls) will overload your machine so try and balance it against the available system resources.
Bugs, issues or questions, please get in touch...enjoy.
Labels: benchmarking, code, dba, DW, tuning
Monday, July 21, 2008
Creating a TPC-H schema with DBGEN on HP-UX
Given that I also need to use this tool to help with some benchmarking at work, I decided to try to get DBGEN to run on a HP-UX box today and had one or two problems which I managed to sort out. The machine in question is an RP8420 running HP-UX B.11.11.
DBGEN is a utility that allows you to create a series of flat files which contain the data for a TPC-H schema. You can then use SQL*Loader to load these into appropriately constructed tables in an Oracle database - any database actually...but I only care about Oracle of course ;-)
The utility can be called with various parameters including making the target datasets in smaller "child" files which can be created in a manually parallelised fashion to speed the whole process up. You have to download the DBGEN reference data set from the TPC website (lower right hand side).
This reference data set contains the ANSI C source code which makes the DBGEN executable (and QGEN also...but more on that another day)...unfortunately it's just the source code, so that means you need to compile it yourself...which of course leads to the first problem...that I know diddly squat about C...yeah I know, not very manly! Luckily I can sometimes follow instructions (which come with the reference data set)...
1. Copy makefile.suite to makefile
2. Edit makefile and make the following amendments (in red):################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= ORACLE
MACHINE = HP
WORKLOAD = TPCH
That's it for the makefile.
Now, as I mentioned, the ORACLE database is not a listed database variant in the DBGEN C code - it's got all the other popular RDBMS which I find quite bizarre...I'm sure there's a reason, but I can't think of one. To get around this, as per Chapter 5 in “Pro Oracle Database 10g RAC on Linux” by Steve Shaw and Julian Dyke, I added a section to the tpcd.h for the ORACLE database:
#ifdef ORACLE
#define GEN_QUERY_PLAN ""
#define START_TRAN ""
#define END_TRAN ""
#define SET_OUTPUT ""
#define SET_ROWCOUNT ""
#define SET_DBASE ""
I thought that was it - but it still would't compile, giving the error:
config.h:213:2: #error Support for a 64-bit datatype is required in this release
Looking at the config.h - and bearing in mind I'm no C programmer - it struck me as odd that all bar the HP machine section, had stuff about DSS_HUGE and 64 bits...so I took a punt and copied some lines (in red) from the IBM section into the HP one to see if it worked...and it did. The HP section now looks like this:
#ifdef HP
#define _INCLUDE_POSIX_SOURCE
#define STDLIB_HAS_GETOPT
#define DSS_HUGE long long
#define HUGE_FORMAT "%lld"
#define HUGE_DATE_FORMAT "%02lld"
#define RNG_A 6364136223846793005ull
#define RNG_C 1ull
#endif /* HP */
I then downloaded the TPC-H specification document which has, amongst other things, the data model, from which I cross checked the column ordering of the data model against the columns in the output files and then managed to rerun the data in without any further issues.
Tomorrow I'll try running HammerOra against the target TPC-H schema and make some shell scripts to try and automate most of the process so we can build different scale factor schemae and do so in a manually parallelised fashion - scripts the amiable Scotsman created for his parallel testing a while back should give me a good start with that.
Labels: benchmarking, bugs, code, dba, tuning
Saturday, July 12, 2008
PC for manly men?
- Buy a Dell or HP high end PC from their website and pay serious money for it.
- Pick a proper server off Ebay - cheaper but may have pitfalls including warranty, dodgy sellers and delivery. (it was interesting to look for E10K Sun boxes on there)
- Spec a PC myself and get a box shifter to build and ship it.
- Upgrade my current PC with a selection of new bits.
I chose #3 and bought it from a company call Scan. I'm pretty happy with the result and the service I received although due to some DOA parts it took a little longer than I'd hoped...at least they had the problems to deal with instead of me!
#1 is expensive and you're sort of limited to the options they offer. I costed up something similar to what I've ended up buying and it was nearer 3,000 pounds rather than the 1800 ish that I paid.
#2 is cheaper than #1 but these type of machines are really noisy, albeit solid pieces of kit and more akin to what I'd work on during my day job.
#4 and #3 are similar except for who gets the grief of making all the new bits work together, and every time I try to build things myself I get grief with it (usually parts arriving DOA or incompatible with each other). Scan had to deal with a DOA motherboard and CPU amongst other things...rather them than me.
So, #3 it was...and it arrived a few days ago.
Specification is:
- Tyan S2932 Server motherboard
- Two, dual core AMD Opteron 2218 2.6GHz processors
- 8Gb of RAM (4 x 2Gb DDR2 667MHz)
- Six, 1 TB Samsung HD103UJ Spinpoint F1, SATA 300, 7200 rpm, 32MB Cache, 8.9 ms, NCQ drives
- Antec P190 midi tower case
Pictures (Click on them for bigger images) below:


Not quite an "enterprise server" and I'm sure it pales into insignificance against any of the kit Kevin uses, but pretty quick.
I've configured it for dual boot of Vista 64 Ultimate and Oracle Enterprise Linux 5 (using EasyBCD) and I'm about to start doing some installations and benchmarking...should be fun.
I installed VirtualBox on the Vista 64 OS and it's working very nicely...well, it's set up and working...we'll find out how nicely it's working when I install Oracle and HammerOra and give it a bit of a kicking!
Who knows, I might even find time to blog about it!
Labels: benchmarking
Tuesday, April 15, 2008
Oracle Optimized Warehouse Initiative (OWI)
It was an interesting day, although I didn't really hear anything new, per se. I think the main things I took away from the session were:
- The availability of a set of reference configurations providing matrices which cover various permutations of user count, storage cost focus, warehouse size and hardware vendor.
- The possibility of using the reference configurations either minimally, to simply cross check a proposed hardware specification for a given workload, to ensure it seems appropriate, or going the whole hog and using an "out of the box" reference configuration, delivered to your office, fully configured with all software installed, in a box, ready to run in your data.
- Oracle are pushing RAC and ASM heavily in the DW space - no surprise there.
- HammerOra and ORION are used by Oracle and the hardware vendors to assess the reference configurations...and there is nothing stopping you using them for your own benchmarking efforts
It was interesting to hear about the Proof Of Concept facility that Sun has, in Linlithgow, Scotland. The facility allows Sun (and an Oracle customer) to take a server off the production line and, working with the customer, test their workload on that machine to see if it's going to work or not. Neat, and since we're going to be using some Sun kit for a DW shortly, it sounds like an opportunity.
Funniest thing of the day for me, was the last slide in the pitch given by Mike Leigh of Sun which had the title "The Power of 2" and was illustrating the benefits to customers of Oracle and Sun as a united force. I didn't really take much notice, as I was too busy smiling, as I looked at the title and it made me think of Doug and his Parallel Execution and the 'Magic of 2' paper (the Magic of 2 bit actually being from this paper by Cary).
If you're building a warehouse, or just want to get an idea of whether your hardware is appropriate for the job, it's probably worth reading up on the OWI.
Thursday, April 10, 2008
Problem with _gby_hash_aggregation_enabled parameter
For three days, my colleagues in the support team on one of the warehouses I'm involved in, were struggling with a piece of code which was exhausting the available temp space and after trying everything they could think of, they asked me to take a look. I must admit I was a little baffled at first because the piece of code in question had been happily running for some time now and every time I'd run it, I'd never noticed that TEMP was anywhere near being exhausted so, whilst I could tell the process had some kind of problem, I was in the dark as to exactly what that problem was.
After trying to break down the large query into several smaller steps, I realised that it was an early step in the query that was exhibiting the problem of temp exhaustion - the step being a pivot of around 300 million measure rows into a pivot target table.
This process runs monthly and it had run without issue on all of the previous 20 occurrences and for roughly the same, if not more rows to pivot...so it didn't appear to be the volume that was causing the problem.
There had been no changes in the code itself for many months and the current version of the code had been run successfully in previous months, so it didn't appear to be a code fault.
I obtained the actual execution path for the statement whilst it was running and it looked reasonable, although looking at it, triggered a thought in my mind...what if something had changed in the database configuration?
Why would I get that thought from looking at the execution path?
Well, a while back, we had received some advice that a line in our init.ora as follows, should be changed to set the feature to TRUE instead of FALSE, so that the feature became active:
_gby_hash_aggregation_enabled = FALSE
This results in a line in the plan that reads:
HASH GROUP BY
instead of
SORT GROUP BY
The parameter was set to FALSE due to a known bug and the issues we'd seen with it, however the recent advice we'd received, indicated that the bug had been resolved at the version level we were on and that by enabling the feature - which enables GROUP BY and Aggregation using a hash scheme - we'd gain a performance boost for certain queries.
So, the DBA team researched the advice and it appeared to be the case, that the bug (4604970) which led to the disabling of the feature was fixed at our version level (10.2.0.3 on HP-UX). We duly turned on the feature in a pre production environment and ran it for a while without noticing any issues. We then enabled it in production and again, for a while, we've not noticed any issues...until now.
After a check back through the logs, it appeared that since the parameter was first enabled, the queries which were now failing, had not been run at all...they had only run prior to the parameter change...so with my suspicions aroused further, I disabled the feature at the session level and reran the process. It completed in a normal time frame and used a small amount of TEMP - hooray!
So, now we have to go back to support to try and understand if the original bug is not quite fixed or whether this is a different scenario...in any event, we're going to disable the feature for now, even though we're only getting problems with the feature on 2 processes out of perhaps thousands.
So, what's the lesson to learn?
Well, quite simply, that you need to have a thorough audit of what configuration changes you've made together with a good audit of the processes you've run so that you can work out what has changed since the last time you successfully ran a process. This gives you a fighting chance of spotting things like the above.
Monday, April 07, 2008
Get Human!
Enjoy.
Labels: social
Tuesday, February 26, 2008
Tracking TEMP usage throughout the day
I have various queries I run whilst I'm online and processes are running, but what I really wanted, was to know the usage profile throughout the day...so, given that I couldn't find an AWR way of tracking our use of TEMP on a database, I figured I'd use a more klunky method...
DROP TABLE mgmt_t_temp_use_history PURGE
/
CREATE TABLE temp_use_history(snap_date DATE NOT NULL
,sid NUMBER NOT NULL
,segtype VARCHAR2(9) NOT NULL
,qcsid NUMBER NULL
,username VARCHAR2(30) NULL
,osuser VARCHAR2(30) NULL
,contents VARCHAR2(9) NULL
,sqlhash NUMBER NULL
,sql_id VARCHAR2(13) NULL
,blocks NUMBER NULL
)
PCTFREE 0
COMPRESS
NOLOGGING
/
CREATE UNIQUE INDEX tuh_pk ON
temp_use_history(snap_date,sid,segtype)
PCTFREE 0
COMPRESS
NOLOGGING
/
ALTER TABLE temp_use_history ADD CONSTRAINT tuh_pk PRIMARY
KEY(snap_date,sid,segtype)
USING INDEX
/
DECLARE
l_program_action VARCHAR2(2000);
l_27477 EXCEPTION;
PRAGMA EXCEPTION_INIT(l_27477,-27477); BEGIN
BEGIN
DBMS_SCHEDULER.CREATE_SCHEDULE(
schedule_name => 'MINUTELY_5M'
,start_date => SYSDATE
,repeat_interval => 'FREQ=MINUTELY;INTERVAL=5'
,comments =>'Daily schedule to run a job every five minutes.'
);
EXCEPTION
WHEN l_27477 THEN
NULL; -- Ignore if the schedule exists
END;
l_program_action := 'DECLARE';
l_program_action := l_program_action||' l_date DATE := SYSDATE; ';
l_program_action := l_program_action||'BEGIN';
l_program_action := l_program_action||' INSERT /*+ APPEND */ INTO temp_use_history(snap_date,sid,qcsid,username,osuser,contents,seg
type,sqlhash,sql_id,blocks)';
l_program_action := l_program_action||' SELECT l_date,s.sid,ps.qcsid,s.username,s.osuser,su.contents,su.segtype,su.sqlh
ash,su.sql_id,sum(su.blocks)';
l_program_action := l_program_action||' FROM v$sort_usage su';
l_program_action := l_program_action||' , v$session s';
l_program_action := l_program_action||' , v$px_session ps';
l_program_action := l_program_action||' WHERE s.sid=ps.sid(+)';
l_program_action := l_program_action||' AND s.saddr =
su.session_addr';
l_program_action := l_program_action||' AND s.serial# =
su.session_num';
l_program_action := l_program_action||' GROUP BY s.sid,ps.qcsid,s.username,s.osuser,su.contents,su.segtype,su.sqladdr,su.
sqlhash,su.sql_id;';
l_program_action := l_program_action||' COMMIT; ';
l_program_action := l_program_action||'END;';
BEGIN
DBMS_SCHEDULER.CREATE_PROGRAM(
program_name => 'SNAP_TEMP_USAGE'
,program_type => 'PLSQL_BLOCK'
,program_action => l_program_action
,enabled => TRUE
,comments => 'Program to snap the temp usage into TEMP_USE_HISTORY'
);
EXCEPTION
WHEN l_27477 THEN
NULL; -- Ignore if the program exists
END;
BEGIN
DBMS_SCHEDULER.CREATE_JOB(
job_name => 'JOB_SNAP_TEMP_USAGE'
,program_name => 'SNAP_TEMP_USAGE'
,schedule_name => 'MINUTELY_5M'
,enabled => TRUE
,auto_drop => FALSE
,comments => 'Job to snap the temp usage into TEMP_USE_HISTORY'
);
EXCEPTION
WHEN l_27477 THEN
NULL; -- Ignore if the job exists
END;
END;
/
I can now run queries against the TEMP_USE_HISTORY table to show how much TEMP has been used, when, by whom and for what use, e.g.
SQL> ed
Wrote file afiedt.buf
1 select snap_date,round(sum(blocks)*32/1024/1024) gb
2 from temp_use_history
3 where snap_date > sysdate-1
4 group by snap_date
5 having round(sum(blocks)*32/1024/1024) > 50
6* order by 1
SQL> /
SNAP_DATE GB
-------------------- ----------
26-FEB-2008 16:12:25 57
26-FEB-2008 16:17:25 65
26-FEB-2008 16:22:25 74
26-FEB-2008 16:27:25 86
26-FEB-2008 16:32:25 95
26-FEB-2008 16:37:25 107
26-FEB-2008 16:42:25 121
26-FEB-2008 16:47:25 127
26-FEB-2008 16:52:25 147
26-FEB-2008 16:57:25 160
26-FEB-2008 17:02:25 162
26-FEB-2008 17:07:25 179
26-FEB-2008 17:12:25 196
26-FEB-2008 17:17:25 208
26-FEB-2008 17:22:25 217
26-FEB-2008 17:27:25 233
26-FEB-2008 17:32:25 241
26-FEB-2008 17:37:25 251
26-FEB-2008 17:42:25 257
26-FEB-2008 17:47:25 262
26-FEB-2008 17:52:25 264
26-FEB-2008 17:57:25 267
26-FEB-2008 18:02:25 201
27-FEB-2008 00:27:25 59
27-FEB-2008 00:32:25 60
27-FEB-2008 00:37:25 69
27-FEB-2008 01:12:25 57
27-FEB-2008 02:22:25 53
27-FEB-2008 09:57:25 51
29 rows selected.
From the above, I can see that, during the last twenty four hours on the database in question, there was reasonably heavy use of the TEMP area between 4pm and 6pm yesterday, and that the load peaked at approximately 267Gb.
Thursday, February 21, 2008
Getting multi row text items from Designer repository
What I wanted, was to be able to extract, using a SQL query against the designer repository, the "Description" attribute from our Tables so that I could create table comments in the database which had the description from the table in Designer.
Extracting scalar attributes is simple enough, however, description is a multi row text property so it needed a bit more thought...and rather than reinvent the wheel, I had a look at Lucas' stuff and sure enough found the post above. It was almost what I wanted, only I had to change it to look for table stuff rather than entities and attributes.
I used the same TYPE and sum_string_table function as Lucas so if you're trying to use this stuff you'll probably need to read Lucas' article first.
The query I ended up with is below...it's been "sanitized" somewhat, but I'm sure you'd get the picture, if retrieving stuff out of the designer repository is a requirement of yours.
WITH dsc AS
(
SELECT txt.txt_text
, txt.txt_ref
FROM cdi_text txt
WHERE txt.txt_type = 'CDIDSC'
)
, add_cast AS
(
SELECT appsys.name application_system_name
, b.name table_name
, b.alias
, CAST(COLLECT(dsc.txt_text) AS string_table) tab_description
FROM dsc
, designer.ci_application_systems appsys
, designer.ci_app_sys_tables a
, designer.ci_table_definitions b
WHERE dsc.txt_ref = a.table_reference
AND b.irid = a.table_reference
AND a.parent_ivid = appsys.ivid
GROUP BY appsys.name
, b.name
, b.alias
)
SELECT application_system_name
, table_name
, alias
, sum_string_table(tab_description)
FROM add_cast
WHERE application_system_name = 'MY_APP_SYS'
and table_name = 'MY_TABLE_NAME'
/
Thanks Lucas!
On another note, regular visitors may realise I've now got my own oramoss dot com domain and that my blogger blog is published on that domain now.
Thanks to Andreas Viklund for the template.
If anyone sees anything untoward with the new site please feel free to drop me a note. It's a bit thin on content but I'll work on that over time.
Wednesday, January 23, 2008
Advert: YouGoDo dot com
Some friends of mine have started a new website Called YouGoDo.
Billed as "The World's Playground", offering users a quick and simple answer to questions like:
I'm off to [location], what can I do there?
I want to do [activity], where can I do it?
I want to do [activity] in [location], who provides this service?
If you're interested, feel free to try them out.
Labels: social
Wednesday, January 09, 2008
Start the new year with a game of tag!
The idea behind the tagging being to “learn more about the people you interact with using New Web”.
So, here are eight pieces of trivia about myself:
1. I am the son of a couple of geordies. My dad being a miner and my mum being a machinist and taxi driver in her time. Mum and dad were married at Gretna Green. Mum is also fairly rare in that she is a successful heart transplant patient of nearly ten years.
2. I love animals - all animals really, but in particular dogs, and in particular Boxer dogs, and even more particularly white boxer dogs - and no, to scotch a common myth, not all white boxer dogs are deaf - it's the same percentage as all breeds of dog that are deaf - around 20%.
3. I'm not a vegetarian, despite my love of animals, however my wife is pretty much a vegetarian...which obviously makes life interesting in the kitchen department.
4. If I wasn't an Oracle person then I'd be a chef. In all truth, the only reason I haven't switched from being an Oracle person to a chef is that I don't think I could earn as much money in that field. Yes, Gordon Ramsey earns millions but it's taken him decades to get there and I'm not sure I'm as capable in that field as I am in the Oracle one. So the plan will be to work in Oracle until such time as I think I've comfortable and then I'll switch to something in the arena of food and drink.
5. I love car racing - particularly Formula 1. Some wifes are "Golf Widows" - mine is an "F1 Widow" and calls me a "saddo" whenever I'm surfin' on my laptop in the evening and she spots me looking at the ITV 1 Formula 1 website. Obviously I think I could go a fast as Lewis Hamilton but I'm all too aware that I'd probably end up looking like Richard Hammond if I tried!
6. I'm addicted to reading - both fiction and non. My ideal holiday being one where I just sit by the pool/on the beach, reading constantly for a fortnight.
7. Speaking of ideal holiday...my favourite place is the carribean, but then what's not to like there! My favourite island being St Lucia since that's where I got married to Amanda.
8. I have a burning desire to live in another country - preferably somewhere hot. Whether my desire is fulfilled who knows...watch this space!
That's my eight things...so over to a new set of people to continue the chain...I've no idea if these folk have been tagged or not yet - apologies if they have and the chain isn't extended but ultimately, isn't that inevitable?
Kevin Closson
Tanel Poder
Nicholas Goodman
Mogens Nogaard
Anjo Kolk
Pete Finnigan - how ironic with this tagging being effectively a virus!
Eric S Emrick
Christian Bilien
Labels: social
Tuesday, September 25, 2007
CALIBRATE_IO Poll
If you'd like to contribute please visit using the following link:
Oracle 11g CALIBRATE_IO Results
No prizes for the "biggest"...Kevin is bound to have something in his lab that beats us all! ;-)
Thanks
11g IO Calibration tool
Obviously, any experiment that goes flawlessly according to a preconceived plan is:
1. Boring
2. Less educational
3. Not normally one I've done - mine always have problems it seems!
I ran the calibration on my VMWare based OpenSuse 10 linux with Oracle 11g and it immediately came up with a problem:
SQL> @io
SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 lat INTEGER;
3 iops INTEGER;
4 mbps INTEGER;
5 BEGIN
6 -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (, , iops, mbps, lat);
7 DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat);
8
9 DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
10 DBMS_OUTPUT.PUT_LINE ('latency = ' || lat);
11 dbms_output.put_line('max_mbps = ' || mbps);
12 end;
13 /
DECLARE
*
ERROR at line 1:
ORA-56708: Could not find any datafiles with asynchronous i/o capability
ORA-06512: at "SYS.DBMS_RMIN", line 453
ORA-06512: at "SYS.DBMS_RESOURCE_MANAGER", line 1153
ORA-06512: at line 7
Of course, consulting the manual led me to run this query:
SELECT name, asynch_io
FROM v$datafile f,v$iostat_file i
WHERE f.file# = i.file_no
AND filetype_name = 'Data File'
/
which gave:
SQL> /
NAME ASYNCH_IO
-------------------------------------------------- ---------
/home/oracle/oradata/su11/system01.dbf ASYNC_OFF
/home/oracle/oradata/su11/sysaux01.dbf ASYNC_OFF
/home/oracle/oradata/su11/undotbs01.dbf ASYNC_OFF
/home/oracle/oradata/su11/users01.dbf ASYNC_OFF
/home/oracle/oradata/su11/example01.dbf ASYNC_OFF
...or in other words no asynchronous IO available - as the error message had said.
After altering the filesystemio_options parameter to "set_all" and bouncing the instance, a second run of the calibration process seemed to work fine...
SQL> @io
SQL> SET ECHO ON
SQL> SET SERVEROUTPUT ON
SQL> DECLARE
2 lat INTEGER;
3 iops INTEGER;
4 mbps INTEGER;
5 BEGIN
6 -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (, , iops, mbps, lat);
7 DBMS_RESOURCE_MANAGER.CALIBRATE_IO (2, 10, iops, mbps, lat);
8
9 DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
10 DBMS_OUTPUT.PUT_LINE ('latency = ' || lat);
11 dbms_output.put_line('max_mbps = ' || mbps);
12 end;
13 /
max_iops = 72
latency = 13
max_mbps = 26
PL/SQL procedure successfully completed.
So,my figures are considerably lower than those Doug achieved:
max_iops = 112
latency = 8
max_mbps = 32
...but not too bad I guess considering the fact that mine is a VM and the hardware I'm running is more humble...no seriously, size does not matter!
Tuesday, September 04, 2007
11g PX tracefiles now have the tracefile identifier on them
In 10gR2 (10.2.0.2.0 specifically) the trace files would come out with names in this format:
<instance name>_<PX Slave identifier>_<background process ID>.trc
e.g. fred_p001_6789.trc
In 11gR1 (11.1.0.6.0 specifically) the trace files come out with names in this format:
<instance name>_<PX Slave identifier>_<background process ID>_<Trace file identifier>.trc
e.g. fred_p001_5678_jeff.trc
This assumes you've set the tracefile identifier in the first place, otherwise that bit won't be present. Use the following to set it, choosing whatever identifier you require of course:
alter session set tracefile_identifier='jeff';
It was interesting that the location of such files has also changed due to the implementation of Automatic Diagnostic Repository (ADR). More information on that here.
Monday, August 13, 2007
ORA-07455 and EXPLAIN PLAN...and statements which, perhaps, shouldn't run
I encountered a scenario today which I thought was strange in a number of ways...hence, irresistible to a quick blog post.
The scenario started with an end user of my warehouse emailing me a query that was returning an error message dialog box, warning the user before they ran the query, that they had insufficient resources to run said query - ORA-07455 to be precise.
I figured, either the query is one requiring significant resources - more resources than the user has access to, or the query has a suboptimal plan, whereby it thinks it will require more resources than they have access to.
To try and determine which, I logged into the same Oracle user as my end user and tried to get an explain plan of the query - so I could perhaps gauge whether there were any problems with the choice of execution path and whether the query was one which would indeed require significant resources.
The result was that it came back with the same error - which quite surprised me at first.
In using EXPLAIN PLAN, I wasn't asking the database to actually run the query - merely to tell me what the likely execution path was for the query and yet, it appears to still do the checks on resource usage. At first, that seemed strange to me, in the sense that I wouldn't be requiring those resources since, I'm not actually executing the statement, yet perhaps it does makes sense - or at least is consistent, because, for example, you don't need access to all the objects in the query if you're not going to actually execute it, yet quite rightly, the optimizer does checks as to whether you have the appropriate access permissions to each object as part of the EXPLAIN PLAN process.
That was educational point number one for me.
After logging in as another user with unlimited resource usage, I then reran the EXPLAIN PLAN and the statement was accepted and the plan returned...indicating an unpleasant rewrite of the query, and a very high anticipated cost - in excess of the limit for that end user.
That explained why the ORA-07455 was appearing for them, but highlighted an altogether different issue which perplexed me further. There follows a simple reconstruction of the query and explain plan results:
First the obligatory test script...
SET TIMING OFF
DROP TABLE tab1 PURGE
/
CREATE TABLE tab1
(col1 VARCHAR2(1))
/
DROP TABLE tab2 PURGE
/
CREATE TABLE tab2
(col2 VARCHAR2(1))
/
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => USER
,tabname => 'TAB1'
);
DBMS_STATS.GATHER_TABLE_STATS(ownname => USER
,tabname => 'TAB2'
);
END;
/
INSERT INTO tab1 VALUES('A')
/
INSERT INTO tab1 VALUES('B')
/
INSERT INTO tab1 VALUES('A')
/
INSERT INTO tab1 VALUES('B')
/
INSERT INTO tab2 VALUES('C')
/
INSERT INTO tab2 VALUES('D')
/
COMMIT
/
SET AUTOTRACE ON
SELECT *
FROM tab1
WHERE col1 IN (SELECT col1 FROM tab2)
/
SET AUTOTRACE OFF
Now the results...
Table dropped.
Connected.
Table dropped.
Table created.
Table dropped.
Table created.
PL/SQL procedure successfully completed.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
Commit complete.
C
-
A
B
A
B
4 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 4220095845
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 4 (0)|00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | TAB1 | 1 | 2 | 2 (0)|00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | TABLE ACCESS FULL| TAB2 | 1 | | 2 (0)|00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT /*+ */ 0 FROM "TAB2" "TAB2" WHERE
:B1=:B2))
3 - filter(:B1=:B2)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
18 consistent gets
0 physical reads
0 redo size
458 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
4 rows processed
Now, when I first saw the query I thought, hang on a minute, COL1 does not exist in table TAB2 so this query should not even execute...but it does! I don't think it should execute personally but according to the documentation, "Oracle resolves unqualified columns in the subquery by looking in the tables named in the subquery and then in the tables named in the parent statement.", so it is operating as described in the manuals- even if, in my view, it's a little odd since without a rewrite, the query is incapable of executing.
The query has been rewritten with an EXISTS approach - note the first FILTER statement in the "Predicate Information" section of the autotrace output. A bit like this:
SELECT a.*
FROM tab1 a
WHERE EXISTS (SELECT 0
FROM tab2 b
WHERE a.col1 = a.col1
)
/
The subquery is always going to return a row, hence, for any row we select in the containing query, we will always get that row back, because the EXISTS will always find a match - it's a bit like saying "WHERE TRUE" I guess.
Interestingly, my friend Jon first brought this scenario to my attention last week in various discussions with him and another of my colleagues, who is far more experienced than myself. To be fair, the experienced colleague is the source of a number of my blogging posts, but he's painfully shy and will therefore remain nameless.
I was educated during that discussion, that this functionality is as advertised in the manuals - even if it doesn't sit well with me. My closing line to my fellow debaters at the time, was that nobody would ever write SQL like that and if they did I'd tell them to rewrite it using aliases and so that it made sense - as is often the case in life though, the very next week, a real life user comes up with exactly that scenario - at least I was prepared!
Labels: cbo, code, errors, tuning
Saturday, August 04, 2007
Fifteen Minutes of fame as an animal welfare supporter
My wife and I were happy to receive a call from Respect For Animals the other day, to inform us that our names would appear on a two page advert in the August issue of the BBC Wildlife magazine, in support of the campaign to stop the seal cull in Canada.
I'd never read the magazine before, but after buying it to see the advert, I was quite pleased that it was a really good read for those of us interested in the animal kingdom.
Labels: social
Subscribe to Posts [Atom]
