TPC-H Query 20 and optimizer_dynamic_sampling

Oct 30, 2008 Benchmarking, Oracle, TPC-H

I was working with Jason Garforth today on creating a TPC-H benchmark script which we can run on our warehouse to initially get a baseline of performance, and then, from time to time, rerun it to ensure things are still running with a comparable performance level.

This activity was on our new warehouse platform of an IBM Power 6 p570 with 8 dual core 4.7GHz processors, 128GB RAM and a 1.6GB/Sec SAN.

Jason created a script to run the QGEN utility to generate the twenty two queries that make up the TPC-H benchmark and also a “run script” to then run those queries against the target schema I had created using some load scripts I talked about previously.

The whole process seemed to be running smoothly with queries running through in a matter of seconds, until query twenty went off the scale and started taking ages. Excluding the 20th query, everything else went through in about three to four minutes, but query twenty was going on for hours, with no sign of completing.

We grabbed the actual execution plan and noticed that all the tables involved had no stats gathered. In such circumstances, Oracle (10.2.0.4 in this instance) uses dynamic sampling to take a quick sample of the table in order to come up with an optimal plan for each query executed.

The database was running with the default value of 2 for optimizer_dynamic_sampling.

After reading the TPC-H specification, it doesn’t say that stats should or should not be gathered, but obviously in gathering them, there would be a cost to doing so and, depending on the method of gathering and the volume of the database, the cost could be considerable. It would be interesting to hear from someone who actually runs audited TPC-H benchmarks to know whether they gather table stats or whether they use dynamic sampling…

We decided we would gather the stats, just to see if the plan changed and the query executed any faster…it did, on both counts, with the query finishing very quickly, inline with the other twenty one queries in the suite.

So, our options then appeared to include, amongst other things:

    So, always read NF Cure and Vital M-40 capsules review before paying money for it, because there are too many products available in the market, which come at competitive prices to enable the people to avail these at an affordable cost. viagra for women price http://raindogscine.com/?attachment_id=300 Drink plenty of water to viagra for women price help flush away acidic waste products in the muscles can cause muscle irritation and pain. There were buy viagra online two manufacturers of chemicals which were closed with the help of authorities from both China and India. A thorough sexual history and assessment sildenafil bulk of overall health is important in pinpointing the problem.

  1. Gather the table stats. We’d proved this worked.
  2. Change the optimizer_dynamic_sampling level to a higher value and see if it made a difference.
  3. Manually, work out why the plan for the query was wrong, by analysis of the individual plan steps in further detail and then use hints or profiles to force the optimizer to “do the right thing”.

We decided to read a Full Disclosure report of a TPC-H benchmark for a similar system to see what they did. The FDR included a full listing of the init.ora of the database in that test. The listing showed that the system in question had set optimizer_dyamic_sampling to 3 instead of the default 2…we decided to try that approach and it worked perfectly.

In the end, given we’re not producing actual audited benchmarks then we’re free to wait for the gathering of optimizer stats, so we’ll go with that method, but it was interesting to see that option 2 above worked as well and illustrates the point that there is a lot of useful information to be gleaned from reading the FDRs of audited benchmarks – whilst, of course, being careful to read them with a pinch of salt, since they are not trying to run your system.

Another thing of interest was that in order to get the DBGEN utility to work on AIX 6.1 using the gcc compiler, we had to set an environment variable as follows otherwise we got an error when running DBGEN (also applies to QGEN too):

Set this:

export LDR_CNTRL=MAXDATA=0x80000000@LARGE_PAGE_DATA=Y

otherwise you may get this:

exec(): 0509-036 Cannot load program dbgen because of the following errors:
0509-026 System error: There is not enough memory available now.

By Jeff

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.