Is ASSM evil in a DW?

I saw an interesting article on ASSM from Howard the other day. I wanted to comment on it but unless I’m being a bit thick, it doesn’t seem to be a blogpost and therefore I couldn’t and instead, I thought I’d do it here.

As I said, I found the article interesting and informative but I did have a few “But what about…?” issues that popped into my head as I read through it.

It probably won’t surprise anyone who reads my blog that my queries relate to the use of ASSM on a datawarehouse environment given my interest in that area.

Firstly, Howard asserted that ASSM wouldn’t be helpful in scenarios where you recieved sorted data from a warehouse source which you wanted to load up into a target table in order that you could gain a performance benefit from creating an index on that data whilst taking advantage of the NOSORT option – thereby speeding up the index creation considerably. Nothing Howard said was untrue although there are a number of things that could be discussed in relation to this point…

Firstly, in my experience, most warehouse processing uses Oracle parallel execution for performance – but if one were to load up data into a target table using parallel processing, then even if it were sorted in the source, the use of parallel processing would result in the target being unsorted and hence this advantage would be lost. Of course you could resort to serial processing but would that be more efficient than doing the work in parallel and then sorting the data for the index build again using parallel processing techniques – each process is probably different but I guess what I’m saying is that it’s possible that the NOSORT issue is not necessarily relevant in all cases especially those on a DW using parallel execution.

There are a number of caveats with the use of the NOSORT option such as:

You cannot specify REVERSE with this clause.
You cannot use this clause to create a cluster index partitioned or bitmap index.
You cannot specify this clause for a secondary index on an index-organized table.

Read more here

The use of Direct Path operations on a DW might also mean that the space wastage doesn’t occur given Oracle constructs full blocks from clean ones rather than via free list management or the ASSM equivalent.

Howard suggested that tests he has undertaken showed a 1.5% excess space utilisation when ASSM was involved on a 1Tb table. A long time ago I would have been horified to lose 15Gb but these days I probably wouldn’t be as concerned, particularly as I often find ways to save that kind of storage just as easily in other areas, e.g. Ensuring we use compression appropriately for tables and indexes and ensuring we only have indexes we need and which are of the right type and structure. If the solution to save that 1.5% were simple then I’d consider that too – not using ASSM sounds like a simple solution but I guess it depends on what the DBA standards are for the organisation amongst other factors.

Howard mentioned that with ASSM “wasting” space and using more blocks to store the same number of rows, that would mean a full scan would need to read more blocks for a given table and that the numerous, repeatedly accessed and therefore “hot”, ASSM bitmap blocks could lead to other, “warm” data being aged out sooner than would otherwise be the case – It’s a reasonable hypothesis and if we were full scanning a 1Tb table that had 1.5% or 15Gb of “waste” – I’m making a big assumption due to my lack of research here, in guessing that the 15Gb is largely made up of these ASSM bitmap blocks – then yes, 15Gb of blocks could have a considerable impact on a buffer cache…but that’s assuming you’d be reading the whole 1Tb table via a full scan – one would imagine the table is partitioned by some reasonable key (hopefully including time) and that therefore you’d only be reading a much smaller subset with the consequent reduced effect on the buffer cache. The effect is there but it’s impact would be system/query dependent.

Since the read (full scan) of the table itself would result in blocks going on the LRU end of the LRU list for the buffer cache then that wouldn’t affect the caching of hot/warm data at all – it’s just a matter of whether the ASSM bitmap blocks would be aging out other “warm” data to the detriment of overall system performance.

Whilst it raised a number of queries, it’s definitely the case that the article Howard posted has raised quite a debate within the DBA team on the warehouse I’m currently working on…should be fun this week on the warehouse as we contemplate peforming a number of benchmarks to try and ascertain which way we should jump – if any!

On another note, even though he’s moved on to pastures new, I think Doug Burns must have had something to do with the latest Sky TV adverts as the phone number they wanted you to call was 08702 42 42 42 – now that’s what I call leaving a lasting impression!