Table Compression – Order For Maximum Compression – Code Utility

David Aldridge wrote about choosing an ordering method for compressed tables here. I’ve added to his post that I’d written a simple routine to get the number of blocks, rows and space for a table when it’s ordered by each of the columns in the table so I can work out the compressability (is that a word ?) for each column and then choose the best column(s) to order my data by whether it be for maximum compressability or accessibility.

Code is available here. Feel free to use and abuse it for your own environment – usual caveats apply, i.e. what works for me may not work for you and you should test before relying on it! This example runs in a user called AE_MGMT who has the appropriate privileges to access the DBA Views in PL/SQL and also the table being assessed…you might need to play with that in your environment a little.

Execution takes 4 parameters:

TABLE OWNER Defaults to current USER
TABLE NAME Must supply this
PARTITION NAME Defaults to NULL
SAMPLE SIZE Defaults to 1,000,000

Example output:


ae_mgmt[147/13]@HOPE> exec mgmt_p_get_max_compress_order(‘SYSTEM’,’J1′,NULL,100000);
Running for TABLE: J1; SAMPLE_SIZE: 100000
Unique ID: 21112005095941
Creating MASTER Table…
Creating COLUMN Table 1:OWNER…
Creating COLUMN Table 2:OBJECT_NAME…
Creating COLUMN Table 3:SUBOBJECT_NAME…
Creating COLUMN Table 4:OBJECT_ID…
Creating COLUMN Table 5:DATA_OBJECT_ID…
Creating COLUMN Table 6:OBJECT_TYPE…
Creating COLUMN Table 7:CREATED…
Creating COLUMN Table 8:LAST_DDL_TIME…
Creating COLUMN Table 9:TIMESTAMP…
Creating COLUMN Table 10:STATUS…
Creating COLUMN Table 11:TEMPORARY…
Creating COLUMN Table 12:GENERATED…
Creating COLUMN Table 13:SECONDARY…
NAME                        COLUMN          BLOCKS ROWS  SPACE_GB
=========================== =============== ====== ===== ========
TEMP_COL_004_21112005095941 OBJECT_ID          322 99999    .0098
TEMP_COL_008_21112005095941 LAST_DDL_TIME      331 99999    .0101
TEMP_COL_007_21112005095941 CREATED            338 99999    .0103
TEMP_COL_009_21112005095941 TIMESTAMP          340 99999    .0104
TEMP_COL_002_21112005095941 OBJECT_NAME        410 99999    .0125
TEMP_COL_005_21112005095941 DATA_OBJECT_ID     427 99999     .013
TEMP_COL_011_21112005095941 TEMPORARY          433 99999    .0132
TEMP_COL_012_21112005095941 GENERATED          434 99999    .0132
TEMP_COL_010_21112005095941 STATUS             434 99999    .0132
TEMP_COL_013_21112005095941 SECONDARY          434 99999    .0132
TEMP_COL_003_21112005095941 SUBOBJECT_NAME     434 99999    .0132
TEMP_COL_001_21112005095941 OWNER              523 99999     .016
TEMP_COL_006_21112005095941 OBJECT_TYPE        532 99999    .0162

PL/SQL procedure successfully completed.

Elapsed: 00:01:17.07

So, in the above example, if I want to order the table for maximum compression I’d use OBJECT_ID to order the data by.

It only works out the compression for individual columns so combinations are not covered – you can test that individually afterwards once you’ve narrowed down the best columns – I might retest the above with ordering by combinations of the first 4 columns given the big jump in blocks from 340 on TIMESTAMP to 410 on OBJECT_NAME – starting with the first 2 columns, then adding column 3 etc.. until adding the columns has no marginal effect….pointless sorting by more and more columns that aren’t adding anything – will just take more time/resources for the sort.

Of course, as was mentioned on David’s post/link, you might want to order things to suit the access path rather than for maximum compression – your mileage may vary. I built it for work on a fact table where I wanted maximum compression and where the access path would generally be either a partition/full scan or one of several equally common access paths.