RANK v DENSE_RANK v ROW_NUMBER

Yesterday, one of my users asked me about the difference between the RANK and ROW_NUMBER analytic functions, so here is a post on it…

RANK, ROW_NUMBER and DENSE_RANK are very useful for taking a set of rows and ordering them in a defined manner, whilst giving each row a “position value”. They differ in how they assign that position value when rows tie on the ordering columns. When there are no ties, all three give the same values; when there are ties, they differ.

An example based on the SCOTT.EMP table helps to illustrate…

SELECT empno
,      sal
,      RANK() OVER(ORDER BY sal) rank_position
,      DENSE_RANK() OVER(ORDER BY sal) dense_rank_position
,      ROW_NUMBER() OVER(ORDER BY sal) row_number_position
FROM   emp
/

which returns, on my 11gR1 database…

     EMPNO        SAL RANK_POSITION DENSE_RANK_POSITION ROW_NUMBER_POSITION
---------- ---------- ------------- ------------------- -------------------
      7369        800             1                   1                   1
      7900        950             2                   2                   2
      7876       1100             3                   3                   3
      7521       1250             4                   4                   4
      7654       1250             4                   4                   5
      7934       1300             6                   5                   6
      7844       1500             7                   6                   7
      7499       1600             8                   7                   8
      7782       2450             9                   8                   9
      7698       2850            10                   9                  10
      7566       2975            11                  10                  11
      7788       3000            12                  11                  12
      7902       3000            12                  11                  13
      7839       5000            14                  12                  14

14 rows selected.

Notice that RANK has given the two employees with SAL=1250 the same position value of 4, and the two employees with SAL=3000 the same position value of 12. Notice also that RANK skips position values 5 and 13, as it has two entries for 4 and 12 respectively. RANK uses all numbers between 1 and 14, except 5 and 13. RANK has both repeats and gaps in its ordering.

DENSE_RANK is similar to RANK, in that it gives the two employees with SAL=1250 the same position value of 4, but it does not then skip over position value 5; it simply carries on at position 5 for the next value. DENSE_RANK uses all numbers between 1 and 12 for the position values, without leaving any out, and uses 4 and 11 twice. DENSE_RANK has no gaps in its ordering, only repeats.

ROW_NUMBER gives each row a unique position value and consequently uses all the numbers between 1 and 14. ROW_NUMBER has no gaps or repeats in its ordering. Note that the position value from ROW_NUMBER is not deterministic for rows that tie, since the ORDER BY clause only has SAL in it: the two employees with SAL=1250 could swap positions 4 and 5 between executions. If you want to ensure the order is the same each time, you need to add further columns to the ORDER BY clause so that the ordering is unique, for example EMPNO.
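To make the arithmetic behind the three position values concrete, here is a minimal sketch in plain Python. It is not how Oracle implements the analytic functions; it only reproduces the same numbers from the (EMPNO, SAL) pairs in the output above. The function name position_values is made up for illustration, and the EMPNO tie-breaker is my assumption of a sensible extra ordering column.

```python
# Illustrative re-implementation of RANK, DENSE_RANK and ROW_NUMBER.
# (empno, sal) pairs copied from the query output above.
emp = [
    (7369, 800), (7900, 950), (7876, 1100), (7521, 1250), (7654, 1250),
    (7934, 1300), (7844, 1500), (7499, 1600), (7782, 2450), (7698, 2850),
    (7566, 2975), (7788, 3000), (7902, 3000), (7839, 5000),
]


def position_values(rows):
    """Return (empno, sal, rank, dense_rank, row_number) tuples.

    Rows are ordered by sal, then empno. The empno tie-breaker plays the
    role of an extra ORDER BY column, which makes row_number repeatable.
    """
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    result = []
    rank = dense_rank = 0
    previous_sal = None
    for row_number, (empno, sal) in enumerate(ordered, start=1):
        if sal != previous_sal:
            rank = row_number   # jumps to the running row count: leaves gaps
            dense_rank += 1     # moves to the next integer: never leaves gaps
            previous_sal = sal
        result.append((empno, sal, rank, dense_rank, row_number))
    return result


positions = position_values(emp)
for row in positions:
    print(row)
```

Tied salaries keep the rank and dense rank of the first row in the tie, while the row number always moves on, which is exactly the pattern in the three columns above.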

Website revamp

In case anybody noticed, I’ve facelifted my website to use MediaWiki and migrated my Blogger blog to WordPress, as Blogger has stopped supporting FTP publishing.

I’ve used a skin for MediaWiki from Paul Gu – thanks Paul – and a skin for WordPress from Srini G – thanks Srini.

Running OBIEE Oracle By Example Tutorials against a database not called ORCL

I’ve been working with OBIEE for a while now, but I’ve not actually gone through the Oracle By Example tutorials, so I figured it would be a good idea to do that.

I started looking at the first Oracle By Example OBIEE tutorial yesterday and came across an issue with a simple solution to share with you.

The tutorials have a few caveats about what they expect from your environment, when you’re going to run through them – one of them being that you have access to a 10g database. What the prerequisites don’t specifically say is that unless your database is called ORCL, you’ll need to jump through a few more hoops – as I did, with my database called TEST.

I followed the instructions from the tutorial such that I had:

  1. An SH schema in the 10g database, with the standard tables and data Oracle supplies.
  2. An ODBC Data Source pointing to a database called TEST, checked to be functioning correctly.
  3. The presentation catalog restored and the configuration files updated accordingly.

I then proceeded to log in to the BI Dashboards, which brought up half the display, but it was full of TNS errors indicating that the connection could not be made between the BI Server and the database where the SH schema resides:

BI Server TNS Errors

After some investigation in the Administration tool, I discovered that the Connection Pool setting was using a Data Source Name called “ORCL” which doesn’t match my TNS/Database called TEST, hence it couldn’t make the connection to the database:

ORCL Connection Pool

Now, the RPD was read only at the time, so I first shut down the services so it could be opened read/write:

BI services down

…logged into the Administration tool as the Administrator user (password Administrator), opened the SH.RPD file read/write and modified the Data Source Name in the Connection Pool from ORCL to TEST, whilst ensuring the password for the SH user matched that of my TEST database:

Change ORCL to TEST

…next I restarted the services:

BI services up

…and then logged on again, to find it all now worked:

TEST all working

Congratulations to my mate Paul!

Just a quick note to say congratulations to Paul Till, a mate of mine, at my current client, who has recently passed his OCM certification. I knew Paul was good, from having worked with him, on a DR implementation / upgrade for a large DW, but I hadn’t realised how good. As certifications go, it’s the daddy and the Oracle one to have.

$deleted$ tablespace names bug

This one turned out to be an interesting bug the other day…

I did a simple select from DBA_TAB_PARTITIONS and noticed that some tablespace_names were of the form “_$deleted$n$m” where n and m are numbers. Slightly worrying, but at least the data was all present and correct when I checked. I knew the DBA team had been doing some reorganisations the previous weekend to recover some space, so I wondered if that was connected. It was, and after opening an SR, the DBA, Phil, found an explanation (from Oracle Note: 604648.1) and a resolution.

Reproducing the issue, and the way to fix it, is simple using this script…

DROP TABLESPACE old_tbs INCLUDING CONTENTS AND DATAFILES;

CREATE TABLESPACE new_tbs DATAFILE 'C:\APP\ORACLE\ORADATA\T111\NEW_TBS.DBF'
SIZE 100M
ONLINE;

CREATE TABLESPACE old_tbs DATAFILE 'C:\APP\ORACLE\ORADATA\T111\OLD_TBS.DBF'
SIZE 100M
ONLINE;

SELECT ts#,name FROM sys.ts$ WHERE name LIKE '%TBS';

CREATE TABLE jeff_test(col1 DATE NOT NULL
                      ,col2 NUMBER NOT NULL
                      ,col3 VARCHAR2(200) NOT NULL
                      )
TABLESPACE old_tbs
PARTITION BY RANGE(col1)
SUBPARTITION BY LIST(col2)
SUBPARTITION TEMPLATE
( SUBPARTITION "S1" VALUES(1)
 ,SUBPARTITION "S2" VALUES(2)
)
(PARTITION p1 VALUES LESS THAN(TO_DATE('31-DEC-2009','DD-MON-YYYY'))
,PARTITION p2 VALUES LESS THAN(TO_DATE('31-DEC-2010','DD-MON-YYYY'))
)
/

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

ALTER TABLE jeff_test MOVE SUBPARTITION p1_s1 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p1_s2 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p2_s1 TABLESPACE NEW_TBS;
ALTER TABLE jeff_test MOVE SUBPARTITION p2_s2 TABLESPACE NEW_TBS;

DROP TABLESPACE old_tbs INCLUDING CONTENTS AND DATAFILES;
ALTER TABLESPACE new_tbs RENAME TO old_tbs;

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

ALTER TABLE jeff_test MODIFY DEFAULT ATTRIBUTES FOR PARTITION p1 TABLESPACE old_tbs;
ALTER TABLE jeff_test MODIFY DEFAULT ATTRIBUTES FOR PARTITION p2 TABLESPACE old_tbs;

SELECT partition_name,tablespace_name FROM dba_tab_partitions WHERE table_name='JEFF_TEST';
SELECT subpartition_name,tablespace_name FROM dba_tab_subpartitions WHERE table_name='JEFF_TEST';

Which, when run in 11.1.0.6 on Windows 2003 Server 64 bit, gives:

DROP TABLESPACE old_tbs succeeded.
CREATE TABLESPACE succeeded.
CREATE TABLESPACE succeeded.
TS#                    NAME
---------------------- ------------------------------
9                      NEW_TBS
10                     OLD_TBS

2 rows selected

CREATE TABLE succeeded.
PARTITION_NAME                 TABLESPACE_NAME
------------------------------ ------------------------------
P1                             OLD_TBS
P2                             OLD_TBS

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME
------------------------------ ------------------------------
P1_S2                          OLD_TBS
P1_S1                          OLD_TBS
P2_S2                          OLD_TBS
P2_S1                          OLD_TBS

4 rows selected

ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
DROP TABLESPACE old_tbs succeeded.
ALTER TABLESPACE new_tbs succeeded.
PARTITION_NAME                 TABLESPACE_NAME
------------------------------ ------------------------------
P1                             _$deleted$10$0
P2                             _$deleted$10$0

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME
------------------------------ ------------------------------
P1_S2                          OLD_TBS
P1_S1                          OLD_TBS
P2_S2                          OLD_TBS
P2_S1                          OLD_TBS

4 rows selected

ALTER TABLE jeff_test succeeded.
ALTER TABLE jeff_test succeeded.
PARTITION_NAME                 TABLESPACE_NAME
------------------------------ ------------------------------
P1                             OLD_TBS
P2                             OLD_TBS

2 rows selected

SUBPARTITION_NAME              TABLESPACE_NAME
------------------------------ ------------------------------
P1_S2                          OLD_TBS
P1_S1                          OLD_TBS
P2_S2                          OLD_TBS
P2_S1                          OLD_TBS

4 rows selected

Notice that the n in “_$deleted$n$m” is 10, which is the ts# of OLD_TBS before the rename. The problem revolves around entries in TS$: when you rename a tablespace to a name that has previously been used and then dropped, the old entry for the dropped tablespace is not removed from TS$.

Related references:
Bug Numbers: 8291493, itself a duplicate of 5769963
Note: 604648.1

According to the SR and bug, it was noticed in 10.2.0.4 and is fixed in 10.2.0.5. We’ve reproduced it in 11.1.0.6 on various ports (results above) and updated our SR, so I guess the fix might also find its way into 11.1.0.7.

No pruning for MIN/MAX of partition key column

Recently, I wanted to work out the maximum value of a column on a partitioned table. The column I wanted the maximum value for happened to be the (single and only) partition key column. The table in question was range partitioned on this single key column, into monthly partitions for 2009, with data in all the partitions behind the current date, i.e. January through mid June were populated. There were no indexes on the table.

NOTE – I tried this on 10.2.0.4 (AIX) and 11.1.0 (Fedora 11) – the example below is from 11.1.0.

I’ll recreate the scenario here:

CREATE TABLESPACE tsp1
datafile '/u01/app/oracle/oradata/T111/tsp1.dbf' size 100M
autoextend off
extent management local uniform size 1m
segment space management auto
online
/
CREATE TABLESPACE tsp2
datafile '/u01/app/oracle/oradata/T111/tsp2.dbf' size 100M
autoextend off
extent management local uniform size 1m
segment space management auto
online
/

DROP TABLE test PURGE
/
CREATE TABLE test(col_date_part_key DATE            NOT NULL
                 ,col2              VARCHAR2(2000)  NOT NULL
                 )
PARTITION BY RANGE(col_date_part_key)
(PARTITION month_01 VALUES LESS THAN (TO_DATE('01-FEB-2009','DD-MON-YYYY')) TABLESPACE tsp1
,PARTITION month_02 VALUES LESS THAN (TO_DATE('01-MAR-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_03 VALUES LESS THAN (TO_DATE('01-APR-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_04 VALUES LESS THAN (TO_DATE('01-MAY-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_05 VALUES LESS THAN (TO_DATE('01-JUN-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_06 VALUES LESS THAN (TO_DATE('01-JUL-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_07 VALUES LESS THAN (TO_DATE('01-AUG-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_08 VALUES LESS THAN (TO_DATE('01-SEP-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_09 VALUES LESS THAN (TO_DATE('01-OCT-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_10 VALUES LESS THAN (TO_DATE('01-NOV-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_11 VALUES LESS THAN (TO_DATE('01-DEC-2009','DD-MON-YYYY')) TABLESPACE tsp2
,PARTITION month_12 VALUES LESS THAN (TO_DATE('01-JAN-2010','DD-MON-YYYY')) TABLESPACE tsp2
)
/
REM Insert rows, but only up to 14-JUN-2009
INSERT INTO test(col_date_part_key,col2)
SELECT TO_DATE('31-DEC-2008','DD-MON-YYYY') + l
,      LPAD('X',2000,'X')
FROM   (SELECT level l FROM dual CONNECT BY level < 166)
/
COMMIT
/
SELECT COUNT(*)
FROM   test
/
SELECT MIN(col_date_part_key) min_date
,      MAX(col_date_part_key) max_date
FROM   test
/

This runs and gives the following output:

DROP TABLE test PURGE
*
ERROR at line 1:
ORA-00942: table or view does not exist

DROP TABLESPACE tsp1 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace 'TSP1' does not exist

DROP TABLESPACE tsp2 INCLUDING CONTENTS
*
ERROR at line 1:
ORA-00959: tablespace 'TSP2' does not exist

Tablespace created.

Tablespace created.

Table created.

165 rows created.

Commit complete.

  COUNT(*)
----------
       165

MIN_DATE  MAX_DATE
--------- ---------
01-JAN-09 14-JUN-09

Now, let’s see what the plan looks like from AUTOTRACE when we run the following query to get the maximum value of COL_DATE_PART_KEY:

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
  2  FROM   test
  3  /

MIN_DATE
---------
14-JUN-09

Execution Plan
----------------------------------------------------------
Plan hash value: 784602781

---------------------------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |     9 |    99   (0)| 00:00:02 |       |       |
|   1 |  SORT AGGREGATE      |      |     1 |     9 |            |          |       |       |
|   2 |   PARTITION RANGE ALL|      |   132 |  1188 |    99   (0)| 00:00:02 |     1 |    12 |
|   3 |    TABLE ACCESS FULL | TEST |   132 |  1188 |    99   (0)| 00:00:02 |     1 |    12 |
---------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        320  consistent gets
         51  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> SET AUTOTRACE OFF

It shows a full scan of all twelve partitions. I figured that the plan for such a query would show a full table scan of all partitions for that table because, in theory, if all but the first partition were empty, then the whole table would have to be scanned to answer the query. Oracle wouldn’t know at plan creation time whether the data met this case, so it would have to plan for the full table scan to ensure the correct result.

What I thought might happen, though, is that in executing the query it would be able to short circuit things, by working through the partitions in order, from latest to earliest, and finding the first non-null value. Once it found the first non-null value, it would know not to continue looking in the earlier partitions, since the value of COL_DATE_PART_KEY there couldn’t possibly be greater than the non-null value already identified.
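The short circuit I had in mind can be sketched in a few lines of Python. This is purely an illustration of the idea, not anything Oracle does: the partitions are modelled as a hypothetical in-memory list of twelve monthly buckets, loaded with the same 165 dates as the test table, and the function names are made up for the example.

```python
from datetime import date, timedelta


def build_partitions():
    """Twelve monthly partitions for 2009, populated up to 14-JUN-2009."""
    partitions = {month: [] for month in range(1, 13)}
    for offset in range(1, 166):  # same 165 rows as the INSERT above
        day = date(2008, 12, 31) + timedelta(days=offset)
        partitions[day.month].append(day)
    # Ascending partition order, month_01 first.
    return [partitions[month] for month in range(1, 13)]


def max_with_short_circuit(partitions):
    """Return (max_key, partitions_scanned), scanning the latest partition first.

    Because the partitions are range partitioned on the key, the first
    non-empty partition found when working backwards must hold the maximum.
    """
    scanned = 0
    for rows in reversed(partitions):
        scanned += 1
        if rows:
            return max(rows), scanned
    return None, scanned


max_key, scanned = max_with_short_circuit(build_partitions())
print(max_key, scanned)
```

With this data, the six empty partitions for July to December are visited first, the June partition supplies the answer, and the five earlier partitions are never touched. MIN would work the same way in the opposite direction.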

It doesn’t appear to have this capability, which we can check by taking the tablespace holding the first partition offline and then rerunning the query, whereupon it complains that not all the data is present…

SQL> ALTER TABLESPACE tsp1 OFFLINE;

Tablespace altered.

SQL> SET AUTOTRACE ON
SQL> SELECT MAX(col_date_part_key) min_date
  2  FROM   test
  3  /
SELECT MAX(col_date_part_key) min_date
*
ERROR at line 1:
ORA-00376: file 6 cannot be read at this time
ORA-01110: data file 6: '/u01/app/oracle/oradata/T111/tsp1.dbf'

SQL> SET AUTOTRACE OFF

So, even though we know we could actually answer this question accurately, Oracle can’t do it as it wants to scan, unnecessarily, the whole table.

I did find a thread on OTN in which somebody had asked about this, but all the responses were about workarounds, rather than explaining why this happens (bug/feature) or how it can be made to work in the way I, or the poster of that thread, think it perhaps should.

Can anyone else shed any light on this? If it’s a feature, then it seems like something that could be easily coded more efficiently by Oracle. The same issue would affect both MIN and MAX, since both could be approached in the same manner.

Installing Oracle 11gR1 on a Fedora Core 10 64 bit VM

Just a note, to myself more than anything, about what extra packages are required by a 64 bit installation of Fedora Core 10, when trying to install Oracle 11gR1.

The installation I undertook was on a FC10 64 bit VM running under VMWare Server 2.0 running on top of FC10 64 bit OS.

Tim, as usual, has a lovely guide which told me almost everything I needed to know, however the guide says “If you are performing the 64-bit installation, make sure both the 32-bit and 64-bit libraries are installed.” rather than explicitly stating the packages for a 64 bit install. Until I tried to install Oracle 11gR1, I didn’t know what these were. The Oracle installer for 11g soon told me in the pre install checks it does, so I went about installing the following packages, in order:

glibc-2.9-3.i686.rpm
libaio-0.3.107-4.fc10.i386.rpm
libgcc-4.3.2-7.i386.rpm
glibc-devel-2.9-3.i386.rpm
compat-libstdc++-33-3.2.3-64.x86_64.rpm
compat-libstdc++-33-3.2.3-64.i386.rpm
libstdc++-4.3.2-7.i386.rpm

That got me past the pre install checks of the Oracle installer and on to a successful install.

I’ve added the list to the comments on the guide Tim produced as well.

Cursor keys not working in Virtual Server 2 VM

Posted as a reminder to myself about how to fix this issue…

I couldn’t get some of the cursor keys to work properly on my virtual machines running under VMWare Virtual Server 2 on Fedora 10 x86_64. Kept giving funny behaviour like bringing up the screen capture applet!

A bit of searching the net came up with this one, which although not referring to Virtual Server 2 specifically, seems to work all the same…

Essentially, adding the line below to the following file fixes the problem

File (create it, if not already present):

~/.vmware/config

Line:

xkeymap.nokeycodeMap = true
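If you have several machines to fix, a small helper can apply the setting. The sketch below is my own illustration, not part of the original fix: the function name ensure_setting is hypothetical, and it simply creates the file if it is missing and adds the line only when it is not already there.

```python
from pathlib import Path

# The setting quoted above.
SETTING = "xkeymap.nokeycodeMap = true"


def ensure_setting(config_path, line=SETTING):
    """Add `line` to the config file unless it is already present.

    Creates the file (and its parent directory) if needed.
    Returns True if the file was changed, False otherwise.
    """
    path = Path(config_path).expanduser()
    existing = path.read_text().splitlines() if path.exists() else []
    if line in (entry.strip() for entry in existing):
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(existing + [line]) + "\n")
    return True
```

Calling ensure_setting("~/.vmware/config") would apply it to the file named above; running it a second time leaves the file unchanged.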

My thanks to “The Monkey Jungle”!

Disable password ageing on Windows 2008 Server Standard

Password ageing in Windows 2008 Server Standard edition (the one I generally use) is set to automatically require passwords to be changed after 42 days…obviously a Douglas Adams fan was responsible for that bit of the codebase!

NOTE – I don’t have access to other versions of Windows Server (2003 or 2008) so I can’t speak for them, but I imagine it’s the same on them too.

That’s annoying for home use, where I have tons of VMs for research that I use periodically, so I asked my brother Steve how to stop this happening and he gave me some simple instructions…

First start the Local Security Policy editor by typing secpol.msc in the start/run box…

secpol.msc

and then select Account Policies / Password Policy on the navigation tree on the left. On the right hand side select Maximum Password Age and set this from the default of “42” to “0”. You’ll notice it now says “Password will not expire” above the value “0”.

Set Password Ageing off

Seems to have done the trick.

Kinda handy having a brother who’s an MCSE and a VCP too. Useful when I get stuck with OS or VM stuff for my Oracle research!

By the way, if anyone happens to be looking for some skilled contract resource in the Virtualisation field (VMWare, ESX Server etc…) then Steve has just become available…please feel free to contact me, or Steve, via his website

VM articles on my new Wiki

I wanted a place to store notes that I could write up from anywhere…but weren’t necessarily relevant to put in a blog, so I now have a Wiki on my website.

Don’t get excited, I’m not planning on hosting a full blown wiki for open editing – it’s just for me.

Amongst the things on there are some short “How to” articles relating to VMWare.

I’m sure I’ll have made mistakes along the way – feel free to point them out via this blog or email me and I’ll sort them out. Comments welcome as well.