bustle in your hedgerow: Linking 2 tables based on row positions

A simple (and common??) scenario - you have 2 tables of ordered integers and want to combine them into one recordset where the first row of the first table is paired with the first row of the second table and so on, like

Table1
val(int)
---
1
3
5
9

Table2
val(int)
---
4
9
14
17

we want to return

val val
--- ---
1   4
3   9
5   14
9   17

And we want to achieve this WITHOUT the use of cursors (obviously!).

If we could assign a row number to each table then we can join on the row number, row1 = row1, row2=row2, etc... now SQL 2005 gives us a couple of ways of doing this using either the RANK() function or the ROW_NUMBER() function.

if object_id('tempdb..#t1') is not null drop table #t1
if object_id('tempdb..#t2') is not null drop table #t2

create table #t1 (val int)
create table #t2 (val int)

insert into #t1 values (1)
insert into #t1 values (3)
insert into #t1 values (5)
insert into #t1 values (9)

insert into #t2 values (4)
insert into #t2 values (9)
insert into #t2 values (14)
insert into #t2 values (17)

using RANK()

select t3a.val, t3b.val
from
(select val, rank() over (order by val) as seq from #t1) t3a
join
(select val, rank() over (order by val) as seq from #t2) t3b
on t3a.seq=t3b.seq

or using ROW_NUMBER()

select t3a.val, t3b.val
from
(select val, row_number() over (order by val) as seq from #t1) t3a
join
(select val, row_number() over (order by val) as seq from #t2) t3b
on t3a.seq=t3b.seq

But in SQL2000 these functions do not exist. An elegant way to achieve this is to join the table to itself and determine the 'rank' by counting the number of elements that are less than or equal to the current i.e.

select t11.val, count(*) as seq
from #t1 t11, #t1 t12 where t11.val>=t12.val group by t11.val

returns

val seq
--- ---
1   1
3   2
5   3
9   4

hence we can use this as the ranking function

select t3a.val, t3b.val
from
(select t11.val, count(*) as seq
from #t1 t11, #t1 t12 where t11.val>=t12.val group by t11.val) t3a
join
(select t21.val, count(*) as seq
from #t2 t21, #t2 t22 where t21.val>=t22.val group by t21.val) t3b
on t3a.seq=t3b.seq

[Credit goes to Richard Dean for the original idea]

bustle in your hedgerow

Tuesday, 5 February 2008

Linking 2 tables based on row positions - SQL2000 vs SQL2005

No comments: