Monday, June 25, 2012

Example: CROSS APPLY and sa_split_list()

Question: How do I join a table with a column?

Not "How do I use a column to join a table with another table?", but "How do I join a table with the denormalized data in a column in the same table?"
Answer: An earlier example Pushing OPENSTRING and CROSS APPLY concentrated more on how OPENSTRING worked than on the wondrous benefits of CROSS APPLY.

Here's a simpler example, a table with a string column containing a comma-separated list of names:
CREATE TABLE t (
   pkey    INTEGER NOT NULL PRIMARY KEY,
   list    LONG VARCHAR NOT NULL );

INSERT t VALUES ( 1, 'Delmar, Ainslie, Genevieve, Marlon, Jordan' );
INSERT t VALUES ( 2, 'Hunter, Nissa, Briana, Electra, Khalil' );
INSERT t VALUES ( 3, 'Inari, Fabriane, Lisette, Calista' );
COMMIT;

The builtin sa_split_list() procedure can be used to turn a string like 'Delmar, Ainslie, Genevieve, Marlon, Jordan' into a multi-row table with one name per row.
SELECT * 
  FROM sa_split_list ( 'Delmar, Ainslie, Genevieve, Marlon, Jordan' );

line_num,row_value
1,Delmar
2, Ainslie
3, Genevieve
4, Marlon
5, Jordan

The question now becomes "How do I code sa_split_list ( t.list ) in the FROM clause?"

It can't be this easy!


The answer is "use CROSS APPLY":
SELECT *
  FROM t CROSS APPLY sa_split_list ( t.list );

pkey,list,line_num,row_value
1,Delmar, Ainslie, Genevieve, Marlon, Jordan,1,Delmar
1,Delmar, Ainslie, Genevieve, Marlon, Jordan,2, Ainslie
1,Delmar, Ainslie, Genevieve, Marlon, Jordan,3, Genevieve
1,Delmar, Ainslie, Genevieve, Marlon, Jordan,4, Marlon
1,Delmar, Ainslie, Genevieve, Marlon, Jordan,5, Jordan
2,Hunter, Nissa, Briana, Electra, Khalil,1,Hunter
2,Hunter, Nissa, Briana, Electra, Khalil,2, Nissa
2,Hunter, Nissa, Briana, Electra, Khalil,3, Briana
2,Hunter, Nissa, Briana, Electra, Khalil,4, Electra
2,Hunter, Nissa, Briana, Electra, Khalil,5, Khalil
3,Inari, Fabriane, Lisette, Calista,1,Inari
3,Inari, Fabriane, Lisette, Calista,2, Fabriane
3,Inari, Fabriane, Lisette, Calista,3, Lisette
3,Inari, Fabriane, Lisette, Calista,4, Calista

OK, that's pretty ugly,
  • let's SELECT only the columns we want

  • and ORDER the result set by the name values:
SELECT sa_split_list.row_value,
       t.pkey       
  FROM t CROSS APPLY sa_split_list ( t.list )
 ORDER BY sa_split_list.row_value, 
       t.pkey;

row_value,pkey
 Ainslie,1
 Briana,2
 Calista,3
 Electra,2
 Fabriane,3
 Genevieve,1
 Jordan,1
 Khalil,2
 Lisette,3
 Marlon,1
 Nissa,2
Delmar,1
Hunter,2
Inari,3

Aha! The names aren't in perfect order because the list isn't really "comma-separated", it's "comma-space-separated".

One solution (perhaps the correct solution in some cases) would be to code TRIM ( sa_split_list.row_value ) in the SELECT list.

Another solution is to change the default separator used by sa_split_list... and while we're at it, let's give row_value a more meaningful alias name, say, "name" :)
SELECT sa_split_list.row_value AS name,
       t.pkey       
  FROM t CROSS APPLY sa_split_list ( t.list, ', ' )
 ORDER BY sa_split_list.row_value, 
       t.pkey;

name,pkey
Ainslie,1
Briana,2
Calista,3
Delmar,1
Electra,2
Fabriane,3
Genevieve,1
Hunter,2
Inari,3
Jordan,1
Khalil,2
Lisette,3
Marlon,1
Nissa,2

In the real world columns don't often contain simple lists like this... what they DO contain is giant globs of denormalized crap stuff, like, say, long strings of <TD> and </TD> pairs... so don't forget, you've got CROSS APPLY in your tool kit.

Dilbert.com



No comments: