## Wednesday, February 29, 2012

### Example: Working With RAND()

Question: How do I generate random integers in the range 1 through 6?

Answer: Oh, you mean "How do I use SQL Anywhere to generate random numbers in the range 1 through 6?"

Well, you can call RAND()... which gives you a number between 0 and 1, but doesn't every actually give you a zero or a one, and take it from there.

### Gee, thanks for all the help...

OK, OK, let's take it a bit further... let's call EXPRTYPE() to see what RAND() returns:
```SELECT EXPRTYPE ( 'SELECT RAND()', 1 ) AS "data type";

data type
double
```

Great... RAND() returns a honking big floating point number.

Well, what do they look like? Let's use sa_rowgenerator() to call RAND() ten times:
```SELECT RAND() AS random FROM sa_rowgenerator ( 1, 10 );

random
0.7364153851458874
0.9333781469303082
0.28651545768907083
0.46529738021329853
0.25306924490866683
0.33479917996320835
0.969817641642791
0.7251030903892141
0.8076401715202444
0.00836274074779951
```

Wonderful! That's so far away from 1, 2, 3, 4, 5, 6 it's not even funny!

### Why does everything have to be so hard?

Sigh... let's get started... multiply by 6 and add 1, to change the interval from "0 to 1" to "1 to 7":
```SELECT ( RAND() * 6 ) + 1 AS random
FROM sa_rowgenerator ( 1, 10 );

random
4.5213047105452535
5.568270134072876
3.916143362836979
4.621499201106605
3.5370729987216523
5.585889514808492
6.045075386317947
1.5820178457452068
2.9739334396896573
2.899320864071753
```

Much better... now the results are in the range 1-point-something to 6-point-something, but don't forget: it's never exactly 1 or exactly 7.

Now let's use CAST to drag the values down to 1 through 6, where the values can sometimes be exactly 1 or exactly 6:
```SELECT CAST ( ( RAND() * 6 ) + 1 AS INTEGER ) AS random
FROM sa_rowgenerator ( 1, 10 );

random
3
1
4
3
3
6
2
3
4
6
```

MUCH better... now let's see if it generates the same number of each digit 1 through 6.

Let's generate a million numbers and count how many of each digit we get:
```SELECT @@VERSION,
random              AS digit,
COUNT(*)            AS actual_occurrences,
( 1000000 / 6.0 )   AS perfect_occurrences
FROM ( SELECT CAST ( ( RAND() * 6 ) + 1 AS INTEGER ) AS random
FROM sa_rowgenerator ( 1, 1000000 )
) AS generated
GROUP BY random
ORDER BY random;

@@VERSION,digit,actual_occurrences,perfect_occurrences
'12.0.1.3298',3,1000000,166666.67
```

Oops, that's not right.

Try it again and you get a million fours. Or a million ones.

But just a single digit, not an even distribution across all 6 digits.

What we want is 166,666 ones, 166,666 twos, and so on... like we (almost) get in SQL Anywhere Version 9:
```@@VERSION,digit,actual_occurrences,perfect_occurrences
'9.0.2.3951',1,166070,166666.67
'9.0.2.3951',2,166786,166666.67
'9.0.2.3951',3,166835,166666.67
'9.0.2.3951',4,166467,166666.67
'9.0.2.3951',5,166944,166666.67
'9.0.2.3951',6,166898,166666.67
```

### This Is Why We Test...

...to make sure our crap stuff actually works.

Somewhere, deep down inside the SQL Anywhere query engine, the promise that RAND() "is treated as a non-deterministic function" is being broken... perhaps it's because of all the GROUP BY and COUNT(*) stuff that was added to the SELECT.

Or maybe it's a feature... in this case, it doesn't matter, we just need it to work.

And here's how: Bite the bullet, save the million RAND() results in a simple temporary table, then run the query:
```SELECT RAND() AS random
INTO #generated
FROM sa_rowgenerator ( 1, 1000000 );

SELECT @@VERSION,
CAST ( ( random * 6 ) + 1 AS INTEGER )  AS digit,
COUNT(*)                                AS actual_occurrences,
( 1000000 / 6.0 )                       AS perfect_occurrences
FROM #generated
GROUP BY digit
ORDER BY digit;

@@VERSION,digit,actual_occurrences,perfect_occurrences
'12.0.1.3298',1,166612,166666.67
'12.0.1.3298',2,166700,166666.67
'12.0.1.3298',3,166727,166666.67
'12.0.1.3298',4,166496,166666.67
'12.0.1.3298',5,166801,166666.67
'12.0.1.3298',6,166664,166666.67
```

That works in all versions of SQL Anywhere from 9 through 12, and the result looks pretty good: for all intents and purposes the number of ones, twos, threes, ... are the same.

So there's the answer: Call RAND(), multiply it to expand the range, add to it to shift the range, and truncate it to get an integer.

Tip: Truncate the number, don't ROUND() it, that might generate an incorrect number of 1s and 6s. That's what happened with this expression: CAST ( ROUND ( ( RAND() * 5 ) + 1, 0 ) AS INTEGER )

### Are you sure that's random?

Scott Adams has the answer... 