Friday, October 22, 2010

Well, don't leave us hanging... is this a bug or not?

Back on October 9 I posted "What's going on here?" about the suspicious behavior of a simple SELECT with and without a WHERE clause.

The first line said "Can something this simple really be so wrong?" and the last line said "This is a bug in SQL Anywhere 11.0.1.2276, right?"

Anonymous said...
Well, don't leave us hanging... is this a bug or not?

October 14, 2010 10:53 AM
[redacted] said...
I say it's a bug. The documentation pretty clearly states that the ORDER BY clause will be respected (and is in fact more or less required) for TOP and START AT.

Is there an index on the "data" column?

October 14, 2010 8:55 PM
Well, [redacted] is not alone. I thought it was a bug, too, until someone pointed out to me...

That's how TOP works!

TOP is evaluated after the WHERE clause has done it's thing. Long after. I should know that, I used up ten pages in my book talking about the "Logical Execution of a SELECT". That section listed 15 separate steps, with the WHERE clause coming in at Step Number 3 and the TOP clause way down at Step Number 12.

Oh, and ORDER BY is back at Step Number 9.

In my defense, the actual SELECT I was working with was way more complex than the one described in "What's going on here?"... but I still wasted several hours over two days.

Still don't get it?

Don't feel bad, you haven't spent two days thinking about it like I did.

And not everyone can be Mark C. or V Barth or Phil, who all got it right, right away...
Mark C. said...
Breck: If there are five rows in the "top 10" of original query where data != 'B' then these five rows would not be included in the second query's result set and this would cause the output of the second query to be different.

October 9, 2010 11:21 AM
V Barth said...
Oh, a riddle?

I suggest there are 5 rows in ther pkey range 1-9 WHERE data = 'B' IS NOT TRUE. Therefore the rows 10-14 are still part of the 2nd select but get skipped by the START AT clause as they are now among the first 9 rows.

October 10, 2010 11:27 AM
Phil said...
What's the data for pkey 1 through 9? I'm not an expert, but if there are 5 rows with data != 'B', then I would expect this result. I would read this query as "give me rows 10 through 15 that meet these conditions" not "give me the rows that meet these conditions from rows 10 through 15."

October 11, 2010 11:45 AM
If you got fooled into thinking it was a bug in SQL Anywhere, it might be because rows 1 through 9 were left out of "What's going on here?". I did that on purpose so you could be misled just like I was.

Here's the full demo...

CREATE TABLE t (
pkey INTEGER NOT NULL DEFAULT AUTOINCREMENT PRIMARY KEY,
data VARCHAR ( 10 ) NOT NULL );

INSERT t VALUES ( 1, 'A' );
INSERT t VALUES ( 2, 'A' );
INSERT t VALUES ( 3, 'A' );
INSERT t VALUES ( 4, 'A' );
INSERT t VALUES ( 5, 'A' );
INSERT t VALUES ( 6, 'B' );
INSERT t VALUES ( 7, 'B' );
INSERT t VALUES ( 8, 'B' );
INSERT t VALUES ( 9, 'B' );
INSERT t VALUES ( 10, 'B' );
INSERT t VALUES ( 11, 'B' );
INSERT t VALUES ( 12, 'B' );
INSERT t VALUES ( 13, 'B' );
INSERT t VALUES ( 14, 'B' );
INSERT t VALUES ( 15, 'B' );
INSERT t VALUES ( 16, 'B' );
INSERT t VALUES ( 17, 'B' );
INSERT t VALUES ( 18, 'B' );
INSERT t VALUES ( 19, 'B' );
INSERT t VALUES ( 20, 'B' );
COMMIT;

SELECT TOP 5 START AT 10 *
FROM t
ORDER BY pkey;

pkey,data
10,'B'
11,'B'
12,'B'
13,'B'
14,'B'

SELECT TOP 5 START AT 10 *
FROM t
WHERE data = 'B'
ORDER BY pkey;

pkey,data
15,'B'
16,'B'
17,'B'
18,'B'
19,'B'

My new excuse? There were 22 million rows in the actual table. That's my story, and I'm sticking to it.

No comments: