Tuesday, March 31, 2009

Danger! The Queries are Stampeding!

Q: What determines the size of the swap file? I have one customer with a 13 GB database that has 2 GB swap file, and another customer who has a 45 GB swap file for a database that is only 6 GB.

A: A 45 GB temporary file is most probably the result of a runaway query; e.g. an ordered query returning bazillions of rows. I have seen such a query blow out a 400GB drive... it took a while, eventually it took the server down with it.

See these sections in the Help:

   SQL Anywhere Server - Database Administration
Configuring Your Database
Database options
Introduction to database options
Alphabetical list of options
...
max_temp_space option [database]
...
temp_space_limit_check option [database]
The TEMP_SPACE_LIMIT_CHECK option was introduced in SQL Anywhere 9.0.2 but the default was 'OFF' until version 10 was released. If you create your database using version 10 or 11, or even if you upgrade from 9 to 10 or 11, the value will be set to the correct value: 'ON'.



Q: What do you mean by "runaway query"? Can you give me a simple example?

A: A complex query with a missing predicate; e.g., a seven-way join with no WHERE clause to limit the result set. SQL Anywhere will happily go away and start building the 68,765,342,752,332,748,800-row result set, filling up the RAM cache and then dumping rows to the temporary file.

Yes, that's a real number, the number of rows in the result set from the following query on SQL Anywhere 11.0.1 database containing one single row in one single user table:
   SELECT *
FROM SYSDEPENDENCY,
SYSOBJECT,
SYSPROCPARM,
SYSTAB,
SYSTABCOL,
SYSVIEW,
RowGenerator
ORDER BY 1, 2, 3, 4, 5, 6, 7;
No, I didn't try running it to completion, but I did run this row count calculation:
   SELECT 1088.0 * 2692 * 1670 * 257 * 1818 * 118 * 255

68765342752332748800
I *did* let the runaway query go after doing this:
   SET OPTION PUBLIC.MAX_TEMP_SPACE = '1G';
...and got this:
   Temporary space limit exceeded
SQLCODE=-1000, ODBC 3 State="HY000"
Here's what the Foxhound monitor showed; reading upwards from the bottom, the Temp File was 336 KB before the query started, and less than a minute later (samples are taken every 10 seconds) it had grown to 874 MB. Then it started falling to 996 KB after the engine killed the query:



So, you can see that the Temp File can grow at a rate of 1G per minute, and that's on an aging laptop (3.4 GHz P4) with a feeble hard drive.

These kinds of queries often happen BY ACCIDENT when an application gives a query-building facility to the end user, and doesn't check for a missing WHERE clause when the user presses "Go!"... the user might fill in the ORDER BY but forget to specify any predicates.

The user gets tired of waiting, opens another window or goes to lunch, meanwhile the server goes WHAMMO! (crash tinkle) after a few minutes. I have even seen users open the application a second time, entered the CORRECT query, gotten an answer, and THEN gone to lunch... with the original query still running.

It's actually one of the original motivations behind adding the Monitor facility to Foxhound :)



For an earlier discussion of this topic, see Today's Client Question: Temp File Size.

No comments: