Tuesday, March 31, 2009

Danger! The Queries are Stampeding!

Q: What determines the size of the swap file? I have one customer with a 13 GB database that has 2 GB swap file, and another customer who has a 45 GB swap file for a database that is only 6 GB.

A: A 45 GB temporary file is most probably the result of a runaway query; e.g. an ordered query returning bazillions of rows. I have seen such a query blow out a 400GB drive... it took a while, eventually it took the server down with it.

See these sections in the Help:

   SQL Anywhere Server - Database Administration
Configuring Your Database
Database options
Introduction to database options
Alphabetical list of options
max_temp_space option [database]
temp_space_limit_check option [database]
The TEMP_SPACE_LIMIT_CHECK option was introduced in SQL Anywhere 9.0.2 but the default was 'OFF' until version 10 was released. If you create your database using version 10 or 11, or even if you upgrade from 9 to 10 or 11, the value will be set to the correct value: 'ON'.

Q: What do you mean by "runaway query"? Can you give me a simple example?

A: A complex query with a missing predicate; e.g., a seven-way join with no WHERE clause to limit the result set. SQL Anywhere will happily go away and start building the 68,765,342,752,332,748,800-row result set, filling up the RAM cache and then dumping rows to the temporary file.

Yes, that's a real number, the number of rows in the result set from the following query on SQL Anywhere 11.0.1 database containing one single row in one single user table:
ORDER BY 1, 2, 3, 4, 5, 6, 7;
No, I didn't try running it to completion, but I did run this row count calculation:
   SELECT 1088.0 * 2692 * 1670 * 257 * 1818 * 118 * 255

I *did* let the runaway query go after doing this:
...and got this:
   Temporary space limit exceeded
SQLCODE=-1000, ODBC 3 State="HY000"
Here's what the Foxhound monitor showed; reading upwards from the bottom, the Temp File was 336 KB before the query started, and less than a minute later (samples are taken every 10 seconds) it had grown to 874 MB. Then it started falling to 996 KB after the engine killed the query:

So, you can see that the Temp File can grow at a rate of 1G per minute, and that's on an aging laptop (3.4 GHz P4) with a feeble hard drive.

These kinds of queries often happen BY ACCIDENT when an application gives a query-building facility to the end user, and doesn't check for a missing WHERE clause when the user presses "Go!"... the user might fill in the ORDER BY but forget to specify any predicates.

The user gets tired of waiting, opens another window or goes to lunch, meanwhile the server goes WHAMMO! (crash tinkle) after a few minutes. I have even seen users open the application a second time, entered the CORRECT query, gotten an answer, and THEN gone to lunch... with the original query still running.

It's actually one of the original motivations behind adding the Monitor facility to Foxhound :)

For an earlier discussion of this topic, see Today's Client Question: Temp File Size.

Tuesday, March 24, 2009

ALTER TABLE and the Audit Trail

Possibly the most common schema change made to an existing table is to add a column. For example, here's an ALTER TABLE to add a column called display_order:

Here's the original table:
CREATE TABLE DBA.category ( -- 6 rows, 56k total = 8k table + 0 ext + 48k index
category /* PK */ VARCHAR ( 50 ) NOT NULL,
abbreviation /* U */ VARCHAR ( 10 ) NOT NULL,
display_category /* U */ VARCHAR ( 100 ) NOT NULL,
category ),
abbreviation ),
display_category )
But what if all data changes made to the category table are already being recorded in the logged_category table, as described in Revisited: Triggering an Audit Trail?

Old rows in logged_category won't look like the rows in category any more, they will be missing the new column.

The solution is surprisingly simple: after running the ALTER TABLE on category, but before running any more INSERT, UPDATE or DELETE operations on that table, add the column to the logged_category table:
ALTER TABLE logged_category ADD display_order BIGINT;
In keeping with the design of the log tables all constraints are omitted from the logged columns. In this situation that means all existing rows in logged_category will have NULL values in the new display_order column... probably a good choice since that column did not exist when those rows were inserted.

No other changes are required. In particular, no changes to the audit trail trigger are required because
  • INSERT WITH AUTO NAME automatically matches SELECT list items with the INSERT table column names by name, and

  • SELECT * automatically generates a column name list containing all the columns in the category table.
You may have heard people shout "Never Use Select Star!"... that may be good advice when you're coding a SQL query in a language like Java or VB, but it's unnecessarily limiting when you're coding application logic in SQL itself.

Here is the full schema modification script including UPDATE statements to fill in the new column. Exactly one line of code (the ALTER TABLE logged_category) has been added to deal with the fact that an audit trail is being maintained:
ALTER TABLE category        ADD display_order BIGINT NOT NULL DEFAULT 0;
ALTER TABLE logged_category ADD display_order BIGINT;

UPDATE category SET display_order = 60 WHERE category = 'BOOKS';
UPDATE category SET display_order = 40 WHERE category = 'COLLECTIBLES';
UPDATE category SET display_order = 30 WHERE category = 'GIFTWARE';
UPDATE category SET display_order = 50 WHERE category = 'LINENS & TEXTILES';
UPDATE category SET display_order = 20 WHERE category = 'POTTERY & GLASS';
UPDATE category SET display_order = 10 WHERE category = 'SILVERWARE';

SELECT * FROM logged_category ORDER BY log_id;
Here's the output of the SELECT showing the history of all changes to the category table; NULL values in display_order for old rows, non-NULL for the new rows:

For the record, here's what the tables look like after the change, plus the (unchanged) audit trail trigger:
CREATE TABLE DBA.category ( -- 6 rows, 56k total = 8k table + 0 ext + 48k index
category /* PK */ VARCHAR ( 50 ) NOT NULL,
abbreviation /* U */ VARCHAR ( 10 ) NOT NULL,
display_category /* U */ VARCHAR ( 100 ) NOT NULL,
display_order BIGINT NOT NULL DEFAULT 0,
category ),
abbreviation ),
display_category )

CREATE TABLE DBA.logged_category ( -- 18 rows, 24k total = 8k table + 0 ext + 16k index
log_id /* PK */ UNSIGNED BIGINT NOT NULL DEFAULT autoincrement,
logged_action VARCHAR ( 50 ) NOT NULL
CONSTRAINT ASA84 CHECK ( logged_action in(
'after INSERT','before UPDATE','after UPDATE','before DELETE' ) ),
logged_at TIMESTAMP NOT NULL DEFAULT timestamp,
category VARCHAR ( 50 ) NULL,
abbreviation VARCHAR ( 10 ) NULL,
display_category VARCHAR ( 100 ) NULL,
display_order BIGINT NULL,
log_id )
FYI, to create a new audit trail trigger for a different base table, all that's required is to edit all occurrences of 'category':
ALTER TRIGGER "log_category" 
NEW AS new_category
DECLARE @current_timestamp TIMESTAMP;



INSERT logged_category
SELECT 'after INSERT' AS logged_action,
FROM new_category;


SET @current_timestamp = CURRENT TIMESTAMP;

INSERT logged_category
SELECT 'before UPDATE' AS logged_action,
@current_timestamp AS logged_at,
FROM old_category;

INSERT logged_category
SELECT 'after UPDATE' AS logged_action,
@current_timestamp AS logged_at,
FROM new_category;


INSERT logged_category
SELECT 'before DELETE' AS logged_action,
FROM old_category;



Monday, March 23, 2009

Three New Tools for SQL Anywhere

1. SQL Anywhere Monitor

The SQL Anywhere Monitor (not to be confused with the MobiLink Monitor) is a web browser-based administration tool that provides you with information about the health and availability of SQL Anywhere databases and MobiLink servers. It ships with the SQL Anywhere 11.0.1 maintenance upgrade and is described here.

2. Database Workbench Pro

Database Workbench is an established development environment that has recently been extended to support SQL Anywhere versions 9, 10, 11. It is described here, a trial edition is available for download here and a 75% discount is available here until March 28.

3. QweryBuilder

QweryBuilder is a new product that includes support for SQL Anywhere as well. It is described here and a trial version is available here.

Saturday, March 21, 2009

Rotating Database Backups Revisited

When two people say they like an article, like they have for Rotating Database Backups, it's a banner day for this blog!

So, it behooves me to report a bug: If the database isn't running but the batch file keeps on getting launched, it will carry on rotating folders! The oldest backup folder is deleted, the other folders are renamed... and because the database isn't running the dbbackup utility doesn't write anything to the newest backup folder.

Eventually (if the database stays offline), you end up with no backups at all!

Rotate This, You %@&^@?!

The fix involves waiting until after the backup finishes successfully before renaming and deleting folders. The dbbackup utility writes the backup files to a folder with a temporary name, and nothing else is touched if dbbackup fails.

A couple of other changes have been made as well:
  • SQL Anywhere Version 11 is being used instead of Version 9, and

  • 10 backup generations are kept instead of just three, showing how easy it is to change the number of generations.
Why 10 generations? Because the code presented here came from a real client who wanted full backups every hour starting at 9AM and ending at 6PM. Nightly Norton Ghost runs take care of making multi-generational copies of the entire disk so the history of backups crosses day boundaries.
(... plus, every single insert, update and delete is logged at the table level (in addition to the transaction log level) using the technique described in Revisited: Triggering an Audit Trail... but that's drifting off topic.)
Much of the original article Rotating Database Backups still applies, especially where it talks about setting up a Windows scheduled task, and why I prefer batch files over CREATE EVENT for making backups.

Also, the first ECHO command has been fixed from this
ECHO ********************************************^
to this
ECHO ********************************************^
The original syntax caused the first ">" character to be ignored, which in turn caused the ECHO command to overwrite the file instead of appending.

Even the extended demo is the same, except you have to run it 11 times before you see a folder deleted. I am not going to repeat the demo here; except for being a lot longer, the diagnostic text file dbbackup_log.txt looks the same.

Here's the updated code for run_dbbackup_full.bat...
Note: The ^ character is the line continuation character for Windows command files.
REM -c ...  database connection string
REM -b ... maximum block size in pages
REM -d backup the database file only, not the transaction log.
REM -k ... checkpoint log option: copy, nocopy, recover, auto (default)
REM -l live backup
REM -n rename the backup log file to YYMMDDXX.LOG
REM -o ... where to put dbbackup console messages
REM -r rename the current log to YYMMDDXX.LOG and start a new one
REM Note: It is safe to delete the old YYMMDDXX.LOG files.
REM -s use an internal BACKUP statement to create the backup
REM Note: With -s the -o filespec is relative to the server,
REM no progress messages or prompts are displayed, and
REM the server must have permission to write the output.
REM -t backup the transaction log only, not the database file.
REM -x delete the current log and start a new one
REM -y create output folder and overwrite file(s) without confirmation
REM bkup\generation_temp temporary folder name for latest backup

MD bkup
ECHO ********************************************^
ECHO Full dbbackup started >>bkup\dbbackup_log.txt
DATE /T >>bkup\dbbackup_log.txt
TIME /T >>bkup\dbbackup_log.txt

REM Backup to empty subfolder
CD bkup
RD /S /Q generation_temp
MD generation_temp
CD ..
-c "ENG=ddd11;DBN=ddd11;UID=dba;PWD=sql"^
-o bkup\dbbackup_log.txt -x -y bkup\generation_temp

REM Backup OK
CD bkup
RD /S /Q generation1
RENAME generation2 generation1
RENAME generation3 generation2
RENAME generation4 generation3
RENAME generation5 generation4
RENAME generation6 generation5
RENAME generation7 generation6
RENAME generation8 generation7
RENAME generation9 generation8
RENAME generation10 generation9
RENAME generation_temp generation10
CD ..
DATE /T >>bkup\dbbackup_log.txt
TIME /T >>bkup\dbbackup_log.txt
DIR /S bkup\*.* >>bkup\dbbackup_log.txt
ECHO Full dbbackup OK >>bkup\dbbackup_log.txt
DATE /T >>bkup\dbbackup_log.txt
TIME /T >>bkup\dbbackup_log.txt

REM Backup failed
DATE /T >>bkup\dbbackup_log.txt
TIME /T >>bkup\dbbackup_log.txt
DIR /S bkup\*.* >>bkup\dbbackup_log.txt
ECHO Error: Full dbbackup failed >>bkup\dbbackup_log.txt


Here's what the dbbackup_log.txt file contains when a backup doesn't work because the database isn't running:
Full dbbackup started
05:07 AM
SQL Anywhere Backup Utility Version
No database file specified
05:07 AM
Error: Full dbbackup failed

Friday, March 20, 2009

Egosurfing, Google Autofill Style

I don't know when Google started offering the worldwide "search autofill" feature, and maybe I'm the last person on earth to notice it, but I still think it's great.

Here's how it works: On the main "logo" page, when you start typing a search string, a dropdown appears showing other people's searchs and the hit counts:

Type in another letter, and the whole dropdown changes:

The important point: Those are other people's searches, not just a history of my own.

Plus, it's fast, at least on a high-speed line. So fast, at first I thought the Google rich client software had already downloaded all the world's search strings to my hard drive, ahead of time.

Yeah, silly of me. But I had to pull the network cable to see for sure. Yup, it needs the network to download search strings to match brand-new searches like "kumquat"... but it does remember old sets when the browser is offline.

Strangely, it only seems to work on the logo page, not the Google results page. Yes, it works for the Firefox 3 toolbar search, but no, not for the IE7 toolbar search.

Regardless of all that, it's worth using... it really helps sometimes when you're trying to pick a search string.

What was that about "egosurfing"?

Egosurfing used to mean asking the simple question, "How many hits do I get when I google my own name?"

Now it means, "How many letters do I have to type, before my name shows up in the Google search autofill?"

Apparently you're not famous until the number drops to 4. For me, it's seven, not counting the space...

Tuesday, March 17, 2009

Which database am I connected to?

Different database management systems encourage different behaviors, yielding different answers to the question "How many databases do I have access to?"

With products like Oracle, Microsoft SQL Server, Sybase ASE and MySQL the answer tends to be "Just one"... there is a certain administrative difficulty involved in creating multiple databases and starting multiple database engines. For example, when a developer sits down to write and test some code, the development tools connect to one single test database, and often the same database is shared with other developers. It is fairly rare for a developer to create a local copy of the test (or production) database, and certainly not multiple copies running on multiple local engines.

SQL Anywhere encourages a different behavior by making it insanely easy to copy databases and start engines. The data is stored in operating system files, not raw "devices", that's one big difference from some other products. Copying a database is as simple as stopping the engine and doing a drag-and-drop on the *.db file.

Another difference is that the physical database files are binary compatible across otherwise totally incompatible operating systems. For example, you can create a database on Vista, copy it to Red Hat, and then to Solaris... or Windows Mobile. Along the way, the machines can change between little endian and big endian. Endian differences are bigger than driving on the right or left, they're like narrow gauge railways versus wide gauge, leaded versus unleaded, completely incompatible... but not for SQL Anywhere files.

It is not unusual for SQL Anywhere's answer to the question "How many databases do I have access to?" to be "Dozens... hundreds... I'm starting to lose track."

OK, I'm exaggerating... please note the careful use of "tends to be" and "not unusual". But it's true, with SQL Anywhere you can switch back and forth between multiple databases, multiple versions many times a day. For example, when testing an idea, say for a blog post like this, I create and start a completely new SQL Anywhere database (one doubleclick, ten seconds) so that my SQL will run on a "clean slate"... no leftover tables from other work, and so on. I will save the script but delete the database when I'm done.

There's a dark side to this: it's sometimes just too easy to connect to the wrong database. Especially when the connections are indirect, through ODBC DSNs, program code, connection pools and so on. I sometimes fire up Foxhound just to find out which database I'm connected to; which server? what file? what's the IP address?

Here's how Foxhound answers those questions:

The server and database name lines tell you something you probably already know, but in this case something more; i.e., it's a High Availability setup, the arbiter is ok but one of mirror servers is down, and the Primary is server2:

     Server name: ENG=fares  (Primary is server2; Arbiter is connected; Secondary is disconnected) 
Database name: DBN=fares
Then there's the name of the machine where the SQL Anywhere server is running, plus its IP address and port:
    Machine name: TSUNAMI 
dbping result: Connected to server over TCPIP at address
Finally, here's the physical database file that's being used for this connection:
                  File  Free Space  Free Space 
DBSPACE Name Size In File On Drive Fragments File Specification
system 3.1M 12k 93.9G 2 C:\data\mirror\fares.db
Foxhound works for ODBC DSNs, but it's only telling you about its own connection to the database. If you're using the same DSN from the same computer, fine... but what about an application program? Which database is it connected to?

That's what this little procedure is for. Plop it in your database, add the following query to your code, and display the long or short string in your program's "About box" (or even the window title bar):
   SELECT concise, verbose FROM which_database();

It doesn't do exactly what Foxhound does (it doesn't work with SQL Anywhere 5.5, for example) but it probably shows you what you need to know about your current connection. Here's what that SELECT returns for a dbisql connection to the same database shown above:
   Server name fares, Database fares on TSUNAMI at using C:\data\mirror\fares.db

Server name: ENG=fares (Primary is server2; Arbiter is connected; Secondary is disconnected)
Database name: DBN=fares
Machine name: TSUNAMI
Connection: Network TCPIP to
Database File: C:\data\mirror\fares.db
Here's what it returns for a connection to a local database via SharedMemory:
   Server name ddd11, Database ddd11 on PAVILION2 at local SharedMemory using C:\projects\$SA_templates\ddd11.db

Server name: ENG=ddd11
Database name: DBN=ddd11
Machine name: PAVILION2
Connection: local SharedMemory
Database File: C:\projects\$SA_templates\ddd11.db
Here's a TCP/IP connection to the same local database:
   Server name ddd11, Database ddd11 on PAVILION2 at local using C:\projects\$SA_templates\ddd11.db

Server name: ENG=ddd11
Database name: DBN=ddd11
Machine name: PAVILION2
Connection: local TCPIP to
Database File: C:\projects\$SA_templates\ddd11.db
Here's the code (Caveat: If you're using an earlier version of SQL Anywhere, you'll have to change this code. For example, PROPERTY ( 'TcpIpAddresses' ) was added in Version 11, and that's one of the reasons Foxhound uses dbping instead):
verbose LONG VARCHAR )
'Server name ',
PROPERTY ( 'Name' ),
', Database ',
DB_PROPERTY ( 'Name' ),
' on ',
PROPERTY ( 'MachineName' ),
' at ',
THEN PROPERTY ( 'TcpIpAddresses' )
' using ',
DB_PROPERTY ( 'File' ) ) AS concise,
' Server name: ENG=',
PROPERTY ( 'Name' ),
IF PROPERTY ( 'Name' ) <> PROPERTY ( 'ServerName' )
' (Primary is ',
PROPERTY ( 'ServerName' ),
'; Arbiter is ',
DB_PROPERTY ( 'ArbiterState' ),
'; Secondary is ',
DB_PROPERTY ( 'PartnerState' ),
IF DB_PROPERTY ( 'PartnerState' ) = 'connected'
THEN STRING ( ', ', DB_PROPERTY ( 'MirrorState' ), ')' )
ELSE ')'
'\x0d\x0a Database name: DBN=',
DB_PROPERTY ( 'Name' ),
'\x0d\x0a Machine name: ',
PROPERTY ( 'MachineName' ),
'\x0d\x0a Connection: ',
THEN 'Network '
CONNECTION_PROPERTY ( 'CommNetworkLink' ),
THEN STRING ( ' to ', PROPERTY ( 'TcpIpAddresses' ) )
'\x0d\x0a Database File: ',
DB_PROPERTY ( 'File' ) ) AS verbose;
You can find more information about SQL Anywhere run-time properties here:
Server-wide PROPERTY() calls

Database-specific DB_PROPERTY() calls

Connection-specific CONNECTION_PROPERTY() calls

Download "Upgrade of 11.0.0 to 11.0.1"
What's new in version 11.0.1
New Editions in SQL Anywhere 11.0.1
SQL Anywhere 11.0.1 FAQs
SQL Anywhere 11.0.1 Editions

Saturday, March 14, 2009

Broken... or worse

It's been a long time since someone has so quickly and effectively handed me my ego on a platter... long past due, some might say!

So I want to thank Geoff Dalgas, a core member the StackOverflow team, for doing the honors...

Here's the background: I have been rethinking my earlier (possibly premature?) criticism of StackOverflow, and seriously considering "taking my business" to StackOverflow... my business in this case being that of answering SQL Anywhere and MobiLink technical questions on the public NNTP forums at forums.sybase.com.

I've been doing that for many years, and I've been an active member of Team Sybase since its inception in 1993. So... plenty of time to develop an ego of planetary size.

Anyway, StackOverflow is a truly wonderful programming Q & A website built around tags rather than subforums; i.e., individual posts are manually marked with tags like "sqlanywhere" but otherwise everything's stored in one big bucket.

I just want to work with posts tagged as "sqlanywhere". There aren't a lot of those right now, but there are tens of thousands of threads in the NNTP forums so if even a fraction of the SQL Anywhere and MobiLink developers came over to StackOverflow, that would change dramatically.

I figured out how to search StackOverflow within the context of a tag, but not stay there. Not even the StackOverflow search stays within the context of a tag; you have to keep typing [sqlanywhere] in the search box or it will suddenly broaden the search.

At least, as far as I can tell... after a few hours of searching and reading and testing, and after answering a couple of questions here and here, I decided to ask my very first question on StackOverflow:

How to I get StackOverflow to treat a tag as a subforum?

I want to work within the context of a particular tag; i.e., only see messages that have been tagged with "sqlanywhere", as if SO supported subforums.

I know that I can tailor an initial StackOverflow search with a particular tag using one of the following techniques, but in each case the tag is only applied to the initial search; i.e., on a subsequent search using the SO GUI the user must re-enter the tag [sqlanywhere] or else the search suddenly broadens:
<form id="search" action="http://stackoverflow.com/search" method="get" target="_blank">
<input name="q" class="textbox" onfocus="if (this.value=='search') this.value = ''" type="text" maxlength="80" size="28" value="[sqlanywhere]">

<form id="search" action="http://stackoverflow.com/questions/tagged/'sqlanywhere'" method="get" target="_blank">
<input name="q" class="textbox" onfocus="if (this.value=='search') this.value = ''" type="text" maxlength="80" size="28" value="">
In any event, I want the pseudo-subforum "tag localization" to extend beyond searching.

O frabjous day! Callooh! Callay!

Within two minutes, Geoff Dalgas sprang into action! ... on a Saturday morning no less!

Geoff marked my message as "closed" and tagged it as "belongs on uservoice", no other comment.

A bit of digging revealed that uservoice is the place where people post StackOverflow enhancement requests.


I had no intention of asking for an enhancement to StackOverflow, not after using it for only a few hours. That would be hugely presumptuous, coming from a n00b like me.

I just asked how to do something, and if the answer is "you can't" then that's OK too.
Well... I guess I did get an answer, in a rude and roundabout way.
Sure, I could argue against the "closed"... if that's even possible for a n00b with no "reputation"... or I could go ahead and post the question on the uservoice forum, but that's not important.

Here's what is...

Important Lesson Learned

If Geoff Dalgas was just some random dude, no big deal, the interweb is full of people like him. But he's not some random dude, he represents StackOverflow.

And through him, StackOverflow is saying "Breck Carter is not welcome! Go away!"

Quickly, effectively, thoroughly... here's the lesson I've learned: Don't tell friends to ask questions on StackOverflow.

The SQL Anywhere and MobiLink developer community is filled to the brim with serious professionals. Everyone who answers questions on the NNTP forums (Sybase tech support, iAnywhere Solutions engineers and outsiders like me) are unfailingly gentle and polite when dealing with even the simplest of questions. The developers are used to that level of service, and they would be shocked and horrified at the treatment waiting for them at StackOverflow.


Sure... Geoff might think so. But remember, he's not just a community member, not just a moderator, he's at the very heart of StackOverflow, and that's important.

Thank you, Geoff Dalgas, for saving me from spending any more time learning about StackOverflow, and from embarrassing myself with friends and clients.


I did have a second question for StackOverflow all lined up ready to go:
How do I create a Google Custom Search Engine that searches within a tag on StackOverflow?

The StackOverflow search is nice, but Google it ain't.

For example, a Google search on the word subforum will also find subforums and sub-forum, but apparently StackOverflow will not.
Google web search is great, we all know that. What isn't so great, at least for someone with my abilities, is making it work well with websites like StackOverflow.

Sure, it's easy to set up a Google Custom Search Engine to search all of StackOverflow, but how do I specify an url pattern that will limit the search to posts tagged with "sqlanywhere"? These patterns don't work:
I'm sure there is an answer, maybe someone will comment here ( not gonna ask on StackOverflow, ego still healing :)

Why StackOverflow?

Some time ago, in the search for true happiness, I lowered my expectations about developer support forums. I now only have two requirements: First, the forums have to be fast, and second, they have to be searchable by Google.

Once upon a time, the NNTP forums satisfied those requirements. NNTP client software like Forte Agent was blindingly fast, and with the immediate feed from the Sybase NNTP server to Google, the Google Groups search facility brought the power of Google search to bear.

No more. Forte Agent is still fast, but Google Groups search is broken... and that's the main reason I was revisiting StackOverflow.

Here's the story on Google Groups search, told by example...

Never mind that the following search only finds two matches; after all, 2003 was a long time ago, and the feed from forums.sybase.com to Google Groups has been inoperative from time to time.

No, the problem here is quite different: Google Groups is returning one hit from an entirely different group than the one requested:

The "wrong groups" problem occurs randomly. Sometimes Google Groups searches in the domain range sybase.* return hits in groups entirely outside that range.

The following Google Groups date range search doesn't find any posts in January 2009:

... even though quite a few exist:

... and even though other Google Groups searches can find January posts.

But those are minor issues compared to the problems with exact phrase searches. The following Google Groups searches were formed by plucking various phrases from actual NNTP posts made over several years. None of these searches find anything:
"download acknowledgement disabled" group:sybase.public.sqlanywhere.mobilink

"use named row parameters" group:sybase.public.sqlanywhere.mobilink

"unable to calculate the timestamp precision" group:sybase.public.sqlanywhere.mobilink
In each case, Google Groups does have matching messages and can display them via other searches, it just can't find them using simple exact phrase searches.

Here's an interesting case; this Google Groups search doesn't find anything: "run syncase.sql" group:sybase.public.sqlanywhere.mobilink

But this search does work, and does successfully find the post that contains the longer phrase "run syncase.sql": "syncase.sql" group:sybase.public.sqlanywhere.mobilink

And this search works too; go figure: "run syncase" group:sybase.public.sqlanywhere.mobilink

Yup, broken... or worse

Google Groups search is broken, which means Google Groups is broken... useless... untrustworthy.

And StackOverflow is run by people like this.

My search for happiness continues.

Monday, March 9, 2009

MobiLink Xtreme Edition

If you're using SQL Anywhere 11's version of the MobiLink synchronization server, you may have good reason to upgrade to 11.0.1.

That's because in addition to adding support for Microsoft SQL Server 2008 at the consolidated database side of things, they have added a SACKLOAD of logging and monitoring options.

First of all, there is the venerable "verbosity" option, mlsrv11 -v.

You can use mlsrv11 -v+ for maximum verbosity, or, wait for it, you can code everything:

mlsrv11 -vcefhimnopqrstu
Here's the full list of mlsrv11 -v codes:
+   Turn on all logging options that increase verbosity.
c Show scripts.
e Show system event scripts.
f Show first-read errors.
h Show the remote schema.
i Show uploaded column values.
m Show the duration of each synch phase.
n Show row-count summaries.
o Show SQL passthrough activity.
p Show progress offsets.
q Show downloaded column values.
r Show uploaded and downloaded column values.
s Show script names.
t Show the translated SQL scripts.
u Show undefined table scripts.
The dreaded -vr option (show all column values) has been supplemented by two new options that will help cut the size of the mlsrv11 -o text file: -vi just shows uploaded column values, and -vq just the downloaded values. In many situations the volume of uploaded data is much lower than downloaded data, and it is the uploaded data that's often more interesting for debugging, so -vi is going to be popular.

But it gets better: you can turn on mlsrv11 verbose logging for individual sync sessions started by specific remote databases. Here's an excerpt from this Help topic:
Log verbosity for targeted MobiLink users and remote IDs

The MobiLink server can be set to use different log verbosity for a targeted MobiLink user or remote ID. The MobiLink server checks the ml_property table every five minutes and looks for verbose settings for a MobiLink user or remote ID. If verbose settings exist, then it uses the new setting to log output messages for the given MobiLink user or remote ID. This enables you to see the details for a specific user or remote ID without the need for high verbosity settings that would negatively impact the server farm, and without requiring a restart of each server in the farm.

To set maximum verbosity for a targeted MobiLink user, for example ml_user1, log into the consolidated database and issue the following SQL command:
call ml_add_property( 'MLS', 'ml_user_log_verbosity', 'ml_user1', '-v+' )
To set maximum verbosity for a targeted remote ID, for example rid_1, log into the consolidated database and issue the following SQL command:
call ml_add_property( 'MLS', 'ml_remote_id_log_versity', 'rid_1', '-v+' )
The mlsrv11 -vm option (show the duration of each synch phase) is also new. Here's the short description from the read-me file (there's more detail in this Help topic):
===============(Release Build  - Engineering Case #532825)===============

A new logging option has been added to the MobiLink server. The -vm command
line option will cause the server to print to the log the duration of each
sync and the duration of each sync phase whenever a sync completes. The sync
phases are the same as those displayed in the MobiLink monitor. Each value
is prefixed with "PHASE: " to aid searching for and printing the values.

Sample output follows:
I. 2008-06-05 14:48:36. <1> PHASE: start_time: 2008-06-05 14:48:36.048
I. 2008-06-05 14:48:36. <1> PHASE: duration: 175
I. 2008-06-05 14:48:36. <1> PHASE: sync_request: 0
I. 2008-06-05 14:48:36. <1> PHASE: receive_upload: 19
I. 2008-06-05 14:48:36. <1> PHASE: get_db_worker: 0
I. 2008-06-05 14:48:36. <1> PHASE: connect: 18
I. 2008-06-05 14:48:36. <1> PHASE: authenticate_user: 51
I. 2008-06-05 14:48:36. <1> PHASE: begin_sync: 69
I. 2008-06-05 14:48:36. <1> PHASE: apply_upload: 0
I. 2008-06-05 14:48:36. <1> PHASE: prepare_for_download: 1
I. 2008-06-05 14:48:36. <1> PHASE: fetch_download: 4
I. 2008-06-05 14:48:36. <1> PHASE: wait_for_download_ack: 0
I. 2008-06-05 14:48:36. <1> PHASE: end_sync: 0
I. 2008-06-05 14:48:36. <1> PHASE: send_download: 10
I. 2008-06-05 14:48:36. <1> PHASE: get_db_worker_for_download_ack: 0
I. 2008-06-05 14:48:36. <1> PHASE: connect_for_download_ack: 0
I. 2008-06-05 14:48:36. <1> PHASE: nonblocking_download_ack: 0
But wait, there's more... the mlsrv11 -ppv 60 option turns on the "print (log) periodic monitor value" feature every 60 seconds. Here's what the read-me says (again, there's more detail in this Help topic):
===============(Release Build  - Engineering Case #536143)===============

A new MobiLink server command line option, -ppv <period>, has been added
which will cause the server to print new, periodic monitoring values every <period>
seconds. The suggested period is 60s.

----- Note: You code it as -ppv 60, not -ppv 60s -----

If the period is set too small, the
log will grow very quickly. These values provide insight into the state of
the server, and are useful for determining the health and performance of
the MobiLink server. For example, one could look at the DB_CONNECTIONS and
LONGEST_DB_WAIT values to look for potential problems with the -w option
or in the synchronization scripts. They also provide an easy way to track
system wide throughput measures such as the number of rows uploaded or downloaded
per second and the number of successful syncs per second. Each row of output
is prefixed with "PERIODIC:" to aid searching for and filtering out the values.

The following table contains a description of the printed values:
Value Description
TCP_CONNECTIONS Number of TCP connections currently opened
PAGES_USED Number of cache pages used
PAGES_LOCKED Number of cache pages loaded into memory
TCP_CONNECTIONS_OPENED Total number of connections ever opened
TCP_CONNECTIONS_CLOSED Total number of connection ever closed
TCP_CONNECTIONS_REJECTED Total number of connection ever rejected
TCP_BYTES_READ Total number of bytes ever read
TCP_BYTES_WRITTEN Total number of bytes ever written
ML_NUM_CONNECTED_CLIENTS Number of connected sync client
PAGES_SWAPPED_OUT Total number of pages ever swapped to disk
PAGES_SWAPPED_IN Total number of pages ever read from disk
PAGES_IN_STREAMSTACK Number of pages held by the network streams
CPU_USAGE Amount of CPU time used by MobiLink server in microseconds
NUM_COMMITS Total number of commits
NUM_ROLLBACKS Total number of rollbacks
NUM_SUCCESS_SYNCS Total number of successful syncs
NUM_FAILED_SYNCS Total number of failed syncs
NUM_ERRORS Total number of errors
NUM_WARNINGS Total number of warnings
DB_CONNECTIONS Number of database connections in use
RAW_TCP_STAGE_LEN Length of the network work queue
STREAM_STAGE_LEN Length of the high level network processing queue
HEARTBEAT_STAGE_LEN Length of the queue for periodic, non-sync work
CMD_PROCESSOR_STAGE_LEN Length of the queue for sync work
NUM_ROWS_DOWNLOADED Total number of rows sent to remotes
NUM_ROWS_UPLOADED Total number of rows received from remotes
FREE_DISK_SPACE Disk space available on the temp disk in bytes
LONGEST_DB_WAIT Longest length of time an active sync has been waiting
for the database
LONGEST_SYNC Age of the oldest sync in ms
NUM_UNSUBMITTED_ERROR_RPTS Number of unsubmitted error reports
MEMORY_USED Bytes of RAM in use. Windows only.
SERVER_IS_PRIMARY 1 if the server is primary; 0 otherwise
Here's a "MobiLink Xtreme" version of the command line:
"%SQLANY11%\bin32\mlsrv11.exe" -c "DSN=cons;UID=dba;PWD=sql" -fr -o mlsrv11_log_cons.txt -os 10M -ppv 60 -vcefhimnopqrstu -zu+ 

Download "Upgrade of 11.0.0 to 11.0.1"
What's new in version 11.0.1
New Editions in SQL Anywhere 11.0.1
SQL Anywhere 11.0.1 FAQs
SQL Anywhere 11.0.1 Editions

Friday, March 6, 2009

Sybase IQ Makes Top Ten TPC-H

Sybase IQ Version 15 just hit the overall Number Two spot in the 3,000 GB category of Top Ten TPC-H by Price/Performance.

And it hit NUMBER ONE in the non-clustered 3000 GB category! Woohoo!

But... did you know?

The SQL Anywhere 11 engine is embedded inside IQ, and it is the SQL Anywhere engine that manages connections, transactions, recovery, cursors, locks and everything that the application sees, in coordination with IQ where necessary. In all of these matters, SQL Anywhere acts as the primary manager and involves IQ as a secondary participant. Another way to look at it is that IQ acts as a tightly coupled and specialized store with it's own optimizer, data store, transaction manager and lock manager.

SQL Anywhere manages the database catalog for everything in the SQL Anywhere and IQ data stores, and it manages the database objects themselves except for IQ tables, indexes and dbspaces.

SQL Anywhere parses the SQL queries, controls execution of stored procedures and issues each query to the IQ engine... in fact, SQL Anywhere performs some query rewrite optimizations before passing the queries to the IQ engine for proper optimization (enumeration of plans) and execution.

Here's where you can read about the TPC-H benchmark:

Glenn Paulley's blog

TPC-H Results

Top Ten Non-Clustered TPC-H by Price/Performance ... scroll down to the "3,000 GB Results" section:

Disclaimer: The word "embedded" is used in the Marketecture sense, in that SQL Anywhere's role in Sybase IQ is hidden from the customer before purchase; the Sybase IQ product page and whitepaper make no mention of it. This hidden or "stealth" role is proudly played by SQL Anywhere in a large number of products, Sybase IQ being one of them. The folks at iAnywhere don't take credit for the TPC-H results, but they certainly deserve credit for making IQ a much better product.

Danger! Domains!

Q: Why is the seconds portion of a smalldatetime column dropped when I insert a value to the database? According to the SQL Anywhere documentation, smalldatetime values are actually timestamps. Is this a bug, or intentional?

A: It's not a bug in SQL Anywhere itself, especially since SQL Anywhere does not drop seconds. You may be seeing a client-side or application program or interface issue.

Here's the story: There is no such thing as the "smalldatetime" data type in SQL Anywhere, in spite of the fact the Help calls it a "data type". There is something called the smalldatetime domain, and it is there for compatibility with Transact SQL.

If you are not seeing seconds, then one of two things is probably happening: Your code is truncating the value before storing it, or your code is truncating the value when retrieving it. By "your code" I mean anything that is not inside the database engine; it could be the interface software, the development tools, the class library you're using, whatever.

Something is pretending that smalldatetime is actually small, and it's not SQL Anywhere :)

Here is a test using SQL Anywhere 11.0.1 and dbisql. It shows that smalldatetime timestamps are stored with seconds, and that by default they are returned with seconds to three decimal places. It also shows you can use TIMESTAMP_FORMAT to change that to six decimal places:


INSERT t VALUES ( 1, '2009-03-05 06:04:58.123456' );


1,'2009-03-05 06:04:58.123'

SET TEMPORARY OPTION TIMESTAMP_FORMAT = 'yyyy-mm-dd hh:nn:ss.ssssss';


1,'2009-03-05 06:04:58.123456'
The important point is that SQL Anywhere stores what it receives.

The confusion surrounding domains (I hate 'em, never use 'em) is exactly the reason Foxhound shows the true data types:
-- DBA.t (table_id 704) in ddd11 - Mar 5 2009 6:17:58AM - Print - Foxhound © 2007 RisingRoad

CREATE TABLE DBA.t ( -- empty
now TIMESTAMP NOT NULL, -- domain smalldatetime
pkey )
The smalldatetime domain comes buit-in to SQL Anywhere. You can create domains yourself, but when you do that, you are buying nothing good and a whole lot that is bad. Domains hide the true data types from the application programmer, leading to confusion and bugs because SQL works with the base data types, not domains.

Here's an example:
CHECK ( LENGTH ( @funky ) = 5 );

funky_data funky );

INSERT t VALUES ( 1, '123' );
The INSERT gets rejected with the following message: Constraint 'ASA77' violated: Invalid value for column 'funky_data' in table 't'.

The definition of a domain can't be altered in any meaningful way, and it can't be dropped and recreated after it's been used for any column even if all the tables are empty. That means any benefit of abstraction is absent from domains.

It's better to bite the bullet and explicitly define all the attributes for each and every column in your database. That brings one extra benefit: You can specify constraint names that make runtime errors more meaningful.

For example, you can eliminate the useless 'ASA77' in the message above, replacing it with instructions telling the end users what they should do, not just what they did wrong:
funky_data VARCHAR ( 5 ) NOT NULL DEFAULT '00000'
CONSTRAINT "Use exactly 5 characters" CHECK ( LENGTH ( @funky ) = 5 ));

INSERT t VALUES ( 1, '123' );
Now the INSERT gets rejected with this message: Constraint 'Use exactly 5 characters' violated: Invalid value for column 'funky_data' in table 't'

You can't specify the CONSTRAINT clause in a CREATE DOMAIN statement. You also can't use the new DROP DOMAIN IF EXISTS syntax. Both of those could be added to the syntax in the future, but IMO iAnywhere should not bother... domains are a bad idea in the first place.

I wish I could do DROP DOMAIN smalldatetime.

PS This is not an argument against having smalldatetime as a real data type. It's an argument against leading people to believe it exists when it doesn't.

Wednesday, March 4, 2009

Old News

Rhetorical question: Why does MySybase.com show a SYS-CON blog posting from February 26 as the only entry in its "Latest Blog Entries", when there are several of Sybase's own blogs with later content?

Sure, it says "Mon, 02 Mar 2009" but in reality that's a fake post, one of many copies of the original created every few hours.

Could the answer be in the games people play?

There are even Sybase blog postings later than the most recent SYS-CON fake, yet they don't show up on MySybase:


Oracle has CREATE OR REPLACE PROCEDURE, now so does SQL Anywhere 11.0.1.

...plus these:

I've never understood why Oracle doesn't have CREATE OR REPLACE TABLE, and for some inexplicable reason, neither does SQL Anywhere 11.0.1 (if it's some kinda AnsiNazi rule, just shoot me now).

What SQL 11.0.1 Anywhere does have (and Oracle 11g doesn't) is
...plus these:
SQL Anywhere 11.0.1 also has this:
Hmmmm... let me think... the d00dz in Waterloo don't do anything without a good reason... after all, their motto has always been
WATCOM does things the way they should be done!
Here is Scenario 1: You want make sure a table exists, but if it already exists you want to leave it alone:
Here's Scenario 2: You want to create a table from scratch, regardless of whether or not it already exists:

Now it makes sense; CREATE OR REPLACE TABLE would only do Scenario 2... no need to shoot me :)

Download "Upgrade of 11.0.0 to 11.0.1"
What's new in version 11.0.1
New Editions in SQL Anywhere 11.0.1
SQL Anywhere 11.0.1 FAQs
SQL Anywhere 11.0.1 Editions

Tuesday, March 3, 2009

SQL Anywhere 11.0.1

The upgrade of SQL Anywhere to 11.0.1 is being released today. It's got something for everyone...

First, the Marketecture news: three New Editions in SQL Anywhere 11.0.1:

The new Workgroup, Standard, and Advanced editions simplify the selection and purchase of SQL Anywhere for server environments while continuing to offer the same packaging for customers building embedded, desktop, mobile and remote solutions, and for OEMs who redistribute SQL Anywhere with their own applications. In particular,
  • Workgroup Edition continues to target existing customers using SQL Anywhere in server environments

  • Standard Edition addresses the growing number of customers needing guaranteed reliability and performance

  • Advanced Edition offers enterprises a departmental server that can be easily integrated into established IT environments
For more details:
SQL Anywhere 11.0.1 FAQs

SQL Anywhere 11.0.1 Editions
Lots of Tarchitecture news too: What's new in version 11.0.1

Top 10 Server Configuration Tips for a New SQL Anywhere 11 Installation

  1. Install software.

  2. Look busy.

  3. Look busy.

  4. Look busy.

  5. Look busy.

  6. Look busy.

  7. Look busy.

  8. Look busy.

  9. Look busy.

  10. Start server.