Monday, July 27, 2009

Principles of Code Comments

Everybody has opinions about comments in program code; here's mine:

  • Don't include the file name in comments. Development tools always tell you what file you're editing, and it should be easy, not hard, to rename files when copying, reorganizing and so forth.

  • From the moment in time the code in a new file is "finished" (from the ever-optimistic developer viewpoint), start recording modification history.

  • Put the latest modification history comments at the top of the file, move old history comments to the bottom. You want to preserve history, but you don't want ancient history to get in the way.

  • Keep the modification history simple. Don't write more than one line per modification history comment unless absolutely necessary. The future reader needs to know three things only: When the change was made, who made the change (initials), and what changed, where the "when" might include the build number as well as the date. It is perfectly OK to copy and paste a snippet of code unique to the change; future readers will take it from there to find the details.

  • Put a "date started at" comment at the very top, and keep it there. Knowing when a file was originally created is very often interesting to the reader.

  • Don't decorate comments. No big boxes, no government fill-in-the-forms.

  • Do decorate warnings. It's even OK to shout, as in /* ----- WHEN MAKING CHANGES HERE ALWAYS CHECK FILE XYX.SQL ----- */

  • Do include separator and title lines if the code's long and tedious... this assumes you agree one long flat module is easier to maintain than breaking it up into separate files simply to keep the line count down. However, if you would always rather maintain 100 ten-line files plus 100 calls, than a single thousand-line file, carry on!

  • Don't clutter the code with embedded modification history comments, unless absolutely necessary. The "absolutely necessary" part might apply if, say, you are dealing with life-critical code, or for some other reason you need to communicate the exact details of the changes.

  • Don't write comments that rot. Comments are not compiled, they don't have to be correct. Don't increase the reader's level of mistrust with details unlikely to survive the first maintenance cycle.

  • Don't assume the reader is a n00b. If the reader *is* a n00b, it's up to them to learn the programming language, not you to teach them.

  • Don't assume the reader knows all about the framework. Frameworks are not languages. Languages help, frameworks hide.

  • Do include pointers to the docs. Not to the language Help, but to solution references and tutorials; e.g., where the original code came from, where relavent standards are found, and so on... if those are important to the code itself. URLs are great, Google search strings as well... the latter may survive link rot.
Here are some excerpts from a module that's at the very heart of the Foxhound Database Monitor; rroad_monitor_sample is called every 10 seconds for each target database to gather and save measurements about the server, database and client connections:

-- 2007 02 04 BC 1988a: Date begun.
-- (see bottom of file for earlier modification history)
-- 2009 07 16 BC 3440a: Changed: SET this_rroad_group_2_property_pivot.previous_ApproximateCPUTime = previous_rroad_group_2_property_pivot.ApproximateCPUTime,
-- 2009 07 16 BC 3440a: Added: Column temp_engine_properties.CurrentCacheSize DEC(20),
-- 2009 07 16 BC 3440a: Added: Column temp_engine_properties.MaxCacheSize DEC(20),
-- 2009 07 21 BC 3458a: Changed: ' Procedure rroad_ ... _properties not used; build number ', @procedure_build_number,
-- 2009 07 21 BC 3458a: Changed: -- Also note: The ISNUMERIC call is used for the same reason: The proxy queries return duplicate garbled rows.
-- 2009 07 21 BC 3458a: Removed: Old commented-out code.
-- 2009 07 21 BC 3459a: Changed: SET @sample_elapsed_msec = rroad_f_datediff_msec ( @sample_started_at, @sample_finished_at );
-- 2009 07 21 BC 3459a: Changed: SET @canarian_query_elapsed_msec = rroad_f_datediff_msec ( @canarian_query_started_at, @canarian_query_finished_at );

PARAMETERS MONITOR_DEBUG_MESSAGES;

-------------------------------------------------------------------------------------------------------------------------------------------------
MESSAGE STRING ( '****************************** CREATE PROCEDURE rroad_monitor_sample' ) TO CLIENT;

BEGIN
DROP PROCEDURE rroad_monitor_sample;
EXCEPTION WHEN OTHERS THEN
END;

-----------------------------------------------------------------------------------
CREATE PROCEDURE rroad_monitor_sample (
IN @sampling_id UNSIGNED INTEGER )

BEGIN

...

------------------------------------------------------------
-- Try the heartbeat query.

IF @ok = 'Y'
AND COALESCE ( @proxy_owner, '' ) <> '' THEN

BEGIN -- handle connection problem

SET @diagnostic_location = '204.a7';

------------------------------------------------------------
-- Record the sample start time immediately before the first proxy call.

SET @sample_started_at = CURRENT TIMESTAMP;
SET @sample_finished_at = @sample_started_at;

------------------------------------------------------------
-- Perform the basic heartbeat or canarian query.
-- This is done to calculate the heartbeat time as well check the connection.
-- Note: The target database can be stopped and restarted, and the monitor should just keep trying to connect.
-- Note: When Foxhound itself is is stopped and restarted, reconnection to target databases *should* be automatic.

SET @sql = STRING ( 'SELECT dummy_col INTO @dummy_col FROM ', @proxy_owner, '.proxy_DUMMY' );

SET @canarian_query_started_at = CURRENT TIMESTAMP;
SET @canarian_query_finished_at = @canarian_query_started_at;

EXECUTE IMMEDIATE @sql;

ROLLBACK; -- release locks, recover from possible failed proxy connection

...

END; -- rroad_monitor_sample

-- 2007 02 04 BC 1988a: Added: Column rroad_conn_property_name.sample _interval UNSIGNED INTEGER NOT NULL, -- 10 ultra, 10 fast, 60 slow
-- 2007 02 04 BC 1988a: Added: Column rroad_db_property_name.sample _interval UNSIGNED INTEGER NOT NULL, -- 2 ultra, 10 fast, 60 slow
-- 2007 02 04 BC 1988a: Added: Column rroad_eng_property_name.sample _interval UNSIGNED INTEGER NOT NULL, -- 2 ultra, 10 fast, 60 slow
-- 2007 02 04 BC 1988a: Added: Column rroad_sample_set.sample _interval UNSIGNED INTEGER NOT NULL DEFAULT 0,

...

-- 2009 07 11 BC 3430a: Added: Column temp_database_properties.MirrorMode LONG VARCHAR,
-- 2009 07 11 BC 3430a: Added: Column temp_database_properties.MirrorState LONG VARCHAR,
-- 2009 07 11 BC 3430a: Added: Column temp_database_properties.PartnerState LONG VARCHAR,
-- 2009 07 13 BC 3432a: Added: Column temp_database_properties.ReadOnly LONG VARCHAR,

Friday, July 24, 2009

The Stack Overflow Knowledge Exchange Platform

Here's all you really need to know about StackExchange:

"With StackExchange, you can run a site with all the same features that made Stack Overflow successful."
StackExchange is coming September 1. Pricing starts with a 45 day free trial, then $129 per month for 1,000,000 page views per month on a shared server, more $$ for more page views, and even more $$$$ if you want to host the server yourself... but the prices aren't stupidly high, not like, say, Oracle.

The StackExchange Beta is coming soon, you can sign up here... I have, maybe I'll set up a SQL Anywhere Overflow site!

Wednesday, July 22, 2009

Displaying Alerts

My previous post mentioned that the Foxhound Database Monitor displayed alerts and all clear messages "in multiple locations and formats, both separate from and together with other measurements in the main monitor display."

When I wrote that, it wasn't true... yet... but now it is.

Here's the Monitor tab of the Foxhound main menu, showing all the databases being monitored including their recent alert history. In this case, the ddd11 database has never had an alert issued, while the test10_local_on_tsunami database has had 17 different alert conditions at one time or another, out of a possible total of 28, and 14 are still in effect:



That takes care of the "separate" display of alerts and all clear messages mentioned above.

To see alerts in the context of other measurements, you can scroll the Monitor History page to find them... or just click on one of the links on the Monitor menu tab, in this case the 1m 58.5s ago: link on the ALL CLEAR for Alert #3.

That causes the following History page to open, scrolled to the very moment the ALL CLEAR was issued:



In this case there is some "on/off" oscillation of Alert #3, caused by the extremely aggressive alert criteria used (1 second for 1 sample). Aggressive criteria are fine for testing the alerts process itself, but aren't much use in real life.

But that's not the point here... the point is that takes care of the promise to display alerts "together with other measurements".

Monday, July 20, 2009

Comparing Database Monitors

When the next Beta of Foxhound is released (soon, soon), the SQL Anywhere Monitor will have some competition in the area of email alerts.

Here's how the two products compare as far as features are concerned:



Basic Features


SQL Anywhere MonitorFoxhound Database Monitor
Can monitor both SQL Anywhere and MobiLink servers.
Limited to monitoring SQL Anywhere target databases.
Limited to SQL Anywhere Version 11 target servers running Version 10 or 11 databases.
Can monitor Version 5, 6, 7, 8, 9, 10 and 11 databases and servers.
Multiple adjustable sampling intervals are supported, with a 10 second minimum. The defaults are 30 seconds, 5 minutes and 30 minutes for High, Medium and Low.
There's only one sampling interval, and it's fixed at 10 seconds.
Stores sample data in a dedicated Version 11 database.
Stores sample data in a dedicated Version 11 database.
Uses SQL Anywhere's built in HTTP server (port 4950) with an Adobe Flash browser interface.
Uses SQL Anywhere's built in HTTP server (port 80) with a standard HTML and JavaScript browser interface; e.g., select and copy to clipboard is supported.
Measurements shown via graphs that scroll to the right.
Measurements shown via text that scrolls up.
Separate tabs and graphs for separate measurements.
One combined display for all measurements.
- not supported -
Peak measurements are recorded, and measurements approaching the peaks are color highlighted.
- not supported -
Detailed measurements for each connection are displayed.
Basic information about each blocked connection is displayed: the connection numbers for the blocked and blocking connections.
Additional information about each blocked connection is displayed, including the SQL text for the blocked statement and for a SELECT statement you can run to find the locked row.
The GUI refresh interval defaults to 1 minute and cannot be set any faster than 30 seconds.
The GUI refresh interval is fixed at 10 seconds.
The GUI requires a login in addition to the user ids and passwords required to monitor to the target databases.
The Foxhound GUI itself requires no login... just the user ids and passwords for the target databases.
Detailed connection parameters (user id, password, host, server, database and/or port) must be provided for each target database.
ODBC DSNs may be chosen from the list of existing User and System DSNs. DSN-less connection strings may also be used.
Maintenance will be performed daily at [00] : [00]
The maintenance schedule is internally determined.
Scheduled backup via "Back up the SQL Anywhere Monitor data to the following directory: [ ]"
Backup is manual via desktop shortcut.
Yes/No: Take a daily average values older than [2 weeks] (or 1 week, 1 month, 6 months, 1 year)
- not supported -
Yes/No: Delete values older than [1 month] (or 1 week, 2 weeks, 6 months, 1 year)
Purge sample data: after 1 day, 1 week, 1 month, 1 year, never purge.
Yes/No: Delete old values when the total disk space becomes greater than (MB): [10240]
- not supported -
Yes/No: Allow anyone read-only access to the SQL Anywhere Monitor.
- not supported -



Alert Processing


SQL Anywhere MonitorFoxhound Database Monitor
Alerts are point-in-time events.
Alerts are conditions that go into and out of effect.
Detection of individual alert conditions can be enabled and disabled.
Detection of individual alert conditions can be enabled and disabled.
- not supported -
"All clear" messages are displayed and emails are sent when alerts are no longer in effect. Cancellation messages are also displayed when criteria change while alerts are in effect.
Alerts which are no longer in effect must be cleared manually via the "Mark Resolved" or "Delete" buttons.
Alerts are automatically cleared when the "all clear" and "cancelled" conditions are met.
Most alerts are issued as soon as the criteria are met.
A waiting period is part of the criteria for most alerts, with 10 samples being the most common default. The waiting period directly affects how long a condition must exist before an alert is issued, and indirectly affects how long the condition must remain resolved before an all clear is issued.
Alert messages are displayed on a separate tab.
Alerts and all clear messages are displayed in multiple locations and formats, both separate from and together with other measurements in the main monitor display.
Alerts will be repeatedly sent for a condition that persists, although the frequency can be controlled: Alerts for the same condition that occur with 5 minutes (the default) are suppressed. Other intervals may be chosen: 1 minute, 1 hour, 2 hours, 6 hours, 24 hours.
Alert messages are displayed and emails are sent ONLY when the alert goes into effect.
Can send alert emails via SMTP and MAPI.
SMTP only.
Sensible "factory setting" defaults are provided for all alert criteria.
Sensible "factory setting" defaults are provided for all alert criteria.
Changes to alert criteria must be made manually for each database.
Manually-entered alert criteria may be copied from one database to another via "Save Settings as Default" and "Restore Default Settings"
- not supported -
The original Foxhound alert criteria may be restored via "Restore Factory Settings".
- not supported -
The "Use Extreme Settings" button may be used to force the detection of more alerts, sooner, for testing purposes.
Yes/No: Suppress unsubmitted error report alerts from resources.
- not supported -



Alert Conditions


SQL Anywhere MonitorFoxhound Database Monitor
(a) Availability Alert - Database Down
Alert #1. Foxhound has been unable to gather samples for [1m] or longer.
- not supported -
Alert #2. The heartbeat time has been [1.0s] or longer for [10] or more recent samples.
- not supported -
Alert #3. The sample time has been [10.0s] or longer for [10] or more recent samples.
(b) Alert when CPU use reaches [90] % for two collection intervals in a row.
Alert #4. The CPU time has been [90]% or more for [10] or more recent samples.
(c) Alert when memory usage reaches [85] % of the maximum cache size.
Alert #19. The cache has reached [100] % of its maximum size for [10] or more recent samples.
- not supported -
Alert #20. The cache satisfaction (hits/reads) has fallen to [50] % or lower for [10] or more recent samples.
(d) Alert when free disk space per dbspace is less than [1024] MB on the disk.
Alert #5. The free disk space on the drive holding the main database file has fallen below [1GB].
Alert #6. The free disk space on the drive holding the temporary file has fallen below [1GB].
Alert #7. The free disk space on the drive holding the transaction log file has fallen below [1GB].
Alert #8. The free disk space on one or more drives holding other database files has fallen below [1GB].
- not supported -
Alert #13. There are [1000] or more fragments in the main database file.
(e) Alert when a connection has been blocked for longer than [10] seconds.
Alert #23. The number of blocked connections has reached [10] or more during [10] or more recent samples.
- not supported -
Alert #24. At least one single connection has blocked [5] or more other connections during [10] or more recent samples.
- not supported -
Alert #25. The number of locks has reached [1,000,000] or more during [10] or more recent samples.
(f) Alert when the number of connections in use reaches [85] % of the license limit.
Alert #26. The number of connections has reached [1000] or more for [10] or more recent samples.
(g) Alert when a query has run for longer than [10] seconds.
- not supported -
- not supported -
Alert #27. The approximate CPU time has reached [25] % of elapsed time or more for at least one connection during [10] or more recent samples.
- not supported -
Alert #28. The transaction running time has reached [1m] or more for at least one connection during [10] or more recent samples.
- not supported -
Alert #21. The total temporary file space used by all connections has been [1G] or larger for [10] or more recent samples.
- not supported -
Alert #22. At least one single connection has used [500M] or more of temporary file space during [10] or more recent samples.
(h) Alert when a connection attempt fails.
- not supported -
(i) Alert when the arbiter or partner server is disconnected.
Alert #9. The high availability target database has become disconnected from the arbiter server.
Alert #10. The high availability target database has become disconnected from the partner database.
- not supported -
Alert #11. The high availability target database server has switched over to [server2].
- not supported -
Alert #12. The high availability target database has changed from [read only] to [updatable].
(j) Alert when the number of unscheduled requests reaches [5]
Alert #14. The number of requests waiting to be processed has reached [5] or more for [10] or more recent samples.
- not supported -
Alert #15. The current number of incomplete file I/O operations has reached [10] or more for [10] or more recent samples.
- not supported -
Alert #16. There have been [1000] or more disk and log I/O operations per second for [10] or more recent samples.
- not supported -
Alert #17. The Checkpoint Urgency has been [100] % or more for [10] or more recent samples.
- not supported -
Alert #18. The Recovery Urgency has been [1000] % or more for [10] or more recent samples.

Friday, July 17, 2009

Wall-E, meet RIVA



RIVA the robot may not match Wall-E's sparkling personality, or even IvanAnywhere's, but you can't fault her mission: To reduce medication errors at the Children's Hospital of Orange County in Southern California.

Plus, like IvanAnywhere, RIVA runs on SQL Anywhere:

"To make the code lean, RIVA relies heavily on scripts stored inside a Sybase Inc. embedded database, SQL Anywhere."
- Thom Doherty, CTO at Intelligent Hospital Systems
Read more about RIVA here, and watch the video here.

Wednesday, July 15, 2009

Danger! NULLs!

It's legal to avoid taxes, but not to evade them. With NULLs, I don't care, I will avoid them, evade them, whatever it takes.

Like Al Gore does with global warming, whenever something bad happens I blame the NULLs.

And here's why: The code I posted in SELECT FROM Excel Spreadsheets was wrong! Because of NULLs!

Well, because of a stupid mistake *I* made involving NULLs. And it isn't really the NULLs' fault. Not really.

Here's the query that was wrong; can you see why?

SELECT proxy_browsers.browser          AS brand_name,
SUM ( proxy_browsers.hits ) AS hit_count,
hit_count / total_hits * 100.0 AS percent
FROM proxy_browsers
CROSS JOIN ( SELECT SUM ( hits ) AS total_hits
FROM proxy_browsers )
AS summary
WHERE browser IS NOT NULL
GROUP BY proxy_browsers.browser,
summary.total_hits
ORDER BY hit_count DESC;

And this is the most embarrassing part: I knew there was someting wrong, because that query gave a different answer from this one which used an OLAP WINDOW instead of a CROSS JOIN:
SELECT DISTINCT 
FIRST_VALUE ( browser ) OVER ( brand_window ) AS brand_name,
SUM ( hits ) OVER ( brand_window ) AS hit_count,
CAST ( hit_count
/ SUM ( hits ) OVER ( everything )
* 100.0 AS DECIMAL ( 11, 4 ) ) AS percent
FROM proxy_browsers
WHERE browser IS NOT NULL
WINDOW brand_window AS ( PARTITION BY browser ),
everything AS ()
ORDER BY hit_count DESC;

But I thought it was the OLAP query giving the wrong answer, not the CROSS JOIN! I didn't post the OLAP query because I thought there was a bug in SQL Anywhere 11!

Arrgh!

What's the answer? Last chance, the contest is closed...

Scroll down to see the answer...

down...

down...

down...

down...

down...

down...

down...

down...

down...

down...

down...

down...

down...

There are two SELECTs in the CROSS JOIN query but only one of them has the necessary WHERE browser IS NOT NULL. Here's the corrected CROSS JOIN query, which now returns the same result set as the (always correct) OLAP version:
SELECT proxy_browsers.browser          AS brand_name,
SUM ( proxy_browsers.hits ) AS hit_count,
hit_count / total_hits * 100.0 AS percent
FROM proxy_browsers
CROSS JOIN ( SELECT SUM ( hits ) AS total_hits
FROM proxy_browsers
WHERE browser IS NOT NULL )
AS summary
WHERE browser IS NOT NULL
GROUP BY proxy_browsers.browser,
summary.total_hits
ORDER BY hit_count DESC;

Jonathan's the clear winner in the Find The Mistake Contest; not only did he provide right answer, but he proposed an alternative solution:

Clearly you're anticipating the possibility that the "browser" column may be null, based on the "browser is not null" clause in the outer query. But your inner query that generates the total hits doesn't include this clause, which means that you're measuring percentages against the total number of hits, not the total of hits from non-null browsers.

This may or may not be considered a bug, depending on what you're trying to present. But if you have hits from null browsers, then your percentages will not sum to 100, and at a minimum this will look odd.

Fix is obviously to add "where browser is not null" to the subquery. A perhaps better fix would be to handle the null values, something like this:
SELECT coalesce(proxy_browsers.browser, 'Unknown') AS brand_name,
SUM ( proxy_browsers.hits ) AS hit_count,
hit_count / total_hits * 100.0 AS percent
FROM proxy_browsers
CROSS JOIN ( SELECT SUM ( hits ) AS total_hits
FROM proxy_browsers )
AS summary
GROUP BY brand_name,
summary.total_hits
ORDER BY hit_count DESC;

Monday, July 13, 2009

Techwave Agenda

This year's Techwave Symposium in Washington DC may be a shadow of former Techwaves, but it's not all bad.

First, there's the price, $200 for a day and a half of technical content, and then there's the content itself: The agenda for the SQL Anywhere and MobiLink sessions has just been published and it looks solid.

Oh, and registration is now open.

Friday, July 10, 2009

Search this blog, plus Glenn Paulley's

      ...search this blog, plus Glenn Paulley's

I've added Glenn Paulley's blog to the Google Custom Search Engine gadget at the top right of this page, and here's why:
  • It's the Number One most popular SQL Anywhere blog, plus

  • Glenn has recently increased the number of code-related posts.
November 9, 1995
Yes, actual code! Which makes searching both blogs at the same time a good idea when you're looking for SQL Anywhere technical topics.

Wednesday, July 8, 2009

Is It Safe?

That's a simple question made memorable by the 1976 movie Marathon Man:

Christian Szell: Is it safe?... Is it safe?
Babe Levy: You're talking to me?
Christian Szell: Is it safe?
Babe Levy: Is what safe?
Christian Szell: Is it safe?
Babe Levy: I don't know what you mean. I can't tell you something's safe or not, unless I know specifically what you're talking about.
Christian Szell: Is it safe?
Babe Levy: Tell me what the "it" refers to.
Christian Szell: Is it safe?
Babe Levy: Yes, it's safe, it's very safe, it's so safe you wouldn't believe it.
Christian Szell: Is it safe?
Babe Levy: No. It's not safe, it's... very dangerous, be careful.
In my case, the question was this:
Is it safe to call GET_IDENTITY ( 'table-name', 0 )?
The SQL Anywhere function call GET_IDENTITY ( 't', 1 ) pre-allocates the next value that would normally be assigned by an INSERT statement to the DEFAULT AUTOINCREMENT column in t, and returns that value to the caller. This is very useful if you need to know what a new AUTOINCREMENT primary key is going to be, before you INSERT the row.

You can pass GET_IDENTITY other numbers, like 2, 3, ..., to have it pre-allocate multiple values and return you the first value.

The Help doesn't talk about passing it zero, but that's what I wanted it to do: Just tell me what the next value is going to be, but don't pre-allocate it... let the next INSERT use it.

Actually, what I really wanted was the last value assigned, which I could get by subtracting:

GET_IDENTITY ( 'table-name', 0 ) - 1

In other words, give me the current AUTOINCREMENT value, the one that was last assigned to some particular table. This is different from @@IDENTITY in two ways:
  • @@IDENTITY returns the last AUTOINCREMENT value assigned to any table, whereas GET_IDENTITY() lets you specify which table.

  • @@IDENTITY only returns values assigned by the current connection, whereas GET_IDENTITY() doesn't care what connection made the assignment. In other words, @@IDENTITY remembers the last value assigned by the current connection, whereas GET_IDENTITY() will return values assigned by other connections.
That last point is one you should consider carefully. If it's important to you, you may be better off calling GET_IDENTITY ( 't', 1 ) before doing the INSERT, because the value that is pre-allocated by GET_IDENTITY ( 't', 1 ) is protected from work done by other connections, and you are safe to specify it in the INSERT.

However, you may be looking for a faster alternative to SELECT MAX ( t.c ), which may be slow because:
  • there's no index on the DEFAULT AUTOINCREMENT column, or

  • there's an index but it's not useful because the DEFAULT AUTOINCREMENT column is not the first column.
It turns out that yes, it is safe to call GET_IDENTITY with zero in the second argument. Here's some code that shows how GET_IDENTITY ( 'table-name', 0 ) - 1 returns the same value as SELECT MAX ( t.c ):

CREATE TABLE t ( c INTEGER DEFAULT AUTOINCREMENT );

SELECT GET_IDENTITY ( 't', 0 ) - 1,
MAX ( t.c )
FROM t;

INSERT t VALUES ( DEFAULT );

SELECT GET_IDENTITY ( 't', 0 ) - 1,
MAX ( t.c )
FROM t;

INSERT t VALUES ( DEFAULT );

SELECT GET_IDENTITY ( 't', 0 ) - 1,
MAX ( t.c )
FROM t;

Here are the three result sets; note that the first SELECT returns NULLs because there are no rows in t yet:

GET_IDENTITY('t',0)-1,MAX(t.c)
(NULL),(NULL)

GET_IDENTITY('t',0)-1,MAX(t.c)
1,1

GET_IDENTITY('t',0)-1,MAX(t.c)
2,2

Monday, July 6, 2009

Foxhound Sends Email Alerts

Fans of Foxhound will appreciate that the next beta of the database monitor will be getting "Email Alerts".

That's where you tell Foxhound what worries you the most about your SQL Anywhere server, and Foxhound tells you when your worst fears come true.

By email.

Like when your database goes offline.

Or the CPU goes above 90% for an hour.

Or when fifty users can't get any work done because their database connections are all blocked.

Here's an example of an alert email:



It's been surprisingly difficult to implement email alerts in a useful way; there's more to it than detecting anomalies and sending emails:

  • Letting the administrator specify criteria for each different alert condition.

  • Specifying how soon to actually issue an alert after the condition is first detected.

  • Deciding how soon to issue an "all clear" after the condition is no longer detected.

  • Deciding what to do when the administrator changes the criteria while an alert is in effect.

  • Displaying the alerts and all clears on the Database Monitor page, when email isn't enough.

  • Displaying an error message when Foxhound can't send an email.

  • Doing all this without making the administrator fill in endless forms.
Here's an example of an alert, followed by an all clear, displayed on the Foxhound Database Monitor page: