Two new patches have been posted recently on the Foxhound 4 page:
[ Download Patch 3 ] |
Fix "unable to load dll" messages: Could not load dynamic library Unable to load either rroad4.dll or rroad464.dll
[ Patch 3 Readme ]
|
[ Download Patch 4 ] |
Fix Display Schema for billion-row tables: Value nnn out of range for destination [ Patch 4 Readme ]
|
[ Download Patch 5 ] |
NEW Prevent Foxhound from becoming unresponsive when the Monitor cannot connect to a network target database and an endless loop of bogus error messages appears: Attempt to reload definition for event "rroad_monitor_sample_loop" failed due to concurrent event execution. [ Patch 5 Readme ]
|
[ Download Patch 8 ] |
Prevent Foxhound from becoming unresponsive when Ping-Only Sampling is stopped and restarted, or when full sampling is attempted for an arbiter server. [ Patch 8 Readme ]
|
[ Download Patch 10 ] |
NEW Prevent Foxhound from becoming unresponsive on startup, especially when the Foxhound database is very large. [ Patch 10 Readme ]
|
|
The two new patches are more likely to help Foxhound users than the others:
Patch 5 will help folks who are using recent builds of SQL Anywhere 16 and 17 to run Foxhound. In particular, it implements a workaround for a
bug behavior change introduced
after SQL Anywhere 16.0.0.2193 and SQL Anywhere 17.0.0.1359.
Patch 10 will help anyone with a Foxhound database that has grown into the multi-gigabyte range by decreasing the time it takes for Foxhound to start up. For example, here are the test results for a 1.4G database:
Before applying Patch 10: Foxhound startup time: 10.8s
After: Foxhound startup time: 1.1s
|
That's a factor of 10,
but... you might not notice a difference of only 9 seconds.
However, you
will notice the effect with a larger Foxhound database. For example, here's what happened with a 35G database:
Before applying Patch 10: Foxhound startup time: 7m 17s ...plus 30 minutes more for the Foxhound Menu to appear!
After: Foxhound startup time: 4.6s
|
If you see this message every time you start Foxhound 4, you probably need to apply Patch 10:
Foxhound was not available yet, probably because the Foxhound engine had not finished starting up.
It may be ready now, so click here to try again.
If the situation persists contact RisingRoad.
|
If that message goes away after a long while, replaced by a blank page, and you still don't see the Foxhound menu appear, you
definitely need Patch 10...
especially if Foxhound's instance of dbsrv16.exe is using up this much CPU:
The "Foxhound startup time: 7m 17s" only tells part of the story in the example above... after Foxhound finished starting up, it took a further 30 minutes before the menu page appeared!
How Patch 10 Was Created...
Step 1: Find the Queries From Hell
The first step was to start SQL Anywhere's Procedure Profiler as soon as Foxhound started executing, by adding this statement to DatabaseStart event script:
CALL sa_server_option ( 'ProcedureProfiling', 'YES' );
|
Then, after the Foxhound menu finally appeared, this dbisql query showed which SQL statements took the longest to execute:
SELECT *
FROM sa_procedure_profile()
where millisecs >= 100
ORDER BY object_name, line_num;
|
Here's the row that shows line 102 in the rroad_monitor_list_html stored procedure took 473,628 milliseconds to run, which is 8 minutes:
object_type, object_name, owner_name, table_name, line_num, executions, millisecs, percentage, foreign_owner, foreign_table
'F', 'rroad_monitor_list_html', 'DBA', , 102, 1, 473628, 99.9972553051784, ,
|
Here's how to find line 102 that stored procedure:
Tip: You can't use Sybase Central to view the source code because the version it displays (SYSPROCEDURE.source) isn't the version that executes (SYSPROCEDURE.proc_defn), so you have use an Old School query.
UNLOAD SELECT proc_defn
FROM SYSPROCEDURE
WHERE proc_name = 'rroad_monitor_list_html'
TO 'C:\temp2\\temp_rroad_monitor_list_html.txt'
DELIMITED BY '' ESCAPES OFF HEXADECIMAL OFF QUOTES OFF;
|
Here's the Query From Hell at line 102:
select "rroad_alert_union"."sampling_id" as "sampling_id",
"COUNT"(distinct "rroad_alert_union"."alert_number") as "active_alert_count",
"LIST"(
distinct "STRING"(
'#',
"rroad_alert_union"."alert_number"),
', ' order by
"rroad_alert_union"."alert_number" asc) as "active_alert_number_list"
into local temporary table "active_alert_count"
from "rroad_alert_union"
where "rroad_alert_union"."record_type" = 'Alert'
and "rroad_alert_union"."alert_is_clear_or_cancelled" = 'N'
group by "rroad_alert_union"."sampling_id";
|
Step 2: Study the Query Plans From Hell
The next step was to copy and paste the Query From Hell into dbisql and click on Tools - Plan Viewer.
In this case, the "Main Query" didn't look too bad, just one table scan for 77K rows, but the subquery called "SubQ 5" was truly breathtaking... a 77K table scan repeated 77K times!
SubQ 5 was found lurking down inside a view, and there was no apparent way to make it run faster; no indexes helped, nor did a materialized view:
LEFT OUTER JOIN ( SELECT *,
'Y' AS alert_is_clear
FROM rroad_all_clear
) AS rroad_all_clear
ON rroad_all_clear.sampling_id = rroad_alert.sampling_id
AND rroad_all_clear.alert_number = rroad_alert.alert_number
AND rroad_all_clear.alert_in_effect_at = rroad_alert.alert_in_effect_at
|
The only solution was eliminate SubQ 5 altogether by adding a new derived (redundant) column to one of the tables.
Step 3: Fix the Queries From Hell
Patch 10 actually introduces two new derived columns, for two separate Queries From Hell; here's what the code looks like.
The first derived column...
ALTER TABLE rroad_alert
ADD alert_is_clear_or_cancelled VARCHAR ( 1 ) NOT NULL DEFAULT 'N';
|
allowed this slow query code...
CREATE VIEW rroad_alert_union AS
SELECT CAST ( 'Alert' AS VARCHAR ( 20 ) ) AS record_type,
...
COALESCE (
rroad_all_clear.alert_is_clear,
rroad_alert_cancelled.alert_is_cancelled,
'N' ) AS alert_is_clear_or_cancelled
FROM rroad_alert
INNER JOIN rroad_sampling_options
ON rroad_sampling_options.sampling_id = rroad_alert.sampling_id
LEFT OUTER JOIN ( SELECT *,
'Y' AS alert_is_clear
FROM rroad_all_clear
) AS rroad_all_clear
ON rroad_all_clear.sampling_id = rroad_alert.sampling_id
AND rroad_all_clear.alert_number = rroad_alert.alert_number
AND rroad_all_clear.alert_in_effect_at = rroad_alert.alert_in_effect_at
LEFT OUTER JOIN ( SELECT *,
'Y' AS alert_is_cancelled
FROM rroad_alert_cancelled
) AS rroad_alert_cancelled
ON rroad_alert_cancelled.sampling_id = rroad_alert.sampling_id
AND rroad_alert_cancelled.alert_number = rroad_alert.alert_number
AND rroad_alert_cancelled.alert_in_effect_at = rroad_alert.alert_in_effect_at
|
to be simplified (and speeded up) by eliminating both outer join subqueries (including SUBQ 5)...
CREATE VIEW rroad_alert_union AS
SELECT CAST ( 'Alert' AS VARCHAR ( 20 ) ) AS record_type,
...
rroad_alert.alert_is_clear_or_cancelled AS alert_is_clear_or_cancelled
FROM rroad_alert
INNER JOIN rroad_sampling_options
ON rroad_sampling_options.sampling_id = rroad_alert.sampling_id
|
Two new triggers were required to maintain the new column:
CREATE OR REPLACE TRIGGER tri_rroad_all_clear
AFTER INSERT ON rroad_all_clear
REFERENCING NEW AS new_rroad_all_clear
FOR EACH ROW
BEGIN
UPDATE rroad_alert
SET rroad_alert.alert_is_clear_or_cancelled = 'Y'
WHERE rroad_alert.sampling_id = new_rroad_all_clear.sampling_id
AND rroad_alert.alert_number = new_rroad_all_clear.alert_number
AND rroad_alert.alert_in_effect_at = new_rroad_all_clear.alert_in_effect_at;
END;
CREATE OR REPLACE TRIGGER tri_rroad_alert_cancelled
AFTER INSERT ON rroad_alert_cancelled
REFERENCING NEW AS new_rroad_alert_cancelled
FOR EACH ROW
BEGIN
UPDATE rroad_alert
SET rroad_alert.alert_is_clear_or_cancelled = 'Y'
WHERE rroad_alert.sampling_id = new_rroad_alert_cancelled.sampling_id
AND rroad_alert.alert_number = new_rroad_alert_cancelled.alert_number
AND rroad_alert.alert_in_effect_at = new_rroad_alert_cancelled.alert_in_effect_at;
END;
|
Patch 10 contains another derived column...
ALTER TABLE rroad_sampling_options
ADD latest_sample_set_number UNSIGNED BIGINT NOT NULL DEFAULT 0;
|
which allowed this slow query code...
LEFT OUTER JOIN ( SELECT rroad_group_1_property_pivot.*
FROM rroad_group_1_property_pivot
INNER JOIN ( SELECT rroad_group_1_property_pivot.sampling_id AS sampling_id,
MAX ( rroad_group_1_property_pivot.sample_set_number ) AS sample_set_number
FROM rroad_group_1_property_pivot
GROUP BY rroad_group_1_property_pivot.sampling_id
) AS latest_primary_key
ON latest_primary_key.sampling_id = rroad_group_1_property_pivot.sampling_id
AND latest_primary_key.sample_set_number = rroad_group_1_property_pivot.sample_set_number
) AS rroad_group_1_property_pivot
ON rroad_group_1_property_pivot.sampling_id = rroad_sampling_options.sampling_id
|
to be simplified (and speeded up) by eliminating the complex inner join subquery:
LEFT OUTER JOIN rroad_group_1_property_pivot
ON rroad_group_1_property_pivot.sampling_id = rroad_sampling_options.sampling_id
AND rroad_group_1_property_pivot.sample_set_number = rroad_sampling_options.latest_sample_set_number
|
Two existing triggers had to be modified to maintain the second derived column:
ALTER TRIGGER tri_rroad_sample_set
AFTER INSERT ON rroad_sample_set
REFERENCING NEW AS new_rroad_sample_set
FOR EACH ROW
BEGIN
...
UPDATE rroad_sampling_options
SET rroad_sampling_options.latest_sample_set_number = new_rroad_sample_set.sample_set_number
WHERE rroad_sampling_options.sampling_id = new_rroad_sample_set.sampling_id
AND rroad_sampling_options.latest_sample_set_number < new_rroad_sample_set.sample_set_number;
...
END; -- tri_rroad_sample_set
ALTER TRIGGER tru_rroad_sample_set
BEFORE UPDATE OF sample_finished_at ON rroad_sample_set
REFERENCING OLD AS old_rroad_sample_set NEW AS new_rroad_sample_set
FOR EACH ROW
BEGIN
...
IF new_rroad_sample_set.sample_set_number <> old_rroad_sample_set.sample_set_number THEN
UPDATE rroad_sampling_options
SET rroad_sampling_options.latest_sample_set_number = new_rroad_sample_set.sample_set_number
WHERE rroad_sampling_options.sampling_id = new_rroad_sample_set.sampling_id
AND rroad_sampling_options.latest_sample_set_number < new_rroad_sample_set.sample_set_number;
END IF;
...
END; -- tru_rroad_sample_set
|