Previously on . . . The story began with Documenting DATEDIFF and continued with two episodes about the use and abuse of DATEDIFF in this blog and in Foxhound.
Now the story turns to DATEDIFF examples published in the SQL Anywhere 16 Help.
The Scoring System
- FAIL means the DATEDIFF usage is badly flawed; it shouldn't have been coded that way, and it should be fixed.
- LUCKY means the DATEDIFF usage may be flawed but it doesn't matter given the data values involved.
- OK means the DATEDIFF usage is OK given the data types involved.
The Scores
Example 1 is OK: begin_poll event
Let's ignore the obvious problems with this code (the "Syntax error near DISTINCT", the unknown nature of the tables in the FROM clause, and so on) and assume that "datediff( hour, last_status_change, now()" refers to a valid TIMESTAMP value in last_status_change.
This example creates a push request for a Notifier named Notifier A. It uses a SQL statement that inserts rows into a table named PushRequest. Each row in this table represents a message to send to an address. The WHERE clause determines which push requests are inserted into the PushRequest table.
To use the ml_add_property system procedure with a SQL Anywhere consolidated database, run the following command:
ml_add_property( 'SIS', 'Notifier(Notifier A)', 'begin_poll', 'INSERT INTO PushRequest (gateway, mluser, subject, content) SELECT ''MyGateway'', DISTINCT mluser, ''sync'', stream_param FROM MLUserExtra, mluser_union, Dealer WHERE MLUserExtra.mluser = mluser_union.name AND (push_sync_status = ''waiting for request'' OR datediff( hour, last_status_change, now() ) > 12 ) AND ( mluser_union.publication_name is NULL OR mluser_union.publication_name =''FullSync'' ) AND Dealer.last_modified > mluser_union.last_sync_time' );
Guess what? Assuming that "> 12" means "thirteen full hours or more" then DATEDIFF HOUR works OK because unlike other forms of DATEDIFF it returns the number of HOUR units between the two timestamps, not the number of HOUR boundaries.
Here's proof using two timestamps that have 13 hour boundaries between them but differ by an interval of only 12:00:00.000001 (which DATEDIFF HOUR truncates to 12):
BEGIN DECLARE @last_status_change TIMESTAMP DEFAULT ( '2013 09 02 00:59:59.999999' ); DECLARE @now TIMESTAMP DEFAULT ( '2013 09 02 13:00:00.000000' ); SELECT DATEDIFF ( HOUR, @last_status_change, @now ); END; DATEDIFF(hour,@last_status_change,@now) 12
Example 2 is a FAIL: Trigger conditions for events
Assuming Notification.last_notification is a TIMESTAMP column, what does "DATEDIFF( minute, last_notification, current timestamp )" return?
Notify an administrator of a possible attempt to break into the database:
CREATE EVENT SecurityCheck TYPE ConnectFailed HANDLER BEGIN DECLARE num_failures INT; DECLARE mins INT; INSERT INTO FailedConnections( log_time ) VALUES ( CURRENT TIMESTAMP ); SELECT COUNT( * ) INTO num_failures FROM FailedConnections WHERE log_time >= DATEADD( minute, -5, current timestamp ); IF( num_failures >= 3 ) THEN SELECT DATEDIFF( minute, last_notification, current timestamp ) INTO mins FROM Notification; IF( mins > 30 ) THEN UPDATE Notification SET last_notification = current timestamp; CALL xp_sendmail( recipient='DBAdmin', subject='Security Check', "message"= 'over 3 failed connections in last 5 minutes' ) END IF END IF END;
Answer: It returns the number of MINUTE boundaries between the two timestamp values. This usage must be marked as a FAIL because DATEDIFF MINUTE can return two different answers for exactly the same interval depending on the number of MINUTE boundaries between the two timestamps.
BEGIN DECLARE @last_notification TIMESTAMP DEFAULT ( '2013 09 02 00:59:59.999999' ); DECLARE @current_timestamp TIMESTAMP DEFAULT ( '2013 09 02 01:30:00.000000' ); SELECT DATEDIFF ( MINUTE, @last_notification, @current_timestamp ); END; BEGIN DECLARE @last_notification TIMESTAMP DEFAULT ( '2013 09 02 01:00:00.000000' ); DECLARE @current_timestamp TIMESTAMP DEFAULT ( '2013 09 02 01:30:00.000001' ); SELECT DATEDIFF ( MINUTE, @last_notification, @current_timestamp ); END; DATEDIFF(minute,@last_notification,@current_timestamp) 31 DATEDIFF(minute,@last_notification,@current_timestamp) 30DATEDIFF MINUTE can also return a smaller answer for a longer interval. Here's proof; the first DATEDIFF MINUTE call returns 31 for an interval of 00:30:00.000001 while the second returns 30 for an interval of 00:30:00.999999:
BEGIN DECLARE @last_notification TIMESTAMP DEFAULT ( '2013 09 02 00:59:59.999999' ); DECLARE @current_timestamp TIMESTAMP DEFAULT ( '2013 09 02 01:30:00.000000' ); SELECT DATEDIFF ( MINUTE, @last_notification, @current_timestamp ); END; BEGIN DECLARE @last_notification TIMESTAMP DEFAULT ( '2013 09 02 01:00:00.000000' ); DECLARE @current_timestamp TIMESTAMP DEFAULT ( '2013 09 02 01:30:00.999999' ); SELECT DATEDIFF ( MINUTE, @last_notification, @current_timestamp ); END; DATEDIFF(minute,@last_notification,@current_timestamp) 31 DATEDIFF(minute,@last_notification,@current_timestamp) 30
Example 3 is OK: #hook_dict table
This usage scores OK for two reasons:
The following sample sp_hook_dbmlsync_delay procedure illustrates the use of in/out parameters in the #hook_dict table. The procedure allows synchronization only outside a scheduled down time of the MobiLink system between 18:00 and 19:00.
CREATE PROCEDURE sp_hook_dbmlsync_delay() BEGIN DECLARE delay_val integer; SET delay_val=DATEDIFF( second, CURRENT TIME, '19:00'); IF (delay_val>0 AND delay_val<3600) THEN UPDATE #hook_dict SET value=delay_val WHERE name='delay duration'; END IF; END
- The code's more precise than the specs so an error doesn't matter; i.e., the specifications refer to hours "between 18:00 and 19:00" while the code calculates seconds "DATEDIFF( second, CURRENT TIME, '19:00')".
- ...and besides, DATEDIFF SECOND returns the number of full seconds between two timestamps rather than the number of second boundaries; i.e., it's the other DATEDIFF rarity along with with DATEDIFF HOUR.
Example 4 is OK: sp_hook_dbmlsync_abort
This one looks like a FAIL until you realize that down_time_start contains a constant value '19:00:00.000000'. That means it doesn't matter whether DATEDIFF HOUR is calculated using hours (it's not) or hour boundaries (it is), it's all the same.
The following procedure prevents synchronization during a scheduled maintenance hour between 19:00 and 20:00 each day.
CREATE PROCEDURE sp_hook_dbmlsync_abort() BEGIN DECLARE down_time_start TIME; DECLARE is_down_time VARCHAR(128); SET down_time_start='19:00'; IF datediff( hour,down_time_start,now(*) ) < 1 THEN set is_down_time='true'; ELSE SET is_down_time='false'; END IF; UPDATE #hook_dict SET value = is_down_time WHERE name = 'abort synchronization' END;
In other words, this usage is OK.
Example 5 is OK: sa_performance_diagnostics system procedure
This code's OK for the same reasons Example 3 is OK: For the connections of interest the difference between LoginTime and CURRENT TIMESTAMP grows very large very quickly compared with the size of any error introduced by DATEDIFF, and besides, DATEDIFF SECOND is calculated in seconds rather than second boundaries.
You can execute the following query to identify connections that have spent a long time waiting for database server requests to complete.
SELECT Number, Name, CAST( DATEDIFF( second, LoginTime, CURRENT TIMESTAMP ) AS DOUBLE ) AS T, IF T <> 0 THEN (ReqTimeActive / T) ELSE NULL ENDIF AS PercentActive FROM sa_performance_diagnostics() WHERE T > 0 AND PercentActive > 10.0 ORDER BY PercentActive DESC;
Example 6 is OK: sa_performance_diagnostics system procedure
Same thing again, same as Examples 3 and 5: This code is OK. A purist might use DATEDIFF MILLISECOND (Foxhound does) but with a predicate like "ReqTime > 60.0" it's not really necessary.
Find all requests that are currently executing, and have been executing for more than 60 seconds:
SELECT Number, Name, CAST( DATEDIFF( second, LastReqTime, CURRENT TIMESTAMP ) AS DOUBLE ) AS ReqTime FROM sa_performance_diagnostics() WHERE ReqStatus <> 'IDLE' AND ReqTime > 60.0 ORDER BY ReqTime DESC;
Example 7 is a FAIL: Text index refresh types
Let's ignore all the issues with this code and assume the DATEDIFF call was really written like this:
You can define your own strategy for refreshing MANUAL REFRESH text indexes. In the following example, all MANUAL REFRESH text indexes are refreshed using a refresh interval that is passed as an argument, and rules that are similar to those used for AUTO REFRESH text indexes.
CREATE PROCEDURE refresh_manual_text_indexes( refresh_interval UNSIGNED INT ) BEGIN FOR lp1 AS c1 CURSOR FOR SELECT ts.* FROM SYS.SYSTEXTIDX ti JOIN sa_text_index_stats( ) ts ON ( ts.index_id = ti.index_id ) WHERE ti.refresh_type = 1 -- manual refresh indexes only DO BEGIN IF last_refresh_utc IS null OR cast(pending_length as float) / ( IF doc_length=0 THEN NULL ELSE doc_length ENDIF) > 0.2 OR DATEDIFF( MINUTE, CURRENT UTC TIMESTAMP, last_refresh_utc ) > refresh_interval THEN EXECUTE IMMEDIATE 'REFRESH TEXT INDEX ' || text-index-name || ' ON "' || table-owner || '"."' || table-name || '"'; END IF; END; END FOR; END;
DATEDIFF( MINUTE, last_refresh, CURRENT TIMESTAMP ) > refresh_intervalThis code is a FAIL for the same reasons as the code in Example 2:
- DATEDIFF MINUTE can return two different answers for exactly the same interval depending on the number of MINUTE boundaries between the two timestamps, and
- DATEDIFF MINUTE can also return a smaller answer for a longer interval.
No comments:
Post a Comment