Monday, July 29, 2013

Should Examples Work?

More specifically, is it important that code examples actually work or is it sufficient that they give a rough idea of what the code should look like? Like a stick figure gives a rough idea of what a person looks like, as opposed to a photograph?

The interweb is so ... chock ... full ... of examples that don't work it seems the answer must be "no, examples don't have to work." Testing is expensive, Microsoft Word doesn't complain when the examples don't work, and nobody else seems to care, so why bother?

Why take the time?

Why spend the money?


Let's take a look close to home, the Text index refresh types Help topic:
MANUAL REFRESH MANUAL REFRESH text indexes are refreshed only when you refresh them, and are recommended if data in the underlying table is rarely changed, or if a greater degree of data staleness is acceptable, or to refresh after an event or a condition is met. A query on a stale index returns matching rows that have not been changed since the last refresh. So, rows that have been inserted, deleted, or updated since the last refresh are not returned by a query.

You can define your own strategy for refreshing MANUAL REFRESH text indexes. In the following example, all MANUAL REFRESH text indexes are refreshed using a refresh interval that is passed as an argument, and rules that are similar to those used for AUTO REFRESH text indexes.
CREATE PROCEDURE refresh_manual_text_indexes( 
   refresh_interval UNSIGNED INT )
BEGIN
 FOR lp1 AS c1 CURSOR FOR
   SELECT ts.*
   FROM SYS.SYSTEXTIDX ti JOIN sa_text_index_stats( ) ts
   ON ( ts.index_id = ti.index_id )
   WHERE ti.refresh_type = 1 -- manual refresh indexes only
 DO
   BEGIN
    IF last_refresh_utc IS null 
    OR cast(pending_length as float) / (
       IF doc_length=0 THEN NULL ELSE doc_length ENDIF) > 0.2
    OR DATEDIFF( MINUTE, CURRENT UTC TIMESTAMP, last_refresh_utc )
       > refresh_interval THEN
     EXECUTE IMMEDIATE 'REFRESH TEXT INDEX ' || text-index-name || ' ON "'
     || table-owner || '"."' || table-name || '"';
    END IF;
   END;
  END FOR;
END;
At any time, you can use the sa_text_index_stats system procedure to decide if a refresh is needed, and whether the refresh should be a complete rebuild or an incremental update.

A MANUAL REFRESH text index contains no data at creation time, and is not available for use until you refresh it. To refresh a MANUAL REFRESH text index, use the REFRESH TEXT INDEX statement.

MANUAL REFRESH text indexes are not refreshed during a reload unless the -g option is specified for dbunload.
Never mind the SELECT ts.*, which might be "wrong" because it increases the burden on the reader without adding any value... that's not the issue here.

The issue is, does the example serve any purpose? Does it help the reader code a stored procedure for refreshing text indexes, more than just reading the surrounding text helps?

You decide

Maybe your answer is "Yes, it's fine, I'm not going to copy and paste, it's pointing me in the right direction, it tells me what I need to know."

Maybe it's not important that
  • the procedure doesn't compile because of "text-index-name",

  • there's no column in ts.* named last_refresh_utc (it's just last_refresh),

  • the DATEDIFF will never be > refresh_interval because

    • CURRENT UTC TIMESTAMP will always be greater than sa_text_index_stats.last_refresh

    • so the DATEDIFF will always return a negative number, and

    • refresh_interval will never be negative because it's an UNSIGNED INT and

  • DATEDIFF counts minute boundaries rather than complete minutes, which may or may not be acceptable in actual use.
Except for the bit about "doesn't compile" those points were discovered by inspection rather than testing... are there other issues that testing might reveal?

Maybe it matters, maybe not... you tell me.


5 comments:

John Smirnios said...

in a v16 database, there is a last_refresh_utc column.

Breck Carter said...

nuh uh... but I will resist the temptation to make further comment about "testing" :)...

@@VERSION
'16.0.0.1512'

Could not execute statement.
Column 'last_refresh_utc' not found
SQLCODE=-143, ODBC 3 State="42S22"
Line 2, column 1

SELECT last_refresh_utc FROM sa_text_index_stats()

John Smirnios said...

The column is not in sa_text_index_stats(), it is in systextidx:

select * from sa_describe_query( 'select * from systextidx' );
output to d:\tmp\systextidx.txt with column names

'column_number','name','domain_id','domain_name','domain_name_with_size','width','scale','declared_width','user_type_id','user_type_name','correlation_name','base_table_id','base_column_id','base_owner_name','base_table_name','base_column_name','nulls_allowed','is_autoincrement','is_key_column','is_added_key_column'
1,'index_id',23,'unsigned bigint','unsigned bigint',8,0,8,,,'txt',69,1,'SYS','ISYSTEXTIDX','index_id',0,0,,
2,'sequence',21,'unsigned int','unsigned int',4,0,4,,,'txt',69,2,'SYS','ISYSTEXTIDX','sequence',0,0,,
3,'status',21,'unsigned int','unsigned int',4,0,4,,,'txt',69,3,'SYS','ISYSTEXTIDX','status',0,0,,
4,'text_config',23,'unsigned bigint','unsigned bigint',8,0,8,,,'txt',69,4,'SYS','ISYSTEXTIDX','text_config',1,0,,
5,'next_handle',21,'unsigned int','unsigned int',4,0,4,,,'txt',69,5,'SYS','ISYSTEXTIDX','next_handle',0,0,,
6,'last_handle',21,'unsigned int','unsigned int',4,0,4,,,'txt',69,6,'SYS','ISYSTEXTIDX','last_handle',0,0,,
7,'deleted_length',23,'unsigned bigint','unsigned bigint',8,0,8,,,'txt',69,7,'SYS','ISYSTEXTIDX','deleted_length',0,0,,
8,'pending_length',23,'unsigned bigint','unsigned bigint',8,0,8,,,'txt',69,8,'SYS','ISYSTEXTIDX','pending_length',0,0,,
9,'refresh_type',19,'tinyint','tinyint',1,0,1,,,'txt',69,9,'SYS','ISYSTEXTIDX','refresh_type',0,0,,
10,'refresh_interval',21,'unsigned int','unsigned int',4,0,4,,,'txt',69,10,'SYS','ISYSTEXTIDX','refresh_interval',1,0,,
11,'last_refresh',13,'timestamp','timestamp',23,0,8,,,,,,,,,1,,,
12,'last_refresh_utc',26,'timestamp with time zone','timestamp with time zone',29,0,0,,,,,,,,,1,,,

John Smirnios said...

So, the original SELECT ts.* is probably wrong in that it manages to exclude the last_refresh_utc column from the result set. "SELECT *" may have been better but then I would need to test it :)

Perhaps the sa_text_index_stats procedure should also be changed to return last_refresh_utc.

Breck Carter said...

Thank you for playing "Gotcha!" [snork]

signed, Breck Grinning Running And Ducking :)