Friday, July 29, 2011

Using LockCount To Display UPDATE Progress

Some programming tasks are notoriously difficult to get right but are regarded as "easy" by folks who haven't actually tackled them.

One is creating a good installation setup for a complex piece of software... proof of how difficult that is lies in the vast numbers of crappy setups that have been published over the years.

Another is displaying a decent progress indicator for a long-running process that runs as a "black box" with a beginning and an end but no outside access to intermediate steps... like a set-oriented UPDATE statement that affects millions of rows.

Here is an example of where such a progress indicator is needed: when Foxhound 1.2 upgrades the data from an earlier version it has to UPDATE all the connection-level sample data to fill in some new derived columns.

In this run the UPDATE took 19 minutes to process 6 million rows and all it displayed was an occasional "Cache size adjusted" message... not good:


I. 07/26 15:27:10. 2011-07-26 15:27:10.473 DATA UPGRADE: Upgrading existing rows.
I. 07/26 15:27:41. Cache size adjusted to 3228864K
I. 07/26 15:27:47. Cache size adjusted to 3268992K
I. 07/26 15:30:03. Cache size adjusted to 3621072K
I. 07/26 15:36:48. Cache size adjusted to 3993080K
I. 07/26 15:38:48. Cache size adjusted to 4118600K
I. 07/26 15:40:04. Cache size adjusted to 4149984K
I. 07/26 15:40:09. Cache size adjusted to 4157832K
I. 07/26 15:40:14. Cache size adjusted to 4159792K
I. 07/26 15:40:19. Cache size adjusted to 4160280K
I. 07/26 15:40:24. Cache size adjusted to 4160400K
I. 07/26 15:41:53. Starting checkpoint of "new_f" (foxhound1.db) at Tue Jul 26 2011 15:41
I. 07/26 15:43:02. Finished checkpoint of "new_f" (foxhound1.db) at Tue Jul 26 2011 15:43
I. 07/26 15:46:52. 2011-07-26 15:46:52.728 DATA UPGRADE: Creating foreign keys.

Dilbert.com



Here's a look at the new improved console log display:



Here's an overview of the technique used:
  • Select SYSTAB.count to determine how many rows are being updated (the entire table),

  • watch the connection-level 'LockCount' property to determine how many rows have been updated, and

  • use an EVENT to display progress messages on the database console.
Sounds easy, doesn't it? I thought so too... stop reading now if you want to hang on to that impression.

Still there?


Here's the EVENT:

CREATE EVENT rroad_display_progress_messages
HANDLER BEGIN

DECLARE @calling_connection_id INTEGER;
DECLARE @commit_percent BIGINT;
DECLARE @continuing VARCHAR ( 1 );
DECLARE @previous_commit_message_displayed_at_loop_counter BIGINT;
DECLARE @previous_update_message_displayed_at_loop_counter BIGINT;
DECLARE @lock_count BIGINT;
DECLARE @lock_count_has_increased_at_least_once VARCHAR ( 1 );
DECLARE @loop_counter BIGINT;
DECLARE @previous_commit_percent BIGINT;
DECLARE @previous_lock_count BIGINT;
DECLARE @previous_update_percent BIGINT;
DECLARE @row_count BIGINT;
DECLARE @update_percent BIGINT;

SET @continuing = 'Y'; -- until proven otherwise

SET @calling_connection_id = CAST ( COALESCE ( EVENT_PARAMETER ( 'ConnectionID' ), '0' ) AS INTEGER );

IF @calling_connection_id <= 0 THEN
SET @continuing = 'N'; -- the calling connection id is out of range
END IF;

IF @continuing = 'Y' THEN
SET @row_count = CAST ( COALESCE ( EVENT_PARAMETER ( '@row_count_string' ), '0' ) AS BIGINT );
IF @row_count <= 0 THEN
SET @continuing = 'N'; -- the row count is out of range
END IF;
END IF;

SET @lock_count_has_increased_at_least_once = 'N';
SET @previous_update_message_displayed_at_loop_counter = 0;
SET @previous_commit_message_displayed_at_loop_counter = 0;
SET @previous_lock_count = 0;
SET @previous_update_percent = 0;
SET @previous_commit_percent = 0;
SET @loop_counter = 0;

WHILE @continuing = 'Y' LOOP

SET @loop_counter = @loop_counter + 1;

SET @lock_count = CONNECTION_PROPERTY ( 'LockCount', EVENT_PARAMETER ( 'ConnectionID' ) );

IF @lock_count IS NULL THEN
SET @continuing = 'N'; -- the calling connection no longer exist
END IF;

IF @continuing = 'Y' THEN
IF NOT EXISTS ( SELECT *
FROM rroad_progress_messages_requested
WHERE requested_by_connection_id = @calling_connection_id ) THEN
SET @continuing = 'N'; -- the caller no longer wants progress messages displayed
END IF;
END IF;

IF @continuing = 'Y' THEN

CASE

WHEN @lock_count > 0
AND @lock_count > @previous_lock_count THEN

SET @lock_count_has_increased_at_least_once = 'Y';

SET @update_percent = CAST (
( CAST ( @lock_count AS DECIMAL ( 11, 2 ) )
/ CAST ( @row_count AS DECIMAL ( 11, 2 ) ) ) * 100.00
AS BIGINT );

IF @update_percent > @previous_update_percent
OR ( @update_percent > 0
AND @update_percent = @previous_update_percent
AND @loop_counter >= @previous_update_message_displayed_at_loop_counter + 9
AND MOD ( @loop_counter, 10 ) = 0 ) THEN

MESSAGE STRING ( @update_percent, '% updated (', @lock_count, ' rows of ', @row_count, ')' ) TO CONSOLE;
SET @previous_update_message_displayed_at_loop_counter = @loop_counter;

END IF;

SET @previous_update_percent = @update_percent;

WHEN @lock_count > 0
AND @lock_count < @previous_lock_count
AND @lock_count_has_increased_at_least_once = 'Y' THEN

SET @commit_percent = CAST (
( ( CAST ( @row_count AS DECIMAL ( 11, 2 ) ) - CAST ( @lock_count AS DECIMAL ( 11, 2 ) ) )
/ CAST ( @row_count AS DECIMAL ( 11, 2 ) ) ) * 100.00
AS BIGINT );

IF @commit_percent > @previous_commit_percent
OR ( @commit_percent > 0
AND @commit_percent = @previous_commit_percent
AND @loop_counter >= @previous_commit_message_displayed_at_loop_counter + 9
AND MOD ( @loop_counter, 10 ) = 0 ) THEN

MESSAGE STRING ( @commit_percent, '% committed (', @row_count - @lock_count, ' rows of ', @row_count, ')' ) TO CONSOLE;
SET @previous_commit_message_displayed_at_loop_counter = @loop_counter;

END IF;

SET @previous_commit_percent = @commit_percent;

WHEN @lock_count = 0
AND @lock_count_has_increased_at_least_once = 'Y' THEN

MESSAGE STRING ( 'Done: ', @row_count, ' rows' ) TO CONSOLE;
SET @continuing = 'N';

ELSE
-- No action.

END CASE;

END IF;

IF @continuing = 'Y' THEN
SET @previous_lock_count = @lock_count;
WAITFOR DELAY '00:00:01';
END IF;

END LOOP;

END;
  • The SET @calling_connection_id on line 20 gets the caller's connection number so the event can ask for the lock count later on.

  • The SET @row_count on line 27 call the magic EVENT_PARAMETER function to retrieve the parameter that was passed from the caller via TRIGGER EVENT.

  • The loop starting on line 41 runs forever, or until some IF THEN SET @continuing = 'N' decides it is time to stop.

  • The SET @lock_count on line 45 gets the LockCount property for the calling connection (oops, it should say "@calling_connection_id" instead of "EVENT_PARAMETER ( 'ConnectionID' )"... that's been fixed but this is the code that produced the test results below so here it is, warts and all).

  • The IF THEN SET @continuing = 'N' on lines 52 through 56 checks to see if the caller no longer needs the services of the event, thank you very much... the caller communicates that fact by deleting its row in rroad_progress_messages_requested.

  • The first CASE WHEN on lines 63 through 84 handles MESSAGE output while the caller's UPDATE is gathering locks.

  • The IF statement on lines 73 through 82 produces a MESSAGE if
    • the update percent has increased since the previous loop pass, or

    • the update percent has remained stuck at the same non-zero value for several passes through the loop.

  • The second CASE WHEN on lines 86 through 106 handles MESSAGE output while the caller's COMMIT is releasing locks.

  • The AND @lock_count_has_increased_at_least_once = 'Y' on line 88 makes sure this code doesn't run before the first UPDATE starts... that's not likely, but you never know. OK, sure, this might be case of Cargo Cult Programming, I'm not too proud to consider the possibility.

  • The third CASE WHEN on lines 108 through 112 probably is an example of Cargo Cult Programming because this event ain't never gonna see the lock count hit zero before the caller kills the whole deal by deleting its row in rroad_progress_messages_requested (see the IF THEN SET @continuing = 'N' on lines 52 through 56).

  • The WAITFOR DELAY on line 123 slows the whole thing down so you don't get millions of progress messages.
Why an EVENT? So the WHILE loop runs as a separate asynchronous process, independent from and at the same time as the caller's UPDATE and COMMIT... I think they call it multiprogramming or multitasking or something like that.

Here's an excerpt from the code that triggers the event:

DECLARE @rroad_group_2_property_pivot_row_count BIGINT;
. . .

------------------------------------------------------------
-- Start displaying progress messages.

INSERT rroad_progress_messages_requested ( requested_by_connection_id ) VALUES ( @@SPID );

COMMIT;

SELECT SYSTAB.count
INTO @rroad_group_2_property_pivot_row_count
FROM SYSTAB
WHERE SYSTAB.table_name = 'rroad_group_2_property_pivot';

TRIGGER EVENT rroad_display_progress_messages (
@row_count_string = STRING ( @rroad_group_2_property_pivot_row_count ) );

------------------------------------------------------------
-- Update rroad_group_2_property_pivot with remaining calculated columns

UPDATE rroad_group_2_property_pivot
INNER JOIN rroad_sample_set
ON rroad_sample_set.sample_set_number = rroad_group_2_property_pivot.sample_set_number
LEFT OUTER JOIN ( SELECT rroad_group_2_property_pivot.sample_set_number AS sample_set_number,
. . .

COMMIT;

------------------------------------------------------------
-- Stop displaying progress messages.

DELETE rroad_progress_messages_requested
WHERE requested_by_connection_id = @@spid;

COMMIT;
  • The INSERT on line 7 allows the EVENT to produce messages when it runs.

  • The SELECT on lines 11 through 14 works perfectly if there has been a CHECKPOINT recently; if not, it's probably good enough, and in this case it's just fine.

  • The TRIGGER EVENT on lines 16 and 17 is where the magic lies: it defines a pseudo-parameter called @row_count_string and gives it a value that is available via the (also magic) EVENT_PARAMETER function shown earlier.

  • The UPDATE and COMMIT on lines 22 through 28 are the "black box" process that runs for 19 minutes.

  • The DELETE starting on line 33 tells the EVENT to stop what it's doing, no more messages are required.
Here's the table that is used to communicate with the event:

CREATE TABLE rroad_progress_messages_requested (
requested_by_connection_id INTEGER NOT NULL PRIMARY KEY );

Here's a screen capture showing how processing slowed down near the end for some unknown reason:
  • the updated percentage got stuck (98%, 98%, 98%, ...), and

  • the COMMIT ran long enough for several committed percentages to appear (10%, 31%, 54%, ...).



Here what the database console log contained for another test; this one didn't slow down as it neared completion, and the COMMIT was faster (only a single "70% committed" message):

I. 07/28 11:14:36. 2011-07-28 11:14:36.180 DATA UPGRADE: Updating Foxhound Monitor connection-level sample rows.
I. 07/28 11:14:44. Cache size adjusted to 3648208K
I. 07/28 11:15:00. Cache size adjusted to 3773872K
I. 07/28 11:15:05. Cache size adjusted to 3814120K
I. 07/28 11:15:09. Cache size adjusted to 3828376K
I. 07/28 11:15:11. Cache size adjusted to 3842344K
I. 07/28 11:15:21. Cache size adjusted to 3849032K
I. 07/28 11:15:26. Cache size adjusted to 3851152K
I. 07/28 11:15:27. 1% updated (67219 rows of 6133762)
I. 07/28 11:15:31. Cache size adjusted to 3851888K
I. 07/28 11:15:36. 2% updated (127610 rows of 6133762)
I. 07/28 11:15:36. Cache size adjusted to 3852120K
I. 07/28 11:15:41. Cache size adjusted to 3851928K
I. 07/28 11:15:45. 3% updated (185549 rows of 6133762)
I. 07/28 11:15:47. Cache size adjusted to 3863232K
I. 07/28 11:15:52. Cache size adjusted to 3866080K
I. 07/28 11:15:55. 4% updated (250198 rows of 6133762)
I. 07/28 11:16:03. 5% updated (309751 rows of 6133762)
I. 07/28 11:16:12. 6% updated (368423 rows of 6133762)
I. 07/28 11:16:24. 7% updated (434508 rows of 6133762)
I. 07/28 11:16:33. 8% updated (492776 rows of 6133762)
I. 07/28 11:16:43. 9% updated (555023 rows of 6133762)
I. 07/28 11:16:52. Cache size adjusted to 3909880K
I. 07/28 11:16:54. 10% updated (618356 rows of 6133762)
I. 07/28 11:17:03. 11% updated (679788 rows of 6133762)
I. 07/28 11:17:12. 12% updated (740038 rows of 6133762)
I. 07/28 11:17:15. Cache size adjusted to 3924104K
I. 07/28 11:17:20. Cache size adjusted to 3928528K
I. 07/28 11:17:22. 13% updated (799102 rows of 6133762)
I. 07/28 11:17:25. Cache size adjusted to 3931472K
I. 07/28 11:17:30. Cache size adjusted to 3933392K
I. 07/28 11:17:32. 14% updated (862614 rows of 6133762)
I. 07/28 11:17:35. Cache size adjusted to 3935112K
I. 07/28 11:17:40. Cache size adjusted to 3941088K
I. 07/28 11:17:42. 15% updated (924157 rows of 6133762)
I. 07/28 11:17:45. Cache size adjusted to 3943656K
I. 07/28 11:17:50. Cache size adjusted to 3948168K
I. 07/28 11:17:51. 16% updated (985534 rows of 6133762)
I. 07/28 11:18:01. 17% updated (1047734 rows of 6133762)
I. 07/28 11:18:09. 18% updated (1105532 rows of 6133762)
I. 07/28 11:18:18. 19% updated (1170862 rows of 6133762)
I. 07/28 11:18:25. 20% updated (1227940 rows of 6133762)
I. 07/28 11:18:34. 21% updated (1296232 rows of 6133762)
I. 07/28 11:18:42. 22% updated (1351509 rows of 6133762)
I. 07/28 11:18:50. 23% updated (1419113 rows of 6133762)
I. 07/28 11:18:50. Cache size adjusted to 3953328K
I. 07/28 11:18:58. 24% updated (1476474 rows of 6133762)
I. 07/28 11:19:07. 25% updated (1540502 rows of 6133762)
I. 07/28 11:19:19. 25% updated (1578351 rows of 6133762)
I. 07/28 11:19:29. 26% updated (1595529 rows of 6133762)
I. 07/28 11:19:41. 27% updated (1662896 rows of 6133762)
I. 07/28 11:19:50. Cache size adjusted to 3934168K
I. 07/28 11:19:51. 28% updated (1722376 rows of 6133762)
I. 07/28 11:20:01. 29% updated (1783204 rows of 6133762)
I. 07/28 11:20:13. 30% updated (1844001 rows of 6133762)
I. 07/28 11:20:23. 31% updated (1905747 rows of 6133762)
I. 07/28 11:20:33. 32% updated (1968430 rows of 6133762)
I. 07/28 11:20:46. 33% updated (2025920 rows of 6133762)
I. 07/28 11:20:51. Cache size adjusted to 3921488K
I. 07/28 11:20:57. 34% updated (2087546 rows of 6133762)
I. 07/28 11:21:08. 35% updated (2147983 rows of 6133762)
I. 07/28 11:21:18. 36% updated (2210031 rows of 6133762)
I. 07/28 11:21:28. 37% updated (2272765 rows of 6133762)
I. 07/28 11:21:38. 38% updated (2334295 rows of 6133762)
I. 07/28 11:21:47. 39% updated (2392202 rows of 6133762)
I. 07/28 11:21:51. Cache size adjusted to 3917520K
I. 07/28 11:22:01. 40% updated (2458832 rows of 6133762)
I. 07/28 11:22:12. 41% updated (2519676 rows of 6133762)
I. 07/28 11:22:17. Cache size adjusted to 3918608K
I. 07/28 11:22:22. Cache size adjusted to 3919096K
I. 07/28 11:22:25. 41% updated (2574924 rows of 6133762)
I. 07/28 11:22:26. 42% updated (2580107 rows of 6133762)
I. 07/28 11:22:27. Cache size adjusted to 3914408K
I. 07/28 11:22:32. Cache size adjusted to 3912736K
I. 07/28 11:22:35. 42% updated (2629137 rows of 6133762)
I. 07/28 11:22:37. Cache size adjusted to 3916120K
I. 07/28 11:22:38. 43% updated (2638896 rows of 6133762)
I. 07/28 11:22:42. Cache size adjusted to 3923840K
I. 07/28 11:22:47. Cache size adjusted to 3928120K
I. 07/28 11:22:51. 44% updated (2704091 rows of 6133762)
I. 07/28 11:22:52. Cache size adjusted to 3930848K
I. 07/28 11:23:01. 45% updated (2760737 rows of 6133762)
I. 07/28 11:23:10. 46% updated (2822995 rows of 6133762)
I. 07/28 11:23:21. 47% updated (2883865 rows of 6133762)
I. 07/28 11:23:32. 48% updated (2949216 rows of 6133762)
I. 07/28 11:23:43. 49% updated (3006036 rows of 6133762)
I. 07/28 11:23:55. 50% updated (3073329 rows of 6133762)
I. 07/28 11:24:05. 51% updated (3130900 rows of 6133762)
I. 07/28 11:24:16. 52% updated (3190512 rows of 6133762)
I. 07/28 11:24:27. 53% updated (3255993 rows of 6133762)
I. 07/28 11:24:37. 54% updated (3317650 rows of 6133762)
I. 07/28 11:24:45. 55% updated (3373726 rows of 6133762)
I. 07/28 11:24:56. 56% updated (3441046 rows of 6133762)
I. 07/28 11:25:06. 57% updated (3499829 rows of 6133762)
I. 07/28 11:25:17. 58% updated (3561681 rows of 6133762)
I. 07/28 11:25:22. Cache size adjusted to 4039144K
I. 07/28 11:25:27. 59% updated (3619535 rows of 6133762)
I. 07/28 11:25:39. 60% updated (3683934 rows of 6133762)
I. 07/28 11:25:52. 61% updated (3747029 rows of 6133762)
I. 07/28 11:26:03. 62% updated (3809460 rows of 6133762)
I. 07/28 11:26:11. 63% updated (3864864 rows of 6133762)
I. 07/28 11:26:19. 64% updated (3929201 rows of 6133762)
I. 07/28 11:26:22. Cache size adjusted to 4097008K
I. 07/28 11:26:28. 65% updated (3987744 rows of 6133762)
I. 07/28 11:26:34. Starting checkpoint of "new_f" (foxhound1.db) at Thu Jul 28 2011 11:26
I. 07/28 11:27:22. Cache size adjusted to 4144584K
I. 07/28 11:28:26. Finished checkpoint of "new_f" (foxhound1.db) at Thu Jul 28 2011 11:28
I. 07/28 11:28:32. 66% updated (4049736 rows of 6133762)
I. 07/28 11:28:41. 67% updated (4111378 rows of 6133762)
I. 07/28 11:28:50. 68% updated (4171434 rows of 6133762)
I. 07/28 11:28:58. 69% updated (4233962 rows of 6133762)
I. 07/28 11:29:07. 70% updated (4297422 rows of 6133762)
I. 07/28 11:29:16. 71% updated (4359568 rows of 6133762)
I. 07/28 11:29:22. Cache size adjusted to 4156480K
I. 07/28 11:29:24. 72% updated (4416568 rows of 6133762)
I. 07/28 11:29:34. 73% updated (4480383 rows of 6133762)
I. 07/28 11:29:43. 74% updated (4543735 rows of 6133762)
I. 07/28 11:29:52. 75% updated (4603354 rows of 6133762)
I. 07/28 11:30:01. 76% updated (4669403 rows of 6133762)
I. 07/28 11:30:08. 77% updated (4726151 rows of 6133762)
I. 07/28 11:30:14. Starting checkpoint of "new_f" (foxhound1.db) at Thu Jul 28 2011 11:30
I. 07/28 11:30:22. Cache size adjusted to 4159456K
I. 07/28 11:30:23. Finished checkpoint of "new_f" (foxhound1.db) at Thu Jul 28 2011 11:30
I. 07/28 11:30:26. 78% updated (4791562 rows of 6133762)
I. 07/28 11:30:34. 79% updated (4848418 rows of 6133762)
I. 07/28 11:30:39. Cache size adjusted to 4160200K
I. 07/28 11:30:44. 80% updated (4910819 rows of 6133762)
I. 07/28 11:30:44. Cache size adjusted to 4160384K
I. 07/28 11:30:52. 81% updated (4968486 rows of 6133762)
I. 07/28 11:31:01. 82% updated (5034257 rows of 6133762)
I. 07/28 11:31:10. 83% updated (5098073 rows of 6133762)
I. 07/28 11:31:18. 84% updated (5156062 rows of 6133762)
I. 07/28 11:31:26. 85% updated (5218284 rows of 6133762)
I. 07/28 11:31:34. 86% updated (5275652 rows of 6133762)
I. 07/28 11:31:44. 87% updated (5337384 rows of 6133762)
I. 07/28 11:31:52. 88% updated (5403349 rows of 6133762)
I. 07/28 11:32:00. 89% updated (5460098 rows of 6133762)
I. 07/28 11:32:09. 90% updated (5526121 rows of 6133762)
I. 07/28 11:32:17. 91% updated (5585565 rows of 6133762)
I. 07/28 11:32:24. 92% updated (5643851 rows of 6133762)
I. 07/28 11:32:33. 93% updated (5708012 rows of 6133762)
I. 07/28 11:32:42. 94% updated (5766534 rows of 6133762)
I. 07/28 11:32:50. 95% updated (5829575 rows of 6133762)
I. 07/28 11:32:59. 96% updated (5891464 rows of 6133762)
I. 07/28 11:33:08. 97% updated (5952045 rows of 6133762)
I. 07/28 11:33:18. 98% updated (6018117 rows of 6133762)
I. 07/28 11:33:26. 99% updated (6073883 rows of 6133762)
I. 07/28 11:33:37. 100% updated (6133762 rows of 6133762)
I. 07/28 11:33:47. 70% committed (4313700 rows of 6133762)
I. 07/28 11:33:48. 2011-07-28 11:33:48.717 DATA UPGRADE: Creating foreign keys.

Did the EVENT overhead slow things down? Nope, the last test took 19 minutes just like the first one... all it did was give the customer something to look at.


No comments: