Wednesday, February 1, 2012

Coding Good Comments: Explanations

The code comes first: If there's a choice to be made between writing clear, understandable program code and writing comments to explain less-than-readable code, comments should take a back seat.

The reason is simple: It's the code that determines what the program does, not the comments. Comments aren't even compiled, let alone executed. So it shouldn't come as a surprise that maintenance programmers concentrate on the code, often skipping over the comments altogether.

Sadly, however, even readable, understandable program code tells only part of the story: It only answers the question, "How is this program accomplishing its tasks?" Very few executable statements directly address the question, "What is this program doing?", with the possible exception of a CALL statement where the procedure name has been carefully crafted to answer the "What?"

There are no exceptions, however, when it comes to answering the questions, "Why is the program doing this thing?" or the much harder "Why isn't the program doing this other thing?"

So, comments come second, but

Explanations Are Important

Explanation: A concise comment describing the "What" and "Why" of a section of program code while avoiding the "How" which should be apparent from reading the code itself.

Example 1

Here is a short section of simple code (basically a SELECT FROM DUMMY) with several explanations, some good, some not so good:
      -- Perform the basic heartbeat or canarian query.
      -- This is done to calculate the heartbeat time as well check the connection.
      -- Note: The target database can be stopped and restarted, and the monitor should just keep trying to connect.
      -- Note: When Foxhound itself is is stopped and restarted, reconnection to target databases *should* be automatic.

      SET @sql = STRING ( 'SELECT dummy_col INTO @dummy_col FROM ', @proxy_owner, '.proxy_DUMMY' );

      SET @canarian_query_started_at  = CURRENT TIMESTAMP;
      SET @canarian_query_finished_at = @canarian_query_started_at;


      ROLLBACK; -- release locks, recover from possible failed proxy connection

      SET @connected_ok = 'Y';
      SET @canarian_query_finished_at = CURRENT TIMESTAMP;

         ' 204-4 Monitor ', @sampling_id, ': First heartbeat OK' ) TO CONSOLE DEBUG ONLY;

      -- Note: Because the proxy_DUMMY table is the last proxy object created, a 
      -- successful query against proxy_DUMMY may taken as proof that a connection exists
      -- and is usable. This is important because a previous connection attempt
      -- may be dropped (timed out) after it created some but not all necessary proxy objects.
      -- Such a partly-established connection must be treated as no connection at all, and
      -- that is guaranteed by creating proxy_DUMMY last.

In the comment "This is done..." the phrase "as well as check the connection" provides extra information about the "What?" of this code, so that's OK as an explanation.

The phrase "calculate the heartbeat time" is a bit redundant, however... not wrong, but not really necessary given the preceding section title "Perform the basic heartbeat or canarian query."

The two one-line "Note:" comments are a bit other-worldly because they state true facts that have very little (nothing?) to do with nearby code. These score zero on the Comment Value Scale; not a positive score because they offer no real value, but not a negative score because at least they aren't wrong.

The long "Note: Because the proxy_DUMMY table is ..." paragraph scores very high, however, because it explains a very important "Why?" for this otherwise trivial-looking section of code.

Example 2

This example shows two separate explanations that were added to the code at two separate times.

The later paragraph "It is possible..." came first, chronologically speaking, to explain why the unusual step of calling GET_IDENTITY was performed instead of just letting the later INSERT assign the value... so far, so good, the original "Why?" is covered.
-- Calculate candidate primary key for successful sample.

-- Note: The following discussion is moot, but @candidate_sample_set_number is still
--       assigned because later code refers to it.

-- It is possible for this event to take some time before inserting a row
-- in rroad_sample_set. During that time, it is also possible for sampling to be 
-- stopped, and a "sampling stopped" row to be inserted. The row inserted by this
-- event should have a lower primary key than the "sampling stopped" event, so it
-- cannot wait until the actual INSERT rroad_sample_set statement to calculate the
-- value of the autoincrement column sample_set_number. The following statement 
-- pre-assigns a value which will only be used for the INSERT of a "successful" 
-- sample set; if that INSERT is not performed because of some problem, there will
-- be a gap in the primary key sequence.

SET @candidate_sample_set_number = GET_IDENTITY ( 'rroad_sample_set' );

The second explanation, the previous "Note: ...", was added later to explain that GET_IDENTITY was being called through laziness rather than real need. This explanation isn't quite as thorough, but it does serve to prevent the new maintenance programmer from searching the code to find the reason GET_IDENTITY was called.

In a perfect world, the call to GET_IDENTITY would have been removed as soon as it became unnecessary. In an imperfect world, it's easier to add the "Note:" when the call became moot... not just because it's less work to write the comment than change the code, but also because comments aren't compiled so the "Note:" doesn't have to be tested.

Example 3

The following example almost got left out because at first glance it looked like just another bad explanation: Who needs to be told that sa_locks needs DBA authority?
   -- Create proxy_sa_locks if necessary.
   -- This is only done after a full connection process.
   -- Note: DBA authority is needed to call sa_locks.

Well, with this application (Foxhound) whether the user id has DBA authority or not controls on how much detail the user's going to get, and this comment explains to the maintenance programmer that information about row locks is going to be missing from that detail if the user doesn't have DBA authority.

So, this "Note: DBA authority is needed to call sa_locks" looks like a "How?" but it's really a "What?", albeit on a grander scale: "What does the maintenance programmer need to know about this code?"

Example 4

Here is a breathtakingly-verbose "Note:" that explains a very simple-looking IF NOT EXISTS RETURN control structure:
-- Step 5 - Check to see if sampling has been turned off. 

IF  @ok = 'Y'  
AND @connected_ok = 'Y' THEN 

   -- Note: This code is an attempt to prevent a successful sample from appearing between 
   --       successive "Sampling stopped at" and "Sampling stopped" lines in the Monitor display.
   --       This process may already be executing the code above this point when sampling is stopped by
   --       a different process, and the @sample_started_at used by this process is calculated in the
   --       next section... which is after the the "Sampling stopped at" point calculated by the other
   --       process but before the eventual "Sampling stopped" point which will be calculated by some
   --       (probably much later) process.
   --       In the following example, the sample at Apr 22 5:12:21 PM should not have been recorded:
   --          Apr 22 5:13:35 PM   1m 14.2s    0s / .1s ...
   --          Apr 22 5:13:21 PM   1m 18.8s   -- Sampling stopped --  
   --          Apr 22 5:12:21 PM   0s    0s / 0s    1 ...
   --          Apr 22 5:12:02 PM   1m 24.2s   -- Sampling stopped at --  
   --          Apr 22 5:10:38 PM   1m 10.1s    .1s / .1s ...
   --       In the following example, the sample at 5:11:23 PM should not have been recorded:
   --          5:11:32 PM   8.5s    .1s / 0s ...
   --          5:11:27 PM    ALL CLEAR - Alert # 1: Foxhound has been unable to gather samples for 1m or longer. 
   --          5:11:25 PM   4m 54.7s   -- Sampling stopped --  
   --          5:11:23 PM   0s    0s / .1s ...
   --          5:06:30 PM   0s   -- Sampling stopped at --  
   --          5:06:30 PM   39.6s   -- Foxhound stopped --  
   --          5:05:50 PM   0s   -- Foxhound stopped at --  

                     FROM rroad_sampling_options
                    WHERE rroad_sampling_options.sampling_id                = @sampling_id
                      AND rroad_sampling_options.sampling_should_be_running = 'Y' ) THEN

      RETURN; -- do not record this sample

   END IF;

Here's the reason for the long explanation: The bug very hard to find, and the fix so simple that no future maintenance programmer, not even the origial author, would understand the "Why?" of the IF-NOT-EXISTS-RETURN code without some help.

In fact, without the explanation, it's not clear that the code is even correct let alone necessary. Comments don't just save time, they can prevent mistakes (like removing a piece of code because it looks wrong.)

In this case the comment itself wasn't that hard to write: the sample data was gathered as part of the diagnostic effort, and the explanation was worked out carefully and even written down as part of that same effort... it took very little extra effort to copy the explantion into the code as an aid to future developers.

The Bottom Line

Explanations belong inside the code, not locked in the fading memory of the original author or lost inside some other document no developer will ever read.

That is, if you think explanations are important.

Folks who don't think explanations are important generally don't think readability's important either. That's why most code without comments is also unreadable... not because there are no comments, but because the author cares more about abstract measurements of "elegance" or believes that somehow readability contradicts correctness.

1 comment:

Usemeplz said...

Great article, but for me as for beginner it is very difficult at the monent)))