Wednesday, August 31, 2011

The Story of a Bug

The 7 different "Save" buttons on the main Foxhound Options page didn't actually "save all the changes" like it now says in the Help for Foxhound 1.2, they just saved the values in each of the 7 local sections of the page.

But wait, it gets worse

The separate Monitor Options page has 6 different "Save" buttons, and they did in fact "save all the changes"... and the developer was the same.

Much worse

If this was a movie, the scenes above would be one of those annoying "flash forwards"; here's how the story starts:

Following the practice described in Docs Finds Bugs the developer made the following change to the Help to make the descriptions of all 13 Save buttons complete and consistent across both pages, from this

to this

but did not, in fact, actually check both options pages to make sure they agreed with the Help... just the one that did agree (of course!)

Much, much worse

How much worse can it get? Try this: The earlier version of the Help for the Foxhound Options page was explicit about how each Save button saved only the options local to each section of the page... it agreed with the code, and the change to the Help required a corresponding change to the code, which did ... not ... get ... done ... by the person who changed the Help, who was the same person responsible for the code.

Discovery! Redemption!

Later in the "Fix Bugs" step which followed the initial "Update the Help", the developer happened to notice the discrepancy between the way the Save buttons worked on the two pages.

Of course, if the developer hadn't changed the docs, they wouldn't have been changed at all; technical writers don't typically initiate spec changes.

So... was it actually a bug?

Welcome to Moot Point!

No, it wasn't a bug at first, it was an enhancement. It became a bug when the enhancement described in the Help wasn't implemented in the code.

Oops. Which is Management Speak for "moot".

Would this bug have been caught during Acceptance Testing? Maybe, maybe not... but Acceptance Testing is not supposed to be a "bug hunt". When a bug is discovered during Acceptance Testing the whole boring ... endless ... process ... pretty much has to start over again.

What about the specs? Was this bug a violation of the specifications? Sure... the specs said "make the Foxhound options pages work like Google Settings pages, only better"... there's nothing like a [cough] impossible challenging goal to spur innovation!
For what it's worth, separate Save buttons exist in Foxhound so the user
  • can find all the options in one place rather than dealing with labyrinthine dialogs and wizards

  • but doesn't have to scroll down three miles to get to the steenking Save button

  • or worse, doesn't forget to click Save, like what happens now with Google Settings since they removed the duplicate Save buttons.
But that's not the point. The point is that the enhancement got made and the bug got found as a direct result of the docs: Docs Finds Bugs.

Monday, August 29, 2011

Docs Finds Bugs

Writing the documentation for an application is a bug-hunting activity. Why? Because the act of describing how to use an application inevitably uncovers errors inconsistencies between desired and actual behaviors.

In other words, an endless series of ...


... moments, as in "WTF did it just do?"

That's why so much time is spent writing the docs for Foxhound:
  • first, the Help that is built into the program as well as being available on the web, and

  • then the Frequently Asked Questions which are only available on the web.
That's why, when a new release of Foxhound is feature-frozen (as Foxhound 1.2 is now), and the programming staff has declared their work 100% complete (because, presumably, they have finished unit testing, which is another way of saying everything compiles cleanly), the next step is not "Acceptance Testing".

Rather, the next step is ...

Update The Help

... which is inevitably followed by "Fix Bugs".

Then comes "Update The FAQ".

Followed by (you guessed it) "Fix Bugs" again.

Only after all the docs are up to date, and all known discrepancies between the docs and Foxhound's actual behavior have been eliminated, does the endless, boring task of acceptance testing begin.

Who should write the docs?

The developers, of course, because they know what the application actually does (the code) as opposed to what someone once asked for (the specs)... but in many shops developers are the least likely candidates for the task.

Stereotypical developers claim that writing documentation is beneath them, that it is a waste of their valuable time, when in fact they don't do it because they hate it, and they hate it because they suck at it.

And they suck at it because they don't know how to write anything longer than a shopping list or more coherent than a tweet.

So what do companies do?

Instead of teaching developers how to write (or hiring ones who can) and then requiring them to describe their own dog food, stereotypical companies turn the entire task over to professional writers who work apart from the developers.

These writers do a wonderful job of creating docs that are clear, well-written and riddled with errors descriptive of applications which simply ... do ... not ... exist.

What should companies do?

At the very least, professional developers should learn to write at a level that can be understood by the professional writers.

Then, the developers should be the ones to write the draft documentation for everything new, and to write draft updates for everything that has changed... with the emphasis on "everything".

Those tasks should be part of the developers' job descriptions, as should the acts of finding and fixing the bugs freshly-discovered differences between the desired application and the actual application.

The result is a better application and better docs.

What are unintended consequences?

The phrase "unintended consequence" usually denotes something bad, as in "slaughter might be an unintended consequence of unilateral disarmament."

In the case of "developers writing docs" one unintended consequence is a good thing: No more GUI monstrosities.

To explain how to use a GUI, developers actually have to use the GUI, and being the smart people they are, if they actually use the GUI chances are they'll see what's wrong with it and fix it.

Sadly, however, stereotypical companies don't even document their GUIs, let alone get the developers to do it. They just wave their arms and flap about...

"It's a wizard, what's to document?"

So we get GUI monstrosities, like [insert your favorite example here] and [insert another favorite here].

And that's the bad kind of unintended consequence, that's what happens when you don't get the developers to write the docs.

Friday, August 26, 2011

The fRiDaY File - Visit us on Twitter!

I'm not sure how you "visit" anyone on Twitter... but, then again, I'm an old guy, hard of learning, not sure what to make of it all:

Maybe, however, if I try harder...

...I can become a twit too:

Latest SQL Anywhere EBFs: 11.0.1 and 12.0.1 Windows, 11.0.1 HP-UX

The three asterisks "***" indicate what's new since August 12..

Only EBFs for the three fully-supported versions of SQL Anywhere are shown here: 10.0.1, 11.0.1 and 12.0.1.

Current builds...

HP-UX Itanium EBF ***             24 Aug 2011 ***
        EBF                 14 Oct 2008

HP-UX PA-RISC EBF                 16 Oct 2008

IBM AIX          12.0.1 GA Upgrade from 12.0.0   15 Mar 2011
        EBF                 26 Jul 2011

Linux Itanium EBF                 14 Oct 2008

Linux x86/x64 EBF                 01 Apr 2011
        EBF                 29 Jul 2011
        EBF                 17 Jun 2011

Mac OS  EBF                 26 May 2011
        EBF                 29 Jun 2010
        EBF                 01 Oct 2010

NetWare EBF                 10 Mar 2010

Solaris SPARC    12.0.1 GA Upgrade from 12.0.0   15 Mar 2011
        EBF                 07 Jul 2011
        EBF                 17 Jun 2011

Solaris x64 EBF                 05 Aug 2011
        EBF                 01 Apr 2009

Windows CE EBF                 09 Mar 2010

Windows Itanium  10.0.1 GA Upgrade from 10.0.0   26 Apr 2007

Windows x86/x64 EBF ***             24 Aug 2011 ***
        EBF ***             22 Aug 2011 ***
        EBF                 17 Jun 2011

Thursday, August 25, 2011

The Thursday Quote - Mordechai (Moti) Ben-Ari

"If Tiffany wants to become a professional singer, I would not try to dissuade her, but I would prefer that she pursue a CS degree part time while she tries to advance her singing career."
Non-Myths About Programming by Mordechai (Moti) Ben-Ari, from Communications of the ACM July 2011

Sometimes, commencement addresses and keynote speeches are worth hearing, and worth reading later on if you're not lucky enough be there live. It depends entirely on who's speaking, and Mordechai Ben-Ari is one of the best. The mere fact that his articles appear from time to time in the pages of the Communications of the ACM makes it worthwhile to become a member.

Anyway, I'll let Mordechai provide the setup for today's quote:

The decline of interest in studying computer science is usually attributed to a set of perceptions that students have about the subject. Many educators react to these perceptions as if they were myths and try to refute them. I believe the perceptions of students are roughly true when viewed in isolation, and that the proper way to address these non-myths is to look at them within the context of "real life."

To understand Tiffany and how she got involved in Mordechai's keynote, take a moment to read the short play presented before Mordechai began speaking: "Aunt Jennifer".

Tiffany appears throughout Mordechai's speech, with today's quote being one example; here's the wider context:

Non-Myth #7. Programming Is a Well-Paid Profession

That's great. Potential earnings shouldn't be the only consideration when choosing a profession, but it is not immoral to consider what sort of future you will be offering your family. It would be a good idea to remind Tiffany that the chasm between the life-styles of her mother and Aunt Jennifer is not the result of luck.

I recently read the controversial book Freakonomics by Steven D. Levitt and Stephen J. Dubner. The third chapter - "Why Do Drug Dealers Still Live with Their Moms?" - based upon the work of sociologist Sudhir Venkatesh is quite relevant to the issue of potential earnings. As a graduate student, Venkatesh was able to observe and document the lives of the members of a drug gang, and he eventually obtained their financial records. These were analyzed by Levitt, an economist, who came up with the following conclusion, expressed as a question: So if crack dealing is the most dangerous job in America, and if the salary was only $3.30 an hour, why on earth would anyone take such a job? The answer: Well, for the same reason that a pretty Wisconsin farm girl moves to Hollywood. For the same reason that a high-school quarter-back wakes up at 5 A.M. to lift weights. They all want to succeed in an extremely competitive field in which, if you reach the top, you are paid a fortune (to say nothing of the attendant glory and power). The result: The problem with crack dealing is the same as in every other glamour profession: a lot of people are competing for a very few prizes. Earning big money in the crack gang wasn't much more likely than the Wisconsin farm girl becoming a movie star or the high-school quarterback playing in the NFL.

Ambition to succeed in a glamour profession is not something to be deplored, but a young person must receive advice and support on what to do if she is not the 1 in 10,000 who succeeds. If Tiffany wants to become a professional singer, I would not try to dissuade her, but I would prefer that she pursue a CS degree part time while she tries to advance her singing career.

Next week: Kinshuman Kinshumann

Wednesday, August 24, 2011

Just another Q&A site: StackOverflow fades from view

Here's a short story about how long it takes for old folks to learn new tricks:

I was trying to launch http://localhost/ in the default browser from inside a Windows batch file. A Windows shortcut specifying only http://localhost/ worked, but coding the same thing inside a batch file:

resulted in this error message:
'http:' is not recognized as an internal or external command,
operable program or batch file.

Press any key to continue . . .
So I went to Google (one of those new tricks that old folks take a long time to learn)...
how do I launch a web page from a batch file?
Lots of hits, but nothing useful, and nothing at all from either or, so...

It was time for an epiphany: Aha! Perhaps I could create a shortcut and have the batch file launch that!

Time to refine the search (another one of those new tricks for old folks)...
how do I launch a shortcut from a batch file?
Bingo! Among the answers was a clue that the shortcut name wasn't *.lnk like it would be for a shortcut to a batch file, it was *.url, so this batch file worked:

That's nice, but it's not the point, is it?

No, that's not the point here. The point is that if StackOverflow and ServerFault were living up to their lofty goals, they would have appeared in the Google searches.

And they didn't. And they haven't appeared in Google searches for quite some time... and I haven't bothered to go directly to those websites for quite some time, either.

Why is that?

Out of almost 2 million questions on StackOverflow, nothing came close to answering this question
how do I launch a shortcut from a batch file?
but that's to be expected, that isn't a "real programming question" in the minds of the jerks who run StackOverflow the folks who frequent StackOverflow.

How about ServerFault? That should be a better place to ask this question... but alas, no. Perhaps it is because there are only 87,000 questions on that website, an astonishingly small number considering its broad scope and the fact that it's been around for over two years.

Why didn't you ask the question on ServerFault?

Because I already had my answer. I got it in a very short time using Google.

Why don't you post the answer on ServerFault?

Are you kidding me? Go ahead, try to post a question-with-answer on StackOverflow or ServerFault, you will be slagged as being a "reputation whore" who just wants points and badges. I made that mistake once, never again.

Back on topic...

In the words of Joel Spolsky, here is the biggest, most lofty goal ever expressed for the internet as a whole, let alone for a single website:
"Our core mission at Stack Overflow is: Make the Internet a better place to get expert answers to your questions."
Well, StackOverflow is still a good source for information, but so are any number of a zillion other Q&A websites out there. StackOverflow and ServerFault (and all of their StackExchange spawn?) are fading from view among the wonder of the interweb.

Memo to Joel: Google Search is achieving that goal, not you. In fact, you're being left behind... your websites aren't showing up in Google searches.

Oh, and good luck with that venture capital thing... the software's great, but StackExchange ain't gonna be The Next Big Thing.

You shoulda stuck to your original software-as-a-service business model... you never gave it a chance, you never went off beta with it.

Monday, August 22, 2011

The New MaxBPs=768: Set MAX_QUERY_TASKS = '1'

Once upon a time, Windows developers encountered strange runtime errors that were solved by this variation of "waving a dead chicken over the keyboard":

January 24, 1994

INFO WORLD Volume 16, Issue 4

Windows Manager/Brian Livingston


For years, Windows users have been mystified by strange error messages.
For no apparent reason, you may be confronted with a dialog box such
as, "Out of memory," "This application has violated system integrity and
will be closed," or "Application error." A quick check of the Help About
box in any Windows applet reveals that you have plenty of memory and Free
System Resources. Yet these nonsensical messages appear, after which
Windows behaves erratically and must be restarted.
I've finally found the cause of most of these messages. Best of all, you can
correct the problem by inserting a single command into the [386Enh] section
of your SYSTEM.INI file:


Now, for SQL Anywhere, there's a new magic command that can sometimes help when nothing else seems to work:
Let's say you have a query that takes way too long; for example,
Execution time: 645.493 seconds
instead of
Execution time: 0.422 seconds
Maybe the query uses too much (or too many) CPUs, too much disk I/O, too much RAM cache...

Or, maybe, your server goes completely unresponsive after executing the query for a long time.

Before investigating, before studying plans or applying EBFs or upgrading to a new version, try something simple...

Try turning off intra-query parallelism

If you know which SQL statement is causing the problem, put this SET command ahead of it to turn off intra-query parallelism for the current connection:
If it doesn't help, take it out.

If you don't know exactly where the problem is, try the nuclear option (turn off intra-query parallelism for all connections, for all time):
If that doesn't help, turn intra-query parallelism back on (the default):
Either way, it didn't hurt anything, maybe it helped, and it didn't take much of your time.

And now, something for the unbelievers...

If you don't believe that SQL Anywhere ever goes runaway, or worse, unresponsive, and even if it did, you don't believe in The Power Of The Dead Chicken to fix the problem, there is another reason to mess with MAX_QUERY_TASKS:

To prevent any single query from unfairly monopolizing too many CPUs, set MAX_QUERY_TASKS to some number smaller than the number of CPUs that SQL Anywhere is using... say 1/2, or 1/4, or maybe even just 2...
Here's the thinking behind the number '2': it lets a single query use twice as many CPUs as it would have if intra-query parallelism didn't exist, but it also lets other queries have a piece of the pie.

For the Boss, of course, there's always this:

Friday, August 19, 2011

The fRiDaY File - Yes, maybe there is something to this UltraLite thing

Perhaps, someday, my nightmares memories of programming for PDAs mobile devices will fade, and I will finally forget the disaster that was 1999...

From Idea to iPhone in Seven Days

Maybe there is something to this UltraLite thing after all...

From UK hospitals find cure for test data delivery via mobile BI by Brian McKenna, 8 Aug 2011:

Ecommnet developed the iPhone-based application in seven days, in time for a health care trade conference earlier this year.

“We take a complex understanding of a back-end database [in this case Oracle]. We can see what the core bits of data really are and translate that to something that will work on a mobile device, with a small screen,” he said. In this case one goal was “to present the information sensibly to clinicians. It had to work simply for the consultants, so we had to think like them.”

Campbell knew users wouldn't have constant connectivity as the data was moved between the PAS back-end Oracle database and the iPhones. That meant the system needed a reliable store and forward capability.

Sybase's SQL Anywhere’s provided a rapid store and forward database and data synchronization platform.

"By using SQL Anywhere, we were able to delegate the issue of data transmission between the mobile device and the back-end database to Sybase technology, instead of having to spend a lot of time that we really didn’t have to get that communication working," Campbell said, in a Sybase case study article.


Thursday, August 18, 2011

The Thursday Quote - HAgoogLe

"It can only be attributable to human error."
HAL as quoted here from 2001: A Space Odyssey, 1968.
"Safety is our top priority. One of our goals is to prevent fender-benders like this one, which occurred while a person was manually driving the car."
from a statement by Google, as quoted by Sajid Farooq and Jason Middleton in Google Blames Human for Robot Car Crash Friday, Aug 5, 2011.

HAL was lying, but Google is telling the truth... right?

Here is an excerpt from the wider context:

Google Blames Human for Robot Car Crash

UPDATE: A woman claiming to have witnessed the accident involving one of Google's robot-controlled cars says that five cars were involved, not two, as Google asserts in a statement.

There was "a huge screeching noise," according to Tiffany Winkelman, and Google's Prius struck another Prius, which then struck her Honda Accord that her brother was driving. That Accord then struck another Honda Accord, and the second Accord hit a separate, non-Google-owned Prius.

Google's original statement reads: "Safety is our top priority. One of our goals is to prevent fender-benders like this one, which occurred while a person was manually driving the car."
. . .
At Google, don't blame the technology when something goes wrong -- blame the person behind the technology.

Robot cars? Really?

Yes, really, there are robot cars running around in California... and, apparently, slamming into other cars. Nothing new there, human drivers have been slamming into each other for a hundred years.

"For the past year, Bay Area residents have noticed a fleet of seven curious-looking Toyota Priuses outfitted with an array of sensors, sometimes spotted driving the highways and city streets of San Francisco, occasionally even swerving their way down the notoriously serpentine Lombard Street."
Automotive Autonomy - Self-driving cars are inching closer to the assembly line, thanks to promising new projects from Google and the European Union by Alex Wright, from Communications of the ACM July 2011

Yes, it has taken a long time for robots to catch up to humans, but progress is being made.

Consider tailgating: once the exclusive domain of impatient humans on their way to ... somewhere, now robot "highway trains" are going to make tailgating a way to improve fuel efficiency and (one supposes) go green, fight global warming and save the polar bear all at the same time.

Don't believe me, read Alex Wright's article; here are some snippets (the emphasis is mine):

The team has been trying to understand the psychological impact of autonomous driving on the human occupants formerly known as drivers.
. . .
The European Union-sponsored SARTRE project is developing technologies to allow cars to join organized platoons, with a lead car operated by a human driver.
. . .
If the Google project ultimately comes to fruition, it may do more than just improve the lives of individual car owners; it could also open up new possibilities for car sharing and advanced "highway trains" in which cars follow each other on long-distance trips, improving fuel efficiency and reducing the cognitive burden on individual drivers.
. . .
By grouping cars into platoons, the SARTRE team projects a 20% increase in collective fuel efficiency for each platoon.
. . .
At press time, Google had hired a lobbyist to promote two robotic car-related bills to the Nevada legislature. One bill, an amendment to an existing electric vehicle law, would permit the licensing and testing of self-driving cars. The second is an exemption to allow texting during driving.
. . .
The team is also considering the potential impact on other drivers outside the platoon, since the presence of a long train of vehicles will inevitably affect other traffic on the freeway. For example, if the platoon is traveling in the slow lane on a multilane freeway, it will inevitably have to react to occasional interlopers.

What could possibly go wrong?

Next week: The Thursday Quote - Mordechai (Moti) Ben-Ari

Wednesday, August 17, 2011

Top 10 Colossal Cloud Outages

It's been a while since anything's been marked "Database Deadpool" in this blog, and I thought it might be time to retire the tag... until stumbling on this slideshow:

The 10 worst cloud outages (and what we can learn from them)
It raises this question, "If these guys can have this kind of problem, what about me?"

What Guys?

These guys...
  1. Amazon (AWS)

  2. Microsoft (Sidekick)

  3. Google (Gmail)

  4. Microsoft (Hotmail)

  5. Intuit

  6. Microsoft (BPOS)


  8. Terremark

  9. PayPal

  10. Rackspace about them here.

When was the last time YOU tested recovery?

Monday, August 15, 2011

Never throw out those old setup CDs!

At least, never throw them out if you might have to answer questions about those old versions.

Take SQL Anywhere Version 9, for example. Nobody in their right mind is using the 9.0.0 GA release any more, they have all upgraded to 9.0.2. Even if they haven't you can still deal with questions involving 9.0.0 if you have access to 9.0.2, right?

Well, not if you do a fresh install of 9.0.2 on Windows 7 and you want to look at the demo database, you can't: it isn't there.

Which, in the grand scheme of things, isn't a big deal... who cares about the demo database?

That's Not The Point

The demo database is not the point, the point is that the old setup for 9.0.0 is required for full access to 9.0.2 features in this one obscure instance, and there might be others.

There might be other instances where the version X.0.0 or X.0.1 setup is necessary to get past a problem with X.0.2... so, until you have moved everything to X+1, don't dump the X CDs.

Case in point...

Question: How do I do a fresh install of SQL Anywhere 9 on Windows 7 and still have the demo database?

Answer: Step by (boring) step...
  1. Install SQL Anywhere 9.0.0.

  2. Make a backup copy of this file:
    C:\Program Files (x86)\Sybase\SQL Anywhere 9\asademo.db

  3. Uninstall "SQL Anywhere Studio 9, Software" via Control Panel - Programs and Features.

  4. Install SQL Anywhere 9.0.2.

  5. Apply the EBF for to get the most recent (and last) build of 9.0.2.
    The EBF setup will say it's going to move asademo.db to a new location:

    but it's lying because the file isn't there... it wasn't there after the 9.0.2 install (and it wasn't anywhere else, either):

    so if you try to launch All Programs - SQL Anywhere 9 - Adaptive Server Anywhere - Personal Server Sample you get this surprise:

  6. Copy the backup copy of asademo.db to this folder:
    C:\Users\Public\Documents\SQL Anywhere 9\Samples

  7. Now the standard "SQL Anywhere 9" shortcuts and ODBC DSNs work because the file is there.

Maybe there's another way past this glitch, one that doesn't require 9.0.0... that's not the point either, the point is that having the 9.0.0 setup made this path possible. One alternative path is to upgrade from 9.0.0 to 9.0.2... but that still requires the 9.0.0 setup and besides, it doesn't work on Windows 7, go ahead and try it (I did, and there's a half-hour I'm never going to get back :)

Yes, I have all the old CDs for SQL Anywhere back to 5.5...

...but I do care about all viable versions of SQL Anywhere.

Friday, August 12, 2011

The fRiDaY File - Persian Is Perfect For Documenting Dot Net

I was just surfing one of my favorite Persian .NET blogs (what? you say that's surprising on several levels?)

when I stumbled on something quite interesting: a link with the label "Sybase ADDICT: Using NHibernate 3.2.0 GA with SQL Anywhere 12.0.1".

That's Amazing!

Another website devoted to SQL Anywhere!

No, wait a minute, something evil strange interesting is going on here.

Just who is this Sybase Expert, and why does his/her "Sybase Sql Anywhere" category look a lot like Glenn Paulley's blog, except for the fact that all of Glenn's articles (and Chris Kleisath's, and... ) are now marked "by SybaseExpert"?

Oh, relax!

Glenn (and Chris, and... ) should be proud that someone thinks so highly of their work that they reformat and republish it as their own.

I'm jealous: Sybase Expert hasn't picked any of my articles to steal republish.

I am not worthy!

Latest SQL Anywhere EBFs: A Busy Month!

The three asterisks "***" indicate what's new since July 18... six EBFs in three weeks must be some kind of a record!

Only EBFs for the three fully-supported versions of SQL Anywhere are shown here: 12.0.1, 11.0.1 and 10.0.1.

To find out what you're running just do SELECT @@VERSION

Current builds...

HP-UX Itanium EBF ***             08 Aug 2011 ***
        EBF                 14 Oct 2008

HP-UX PA-RISC EBF                 16 Oct 2008

IBM AIX          12.0.1 GA Upgrade from 12.0.0   15 Mar 2011
        EBF ***             26 Jul 2011 ***

Linux Itanium EBF                 14 Oct 2008

Linux x86/x64 EBF                 01 Apr 2011
        EBF ***             29 Jul 2011 ***
        EBF                 17 Jun 2011

Mac OS  EBF                 26 May 2011
        EBF                 29 Jun 2010
        EBF                 01 Oct 2010

NetWare EBF                 10 Mar 2010

Solaris SPARC    12.0.1 GA Upgrade from 12.0.0   15 Mar 2011
        EBF                 07 Jul 2011
        EBF                 17 Jun 2011

Solaris x64 EBF ***             05 Aug 2011 ***
        EBF                 01 Apr 2009

Windows CE EBF                 09 Mar 2010

Windows Itanium  10.0.1 GA Upgrade from 10.0.0   26 Apr 2007

Windows x86/x64 EBF ***             29 Jul 2011 ***
        EBF ***             08 Aug 2011 ***
        EBF                 17 Jun 2011

Thursday, August 11, 2011

The Thursday Quote - Tim Wu

"This is also why, for example, AT&T never invented the Internet, even though it clearly had the chance."
Bell Labs and Centralized Innovation by Tim Wu, from Communications of the ACM May 2011

Do you remember the first automatic telephone answering machines that used magnetic recording tape? They hit the market when? Sometime in the 1980's, right?

What if I told you that AT&T had an operational prototype up and running and being used by an employee back in 1935?

You did not know that, did you? Here's why...
"AT&T firmly believed magnetic tape and the telephone were fundamentally incompatible technologies. The widespread usage of magnetic recording technology, AT&T believed, would lead Americans to abandon the telephone."
Here's the "AT&T never invented the Internet" quote in a wider context:

Centralized systems of innovation are excellent for certain types of research. Yet they also have, as it were, one fatal flaw, one that we can see clearly in the story of AT&T and its Bell Labs. Yes, Bell Labs was great. But at the same time, Bell Labs was never a place that could originate technologies that could, by the remotest possibility, threaten the Bell system itself. The truly disruptive technologies—those that might even cast a shadow of uncertainty over the business model—were out of the question.

This is also why, for example, AT&T never invented the Internet, even though it clearly had the chance. In the 1960s, men like scientist Paul Baran spent years trying to convince AT&T that packet-switching technologies were a step forward that would improve the network, but AT&T regarded the idea as preposterous. "Their attitude," Baran said in a later interview "was that they knew everything and nobody outside the Bell System knew anything. So here some idiot comes along and talks about something being very simple, who obviously doesn't know how the system works."

Next week: Hagoogle

Tuesday, August 9, 2011

Scrolling In A Partition: Is There A Better Way?

Question: How do I determine the primary key that is 100 rows newer than the current row, for a subset of rows?

The following solution seems to work AND it's a fairly snappy performer in SQL Anywhere; the question is, is there a better solution?

IF VAREXISTS ( '@partition_id' ) = 0 THEN

IF VAREXISTS ( '@current_primary_key' ) = 0 THEN

SET @partition_id        = 2;

SET @current_primary_key = 290505;

SELECT newer_t.primary_key AS destination_primary_key
                ROW_NUMBER() OVER ( ORDER BY T.primary_key ASC ) AS scrolling_row_number,
           FROM t
          WHERE t.partition_id = @partition_id
            AND t.primary_key  > @current_primary_key
          ORDER BY t.primary_key ASC 
       ) AS newer_t
 WHERE newer_t.scrolling_row_number = 100
 ORDER BY newer_t.primary_key DESC;
I'm sure all you front-row students out there already have a suggestion or two, but the rest of us need some more information...

Here's the table:
CREATE TABLE DBA.t ( -- 487,445 rows, 54.5M total = 37.4M table + 0 ext + 17M index, 118 bytes per row
   partition_id    /*         X */ UNSIGNED INT NOT NULL,
   primary_key     /* PK        */ UNSIGNED BIGINT NOT NULL DEFAULT autoincrement,
   other_data_1                    VARCHAR ( 1 ) NOT NULL DEFAULT 'N',
   other_data_2                    VARCHAR ( 1 ) NOT NULL DEFAULT 'Y',
   other_data_3                    VARCHAR ( 32767 ) NOT NULL DEFAULT '',
   other_data_4                    BIGINT NOT NULL,
   other_data_5                    BIGINT NOT NULL,
   other_data_6    /*         X */ TIMESTAMP NOT NULL DEFAULT '1900-01-01',
   other_data_7                    TIMESTAMP NOT NULL DEFAULT '1900-01-01',
   other_data_8                    TIMESTAMP NOT NULL DEFAULT '1900-01-01',
   other_data_9                    TIMESTAMP NOT NULL DEFAULT '1900-01-01',
   other_data_10                   BIGINT NOT NULL DEFAULT 0,
      primary_key )
CREATE CLUSTERED INDEX ix_other_data_6 ON DBA.t ( -- 12.7M
   other_data_6 );

CREATE INDEX ix_partition_id ON DBA.t ( -- 3.5M
   partition_id );
Here's the question again:

Question: How do I determine the primary key that is 100 rows newer than the current row, for a subset of rows?

Here are some definitions:
  • Current row: where the autoincrementing primary_key = x

  • Desired result: y = the destination primary key, or NULL if there are fewer than 100 newer rows

  • Newer: where the primary_key is > x

  • Subset: where the partition_id = z

Here are some facts and figures:
  • The test database has 487,445 rows in this table; in the real world millions of rows are common.

  • The test database has only 4 distinct values in partition_id, with two values alternating throughout almost all of the rows: 1, 2, 1, 2, ... for primary keys n, n + 1, n + 2, n + 3, ...

  • The partition_id column doesn't work in the sense of horizontal partitioning or sharding. In some real-world examples of this table all the rows have the same partition_id, in other cases there are many values of partition_id but only a few account for most of the rows, and in this case (as noted above) the rows are evenly divided between two partition_id values.

  • The primary_key may be a perfect candidate for a CLUSTERED index, but one of the other columns is used in expensive range queries (not discussed here) so it got the CLUSTERED index. Sadly, there is apparently a world-wide shortage of the "CLUSTERED" keyword so it has been decreed that only one index per table can be CLUSTERED...

  • ... but perhaps that doesn't matter if I read Anil Goel's comment on this post correctly: Multiple Clustered Indexes
Here's the proposed solution again, followed by a discussion of how it works:
SELECT newer_t.primary_key AS destination_primary_key
                ROW_NUMBER() OVER ( ORDER BY T.primary_key ASC ) AS scrolling_row_number,
           FROM t
          WHERE t.partition_id = @partition_id
            AND t.primary_key  > @current_primary_key
          ORDER BY t.primary_key ASC 
       ) AS newer_t
 WHERE newer_t.scrolling_row_number = 100
 ORDER BY newer_t.primary_key DESC;
  • The ROW_NUMBER() OVER clause on line 3 assigns a row number 1, 2, 3, ... for the rows in the inner SELECT.

  • It is important (I think, at least in this case) for the OVER ORDER BY on line 3 to be the same as the SELECT ORDER BY on line 8. If they're different (in this case) funky things happen with regards to returning the right answer (that part, I know for sure).

  • The predicate t.partition_id = @partition_id on line 6 limits the candidate rows for the inner SELECT to exactly one partition.

  • The predicate t.primary_key > @current_primary_key on line 7 limits the candidate rows to "newer" rows.

  • The SELECT TOP 100 clause on line 2 limits the inner result set to the first 100 "newer" rows.

  • The WHERE clause on line 10 throws away the first 99 "newer" rows and takes row number 100.

  • The outer ORDER BY on line 11 is an "oops"... clearly the outer SELECT don't need no steenking ORDER BY. The query engine doesn't eliminate the clause, but at the same time (judging by the plans) it doesn't spend any time executing it either.
Speaking of oops-es, the CREATE INDEX ix_partition_id ON DBA.t ( partition_id ) shown earlier is probably useless. The predicate partition_id = any_value is going to have a worst case of 100% true, 50% true for this test database, and somewhere in the range of 1% to 10% true at the best of times. That's not exactly what the query optimizer looks for in a great nonclustered index, it's more interested in 0.01% true, otherwise it leans towards a sequential scan...

...or am I completely out to lunch? That's why I'm asking this question!

Three graphical plans are available, for
  • a run with a cold cache,

  • followed by an immediate repeat of exactly the same query, and

  • then a test of the "next step forward": the destination_primary_key returned by the previous query was fed into the @current_primary_key for the step forward.
The cold cache plan is slightly different, and the overall run times slightly longer, than the subsequent two queries run using a warm cache:
               Estimated   Actual
                RunTime    Runtime
                =======    =======
Cold cache      10.418     1.7735  
Repeat           1.7516    0.55487
Step forward     1.4718    0.39121
You can look at the plans in dbisql; here's where to get them:


Here's a quick look at table t for the first two plans (cold cache and immediate repeat); you can see that when the cache warmed up the table scan was replaced by an index:

So, besides getting rid of the oops-es, what else can I do or try?

Monday, August 8, 2011

Conference Rates Expire This Week

Two reminders:

  • The cutoff date to book a room for $169/night plus tax at the Venetian is Wednesday, August 10, or until the Sybase TechWave block is full, whichever comes earlier.

  • The cost to attend Sybase TechWave 2011 is $1195 until August 11 and $1395 thereafter; register here.
...and one piece of news:

Friday, August 5, 2011

The fRiDaY File - Why The 90-9-1 Rule Is Optimistic

Characteristic Errors, Revisited

UPDATE: See the latest version of this article here.

When this article was first published last December it had 16 items, now it has 22...

A characteristic error is an error that is so easy to make that it appears you are being actively encouraged to make it by the very nature of the computer program you are using.

For example, sending an email without the attachment is a characteristic error of all email programs.
...except Gmail. Gmail warns you about missing attachments... Gmail is magic!
Here are some errors that are characteristic of SQL in general, and SQL Anywhere in particular.
  1. SQL: Seeing too little data, or no data at all, because a predicate in the WHERE clause effectively turned your OUTER JOIN into an INNER JOIN.

  2. SQL: Seeing too much data because a missing predicate effectively turned your INNER JOIN into a CROSS JOIN.

  3. SQL: Getting the wrong COUNT() or SUM() because you forgot to code WHERE ... IS NOT NULL, or you *did* code it when you shouldn't have.

  4. SQL: Getting the wrong answer because you forgot that, in general, NULL values [cough] suck.

  5. SQL Anywhere: Not seeing MESSAGE output because you forgot to run SET TEMPORARY OPTION DEBUG_MESSAGES = 'ON';

  6. SQL Anywhere: Not seeing any data because you forgot ON COMMIT PRESERVE ROWS or NOT TRANSACTIONAL.

  7. SQL Anywhere: Coding ENDIF where END IF was required, or vice versa (before Version 11).

  8. SQL Anywhere: Connecting to the wrong server because you forgot DOBROAD=NONE (before Version 12).

  9. SQL Anywhere: Forgetting the asterisk in SELECT TOP 10 FROM ...

  10. SQL Anywhere: Coding IF NOT VAREXISTS ( 'x' ) THEN ... instead of IF VAREXISTS ( 'x' ) = 0 THEN ...

  11. SQL Anywhere: Coding the wrong magic numbers 1, 2, 3, ... in the get_value() and set_value() calls in an EXTERNAL C DLL function.

  12. SQL Anywhere: Getting proxy table ODBC errors because the engine's running as a service and you've set up a User DSN instead of System DSN.

  13. SQL Anywhere: Getting file-related errors because the file specifications are relative to the server rather than the client.

  14. SQL Anywhere: Getting file-related errors because the engine's running as a service without the necessary permissions.

  15. SQL Anywhere: Coding CREATE TRIGGER IF NOT EXISTS instead of CREATE OR REPLACE TRIGGER, or vice versa for CREATE TABLE (in 11.0.1 or later).

  16. SQL Anywhere: Getting integer arithmetic when you wanted fractional parts because you forgot to CAST.

    New entries...

  17. Stored procedure debugger: Setting it to watch a specific user id other than the one you're using to test your code.

  18. Sybase Central: Setting it to display objects for owner names other than the one you're interested in.

  19. Copy and paste: Forgetting to edit after pasting; e.g., Copy and paste SET @continue = 'Y' into the body of a WHILE loop and then forgetting to change it to 'N'.

  20. MobiLink: Forgetting to call ml_add_column for any of the columns you're trying to synchronize, thus guaranteeing yourself a "Sassen Frassen Fricken Fracken!" moment when you run the first test.

  21. MobiLink: Forgetting to call ml_add_[various] with the NULL parameter to delete old ml_[whatever] rows, thus ending up with thousands of orphan system table rows in the consolidated database.

  22. OLAP Windowing: Coding the wrong combination of ASC and DESC in an inner OVER ORDER BY clause and the outer SELECT ORDER BY: different when they should be the same, the same when they should be different, or some other variation of "wrong combination"...
                SELECT older_sample_set.sample_set_number
                  INTO @20_older_sample_set_number
                  FROM ( SELECT TOP 20
                                ROW_NUMBER() OVER ( ORDER BY rroad_sample_set.sample_set_number ASC ) AS scrolling_row_number,
                                rroad_sample_set.sample_set_number                                    AS sample_set_number
                           FROM rroad_sample_set
                          WHERE rroad_sample_set.sampling_id       = @sampling_id
                            AND rroad_sample_set.sample_set_number < @sample_set_number
                          ORDER BY rroad_sample_set.sample_set_number DESC ) AS older_sample_set
                 WHERE older_sample_set.scrolling_row_number = 20;

Thursday, August 4, 2011

The Thursday Quote - Jakob Nielsen

"In most online communities, 90% of users are lurkers who never contribute, 9% of users contribute a little, and 1% of users account for almost all the action."
Participation Inequality: Encouraging More Users to Contribute by Jakob Nielsen, from Jakob Nielsen's Alertbox October 9, 2006

That's the "executive summary" for Jakob Nielsen's fascinating article about the 90-9-1 rule; here's another snippet:

User participation often more or less follows a 90-9-1 rule:
  • 90% of users are lurkers (i.e., read or observe, but don't contribute).

  • 9% of users contribute from time to time, but other priorities dominate their time.

  • 1% of users participate a lot and account for most contributions: it can seem as if they don't have lives because they often post just minutes after whatever event they're commenting on occurs.

Does that sound like any online community you are familiar with?

Yes, it does... but maybe not this one:

Next week: Tim Wu

Wednesday, August 3, 2011


Folks familiar with how fast UNLOAD and LOAD TABLE run will not be surprised that they're often faster than INSERT SELECT via proxy tables.

An earlier article showed how to take the next step: Use UNLOAD COMPRESSED to make the intermediate file smaller and the process faster, and a followup article talked about UNLOAD COMPRESSED in action.

Then, John Smirnios suggested a third step: use a named pipe instead of a physical file to make the process even faster.

Does A Named Pipe Make A Difference?

It sure does; UNLOAD and LOAD using a named pipe beats UNLOAD and LOAD COMPRESSED using a physical file by a significant margin... 39% for the comparison shown in Figure 1.

Figure 1: UNLOAD COMPRESSED Versus Named Pipe

And no, you don't have break out Visual Studio and sharpen up your multiprogramming C skills... you can do it in SQL. There is a simple demo in the previous article, and this article presents a similar technique, in action, in the real world, in Listing 1.

Listing 1: UNLOAD and LOAD TABLE Via Named Pipe


USING 'DRIVER=SQL Anywhere 12;ENG=old_foxhound1;DBN=old_f';


SET @sql = 'UNLOAD TABLE DBA.rroad_group_1_property_pivot TO ''\\\\.\\pipe\\data''';

SET @sql = REPLACE ( @sql, '''', '''''' );
SET @sql = REPLACE ( @sql, '\\', '\\\\' );
SET @sql = REPLACE ( @sql, '\\', '\\\\' );

SET @sql = 'FORWARD TO old_foxhound1 ''' + @sql + '''';

TRIGGER EVENT rroad_execute_immediate ( @sql = @sql );

LOAD TABLE rroad_group_1_property_pivot FROM '\\\\.\\pipe\\data';

You can read about the mechanics of the FORWARD TO and TRIGGER EVENT statements in the previous arcticle about using named pipes, and in the Help: TRIGGER EVENT and FORWARD TO.
Tip: Don't be discouraged by this in the Help: "The FORWARD TO statement is a server directive and cannot be used in stored procedures, triggers, events, or batches"... FORWARD TO works just fine inside a stored procedure if you wrap it in EXECUTE IMMEDIATE.
The TRIGGER EVENT on line 18 was written with "@sql = @sql" on purpose, not to be confusing (yes, it is, I know) but to show how the name of the special pseudo-parameter (the first @sql) has nothing to do with name of the actual argument (the second @sql). They are different variables, with different scopes, even though they look the same. Magic is sometimes (always?) confusing.

Listing 2 shows the event itself, a general-purpose module that runs EXECUTE IMMEDIATE on a separate connection.

Listing 2: Run EXECUTE IMMEDIATE Asynchronously

CREATE EVENT rroad_execute_immediate


SET @sql = COALESCE ( EVENT_PARAMETER ( '@sql' ), '' );

IF @sql <> '' THEN


Now there's a THIRD "@sql", a local variable that receives its value from the magic pseudo-parameter @sql via EVENT_PARAMETER(). OK, I promise... this is the first and last time I'll use the same name for three different things :)

Question: Why is FORWARD TO necessary?
Answer: The whole process is under the control of code running in the target database (Listing 1) but the UNLOAD statement must run on the source database. The FORWARD TO statement sends it over there.

Question: Why is EXECUTE IMMEDIATE necessary?
Answer: Because it is... necessary, that is. For some unexplained reason, you can't run a naked FORWARD TO inside a stored procedure, you have to put some pants on it; i.e., wrap it in an EXECUTE IMMEDIATE.

Question: Why is an EVENT necessary?
Answer: Because when you run the EXECUTE IMMEDIATE it will sit there and wait until the FORWARD TO is finished... and the FORWARD TO will sit there and wait until the UNLOAD statement is finished, and the UNLOAD statement will run until the last byte is written to the named pipe. The EVENT runs on a separate connection, so all this waiting is done on that connection, leaving the original connection (the one running the code in Listing 1) to get on with life... to get on with running the LOAD TABLE on line 20. In other words, the code in Listing 1 can start reading data from the named pipe before the UNLOAD statement has finished writing it; Figure 2 shows one CPU busy with the UNLOAD and another one busy with the LOAD TABLE.

Note: The EVENT, the FORWARD TO and the EXECUTE IMMEDIATE all run on the target database, as does the LOAD TABLE; only the UNLOAD statement runs on the source database.

Figure 2: UNLOAD and LOAD TABLE Via Named Pipe In Action

What's Wrong With This Picture?

Figure 1 isn't exactly comparing apples-to-apples: the UNLOAD COMPRESSED test was run using two separate SQL Anywhere 12 servers on one computer, one to do the UNLOAD and the other to do the LOAD, while only one server was used to run both databases in the named pipe test. Part of the 39% savings might have come from switching to a single server, OR the savings might have been larger if the UNLOAD COMPRESSED test had also been run on one server, OR it might have made no difference... only further experimentation will determine that.

It's not like SQL Anywhere was starved for CPU or RAM (Intel Core i7 with 8G) but it WAS starved for disk resources. Figure 2 shows that the Disk Queue Length regularly hit 5: that's terrible; 0.5 is a better number, 0.05 even better. Five means a whole bunch of work is queued up for disk resources, something which happens regularly when you're unloading data from one database disk file and loading it into another at the same time, using the same physical disk drive... like in Figure 2.

So yeah, if I was serious about performance, I'd be using two drives, wouldn't I?

[fretful sighing sound]

...performance and tuning never ends :)

And The Answer Is... Door Number Two!

It turns out the correct answer is "... OR the savings might have been larger if the UNLOAD COMPRESSED test had also been run on one server". Figure 3 shows the times from Figure 1 as Test A and Test C, with Test B in the middle showing the times for UNLOAD COMPRESSED where both databases were running on the same SQL Anywhere server.

Figure 3: UNLOAD COMPRESSED on 1 and 2 Servers Versus Named Pipe

So yes, the 39% figure is correct, and yes, UNLOAD and LOAD using a named pipe does beat UNLOAD COMPRESSED.

Question: Why is only one server used for the named pipe test?
Answer: Because it has to be that way... the \\.\pipe\data thing just does not work if the UNLOAD is running on one server and the LOAD TABLE on another, even if both servers are running on the same computer. Perhaps by the time you read this, there will be an answer posted to this question on the SQL Anywhere Forum.

Who knows, maybe it will be faster using two servers, the UNLOAD COMPRESSED test certainly was.

Monday, August 1, 2011

Congratulations, Peter Bumbulis!

United States Patent 7,962,693
Bumbulis June 14, 2011
Cache management system providing improved page latching methodology

A cache management system providing improved page latching methodology. A method providing access to data in a multi-threaded computing system comprises: providing a cache containing data pages and a mapping to pages in memory of the multi-threaded computing system; associating a latch with each page in cache to regulate access, the latch allowing multiple threads to share access to the page for reads and a single thread to obtain exclusive access to the page for writes; in response to a request from a first thread to read a particular page, determining whether the particular page is in cache without acquiring any synchronization object regulating access and without blocking access by other threads; if the particular page is in cache, reading the particular page unless another thread has exclusively latched the particular page; and otherwise, if the particular page is not in cache, bringing the page into cache.

Inventors: Bumbulis; Peter (Cambridge, CA)
Assignee: Ianywhere Solutions, Inc. (Dublin, CA)
Appl. No.: 12/122,688
Filed: May 17, 2008

Patents are a BIG deal!

If patents weren't important, would Scott Adams have mentioned them 32 times?

I rest my case!