There have been several articles in this blog on the subject of Intra-Query Parallelism (using more than one CPU to handle a single request simultaneously) in SQL Anywhere... there has even been a fRiDaY File on the subject.
This article is about a real-world client experience with
- unusually high CPU peaks accompanied by reports of
- "very, very slow" application performance; e.g., 2 minutes to login instead of fractions of a second
The server-level performance statistics displayed by Foxhound did not reveal any clues, nor did a cursory examination of the connection-level statistics... no obvious runaway connections, no apparent "Queries From Hell".
The "Temporary connections are named" enhancement in SQL Anywhere 12.0.0 introduced CONNECTION_PROPERTY ( 'Name' ) values like 'INT:Exchange' to help identify what each internal connection is doing, plus the 'ParentConnection' property to help identify the connection that started each internal connection. However, those properties weren't available in the SQL Anywere 10.0.1.3831 database being studied, so Foxhound could only show these connections as nameless standalone entries in the connection section at the bottom:
- The top line at "1:29:21 PM" shows a CPU Time peak of 50.8%, and the bottom section shows four connections that were active when the top sample was recorded.
- The "Conns 57" number in the top section disagrees with the "61 connections" title in the bottom section because the database-level ConnCount property doesn't include temporary connections.
- The bottom section was sorted by "Conn #" in decreasing order to force the large-numbered internal connections to the top.
- The "CPU %" values in the bottom section are wildly exaggerated (they add up to more than 100%) even though Foxhound has adjusted the numbers to account for the fact 4 CPUs were in use. The next version of Foxhound will do a better job at showing believable numbers (the problem lies with SQL Anywhere) . . . but it doesn't matter here . . . what matters is these four connections were sucking all the air out of the room.
- Experience showed that intra-query parallelism was the likely reason these four connections exist, which meant that somewhere among the other connections there was a parent responsible for all the activity... the fact that the number of internal connections always exactly matched the number of CPUs was a helpful clue.
- Not shown in the image above is the fact that finding the parent connection among 57 others was virtually impossible because the "CPU %" column offered no clue; the peaks only showed up for the internal (child) connections, not the parent that was
living off child labormanaging the work effort.
- Also not shown in the image above is the fact that no "Last Statement" values showed up in the bottom section because "RememberLastStatement" was not turned on, so even if the parent connection could have been determined there was no way to tell what it was doing.
would make make it all better.
SET OPTION PUBLIC.MAX_QUERY_TASKS = '1';
And verily forsooth, such was the case!
That was the good news (for the client)... the bad news (for the consultant) was that no more work was required. Perhaps further study would have revealed a Quey From Hell or two, and even more billable hours that could be spent on query tuning, but that would have violated The First Rule Of Program Optimization: Don't do it.
Or in other words, if it's running OK, leave it alone.
Or, if it ain't broke, don't fix it.
Not something you would ever hear from Marty Kaan :)