Monday, January 21, 2019

Replace SERVER CLASS 'DIRECTORY' with sp_list_directory()

Question: How do I fix these unexplained intermittent errors when using CREATE SERVER CLASS 'DIRECTORY' to get a list of the file names in a folder?

  • From a SELECT statement even though the CREATE EXISTING TABLE works OK:
    SQLCODE = -141, SQLSTATE = 42W33, ERRORMSG() = Table '...' not found
    

  • From a CREATE EXISTING TABLE statement even though the CREATE EXTERNLOGIN is correct:
    SQLCODE = -656, SQLSTATE = HY000, ERRORMSG() = Unable to connect to server '...': DirectoryAccess servers require an externlogin
    
Answer: Switch to using sp_list_directory(); here's how:
  • Set the max_depth parameter to 1 in the sp_list_directory() call if you are using SUBDIRS=0 in the CREATE SERVER statement.

  • Use SELECT TOP 1 and sa_split_list() to get the file name out of the file_path column returned by sp_list_directory().

  • Use LATERAL to join the sp_list_directory() and sa_split_list() calls.
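
Here is a minimal sketch of the pattern on its own, ahead of the full before-and-after examples below (the folder path is just a placeholder):

SELECT local_file.file_name
  FROM sp_list_directory ( 'C:\\TEMP', 1 ),   -- max_depth = 1 is the equivalent of SUBDIRS=0
       LATERAL ( SELECT TOP 1 
                        sa_split_list.row_value AS file_name
                   FROM sa_split_list ( sp_list_directory.file_path, '\\' )
                  ORDER BY sa_split_list.line_num DESC   -- the last token is the bare file name
               ) AS local_file 
 WHERE sp_list_directory.file_type = 'F'                 -- files only, not subfolders
 ORDER BY local_file.file_name ASC;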

By Example

After several years of intermittent errors, the SERVER CLASS 'DIRECTORY' method was recently judged too fragile for further use in Foxhound. Here is the before-and-after code for the four sections that were changed:

Example 1 BEFORE: SERVER CLASS 'DIRECTORY'
CREATE SERVER local_folder CLASS 'DIRECTORY' USING 'ROOT={@foxhound_folder};SUBDIRS=0';

CREATE EXTERNLOGIN DBA TO local_folder;

CREATE EXISTING TABLE local_file AT 'local_folder;;;.';

SELECT TOP 1 
       local_file.size,
       local_file.file_name
  INTO @size,
       @file_name
  FROM local_file 
 WHERE local_file.file_name LIKE @patch_file_name_like_pattern
   AND NOT EXISTS ( SELECT *
                      FROM rroad_applied_patch
                     WHERE rroad_applied_patch.file_name = local_file.file_name )
 ORDER BY local_file.file_name ASC; 
Example 1 AFTER: sp_list_directory()
SELECT TOP 1 
       sp_list_directory.file_size    AS size,
       local_file.file_name           AS file_name
  INTO @size,
       @file_name
  FROM sp_list_directory ( @foxhound_folder, 1 ),
       LATERAL ( SELECT TOP 1 
                        sa_split_list.row_value AS file_name
                   FROM sa_split_list ( sp_list_directory.file_path, '\\' )
                  ORDER BY sa_split_list.line_num DESC 
               ) AS local_file 
 WHERE sp_list_directory.file_type = 'F'
   AND local_file.file_name        LIKE @patch_file_name_like_pattern
   AND NOT EXISTS ( SELECT *
                      FROM rroad_applied_patch
                     WHERE rroad_applied_patch.file_name = local_file.file_name )
 ORDER BY local_file.file_name ASC;
Example 2 BEFORE: SERVER CLASS 'DIRECTORY'
CREATE SERVER dynamic_source_help_folder
   CLASS 'DIRECTORY' USING 'ROOT=C:/projects/foxhound5/help/dynamic_source;SUBDIRS=0';

CREATE EXTERNLOGIN DBA TO dynamic_source_help_folder;

CREATE EXISTING TABLE dynamic_source_help_folder AT 'dynamic_source_help_folder;;;.';

INSERT temp_help
SELECT dynamic_source_help_folder.file_name ,
       dbo.xp_read_file ( STRING ( 'C:/projects/foxhound5/help/dynamic_source/', dynamic_source_help_folder.file_name ) )
  FROM dynamic_source_help_folder 
 WHERE dynamic_source_help_folder.file_name LIKE '%.html'
 ORDER BY dynamic_source_help_folder.file_name;
Example 2 AFTER: sp_list_directory()
INSERT temp_help
SELECT dynamic_source_help_folder.file_name ,
       dbo.xp_read_file ( STRING ( 'C:/projects/foxhound5/help/dynamic_source/', dynamic_source_help_folder.file_name ) )
  FROM sp_list_directory ( 'C:/projects/foxhound5/help/dynamic_source', 1 ),
       LATERAL ( SELECT TOP 1 
                        sa_split_list.row_value AS file_name
                   FROM sa_split_list ( sp_list_directory.file_path, '\\' )
                  ORDER BY sa_split_list.line_num DESC 
              ) AS dynamic_source_help_folder  
 WHERE sp_list_directory.file_type          = 'F'
   AND dynamic_source_help_folder.file_name LIKE '%.html'
 ORDER BY dynamic_source_help_folder.file_name;
Example 3 BEFORE: SERVER CLASS 'DIRECTORY'
CREATE SERVER backup_folder CLASS 'DIRECTORY' USING 'ROOT={@backup_folder};SUBDIRS=0';
      
CREATE EXTERNLOGIN DBA_activation TO backup_folder;
      
CREATE EXISTING TABLE backup_file AT 'backup_folder;;;.';

INSERT backup_file_drop_down_list_entry
SELECT ROW_NUMBER() OVER ( ORDER BY backup_file.file_name DESC )    AS row_number,
       backup_file.file_name                                         AS file_name
  FROM backup_file 
 WHERE backup_file.file_name LIKE STRING ( 'foxhound5-', @build_number, '-mini-backup-%.txt' )
 ORDER BY backup_file.file_name DESC;
Example 3 AFTER: sp_list_directory()
INSERT backup_file_drop_down_list_entry
SELECT ROW_NUMBER() OVER ( ORDER BY file_name.file_name DESC )    AS row_number,
       file_name.file_name                                         AS file_name
  FROM sp_list_directory ( @backup_folder, 1 ),
       LATERAL ( SELECT TOP 1 
                        sa_split_list.row_value AS file_name
                   FROM sa_split_list ( sp_list_directory.file_path, '\\' )
                  ORDER BY sa_split_list.line_num DESC 
               ) AS file_name  
 WHERE sp_list_directory.file_type = 'F'
   AND file_name.file_name         LIKE STRING ( 'foxhound5-', @build_number, '-mini-backup-%.txt' )
 ORDER BY file_name.file_name DESC;
Example 4 BEFORE: SERVER CLASS 'DIRECTORY'
CREATE SERVER backup_folder CLASS 'DIRECTORY' USING 'ROOT={@backup_folder};SUBDIRS=0';

CREATE EXTERNLOGIN DBA TO backup_folder;

CREATE EXISTING TABLE backup_file AT 'backup_folder;;;.';

FOR f_backup_file AS c_backup_file INSENSITIVE CURSOR FOR
SELECT TOP ALL START AT ( @mini_backup_retention_count + 1 )
       backup_file.file_name AS @file_name
  FROM backup_file
 WHERE backup_file.file_name LIKE STRING ( 'foxhound5-', @build_number, '-mini-backup-%.txt' )
 ORDER BY backup_file.file_name DESC
FOR UPDATE
DO
Example 4 AFTER: sp_list_directory()
FOR f_backup_file AS c_backup_file INSENSITIVE CURSOR FOR
SELECT TOP ALL START AT ( @mini_backup_retention_count + 1 )
       backup_file.file_name AS @file_name
  FROM sp_list_directory ( @backup_folder, 1 ),
       LATERAL ( SELECT TOP 1 
                        sa_split_list.row_value AS file_name
                   FROM sa_split_list ( sp_list_directory.file_path, '\\' )
                  ORDER BY sa_split_list.line_num DESC 
              ) AS backup_file 
 WHERE sp_list_directory.file_type = 'F'
   AND backup_file.file_name       LIKE STRING ( 'foxhound5-', @build_number, '-mini-backup-%.txt' )
 ORDER BY backup_file.file_name DESC
FOR UPDATE
DO


Monday, December 31, 2018

WITH FILE NAME LOGGING runs faster, but...

The LOAD TABLE statement has been a feature of SQL Anywhere since the early days, and the Version 5.5 Help was enthusiastic about it: "LOAD TABLE is more efficient than the ISQL statement INPUT and can be called from any client application."

The official enthusiasm had waned a bit by Version 6: "LOAD TABLE is intended solely for fast loading of large amounts of data. It is not intended for routine use in applications."

The Dark Side of LOAD TABLE was introduced in Version 7: "Inserts are not recorded in the log file. Thus, the inserted rows may not be recovered in the event of a crash. In addition, the LOAD TABLE statement should never be used in a database involved in SQL Remote replication or in databases used as MobiLink clients because these technologies replicate changes through analysis of the log file."

Versions 8, 9 and 10 were positively shrill on the subject: "Inserts are not recorded in the log file, raising the risk that data will be lost in the event of a crash and making this statement unusable with SQL Remote or with MobiLink remote databases."

Version 11, however, dropped the scary rhetoric and introduced three new LOAD TABLE clauses: WITH FILE NAME LOGGING, WITH ROW LOGGING and WITH CONTENT LOGGING.

WITH FILE NAME LOGGING is the default in Versions 11 through 17, and it is the name given to the way LOAD TABLE worked in all the earlier versions:

  • It records the file specification of the input file in the transaction log file, not the individual row INSERT statements.

  • It is also the fast method because it doesn't have to write millions of records to the log, and

  • it makes the log file much smaller.
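
Because WITH FILE NAME LOGGING is the default, these two statements (using the same table and file as the demonstration below) behave identically:

LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT';
LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT' WITH FILE NAME LOGGING;
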
The bad news is that
  • SQL Remote stops working if you use LOAD TABLE WITH FILE NAME LOGGING on any table involved in replication,

  • MobiLink stops working if you do the same thing to any remote table involved in synchronization, and

  • any database restart recovery that needs to apply transaction log data must have access to the original input file in its original location.
If you don't use SQL Remote or MobiLink then you don't care about the first two problems, and if you do use them you already know what the problems with LOAD TABLE are.

The third problem (LOAD TABLE versus restart recovery) is less well understood.

For example, here's what an ordinary LOAD TABLE FROM 'file' statement
LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT';
looks like in the SQL Anywhere 17 transaction log:
--CHECKPOINT-0000-05896335817-2019-01-04 08:32:43.559
--CONNECT-1004-05896335806-dba-2019-01-04 08:32
--ROLLBACK-1004-05896335851
ROLLBACK WORK
go
--CHECKPOINT-0000-05896335867-2019-01-04 08:38:26.067
--CONNECT-1007-05896335856-dba-2019-01-04 08:38
--BEGIN TRANSACTION-1007-05896335901
BEGIN TRANSACTION
go
--BEGIN LOAD TABLE-1007-05896335904: load into table "dba"."ttt" using file 'C:\\TEMP\\ttt.TXT'
--END LOAD TABLE-1007-05896335966
--ROLLBACK-1007-05896335969
ROLLBACK WORK
go
--CHECKPOINT-0000-05896335972-2019-01-04 08:39:05.645
--BEGIN TRANSACTION-1007-05896336006
BEGIN TRANSACTION
go
--BEGIN LOAD TABLE-1007-05896336009: load into table "dba"."ttt" using file 'C:\\DATA\\TEST\\ttt.TXT'
--SQL-1007-05896336077
begin
  set temporary option "conversion_error" = 'On';
  set temporary option "string_rtruncation" = 'On';
  set temporary option "date_order" = 'YMD';
  set temporary option "nearest_century" = '50';
  set temporary option "default_timestamp_increment" = '1';
  set temporary option "time_zone_adjustment" = '-300';
  load into table "dba"."ttt" using file 'C:\\DATA\\TEST\\ttt.TXT' encoding 'windows-1252';
  set temporary option "conversion_error" = ;
  set temporary option "string_rtruncation" = ;
  set temporary option "date_order" = ;
  set temporary option "nearest_century" = ;
  set temporary option "default_timestamp_increment" = ;
  set temporary option "time_zone_adjustment" = ;
end
go
--END LOAD TABLE-1007-05896336793
--COMMIT-1007-05896336796-2019-01-04 08:41:03.742
COMMIT WORK
go
--START CHECKPOINT-0000-05896336807-2019-01-04 08:41:05.581
--FINISH CHECKPOINT-0000-05896336832-2019-01-04 08:41:26.851
--ROLLBACK-1007-05896336848
ROLLBACK WORK
go
--CHECKPOINT-0000-05896336853-2019-01-04 08:45:55.716
Let's say you do these four things:
  1. You make a full backup of your database,

  2. you run the LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT' as shown above,

  3. you back up the log file, and then

  4. you delete the data file C:\DATA\TEST\ttt.TXT.
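
For concreteness, steps 1 and 3 might be done with BACKUP statements like these (dbbackup from the command line works just as well; the backup folders here are made up):

-- Step 1: full backup of the database file and the transaction log
BACKUP DATABASE DIRECTORY 'C:\\BACKUP\\FULL';

-- Step 3: back up the transaction log only
BACKUP DATABASE DIRECTORY 'C:\\BACKUP\\LOG' TRANSACTION LOG ONLY;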
Then, disaster strikes: You lose your current database.

Here's how you might try to recover your database:
  1. You copy your backup database file to replace the current database, and then

  2. you run dbsrv17 -ad to apply the backup log file to the restored database file.
Sadly, that recovery attempt fails because the server Cannot access file 'C:\DATA\TEST\ttt.TXT':
I. 01/04 09:18:32. Starting database "ddd17" (C:\$ web\blog\20181231 WITH FILE NAME LOGGING runs faster, 
   but\ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:32. Performance warning: Database file "C:\$ web\blog\20181231 WITH FILE NAME LOGGING runs faster, 
   but\ddd17.db" consists of 213 disk fragments
I. 01/04 09:18:32. Database recovery in progress
I. 01/04 09:18:32.     Last checkpoint at Fri Jan 04 2019 08:29
I. 01/04 09:18:32.     Checkpoint log...
I. 01/04 09:18:33.     Transaction log: C:\$ web\blog\20181231 WITH FILE NAME LOGGING runs faster, but\190104AB.LOG...
I. 01/04 09:18:33. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:33.     Transaction log: C:\$ web\blog\20181231 WITH FILE NAME LOGGING runs faster, but\ddd17.log...
I. 01/04 09:18:33. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:34. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:34. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:34. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:34.     Parallel recovery enabled
I. 01/04 09:18:34. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
I. 01/04 09:18:34. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:18
E. 01/04 09:18:34. *** ERROR *** Assertion failed: 100948 (17.0.9.4882)
Failed to redo a database operation (id=4, page_no=0x1, offset=0x128) - Error: Cannot access file 'C:\DATA\TEST\ttt.TXT' 
   -- The system cannot find the file specified.
Here are three alternatives to avoid that problem:
  1. Take a full backup right after each LOAD TABLE WITH FILE NAME LOGGING statement,

  2. preserve all the input files until they are no longer needed for recovery... which is after you eventually do take a full backup, or

  3. use LOAD TABLE WITH ROW LOGGING instead (or WITH CONTENT LOGGING).
Alternative 3's probably the easiest if you have a lot of LOAD TABLE files and they're not very big; just run with ROW LOGGING and throw the files away after they're loaded.

If you regularly load millions of rows, however, you might need the speed of Alternatives 1 and 2. Here's how FILE NAME LOGGING and ROW LOGGING compare:
              LOAD TABLE              LOAD TABLE 
              WITH FILE NAME LOGGING  WITH ROW LOGGING
              ======================  ================ 
Elapsed time  118 seconds             214 seconds      - much slower
Log growth    0 pages                 481,295 pages    - huge growth
Here are the test scripts:

LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT';
BEGIN
DECLARE @start_at TIMESTAMP;
DECLARE @seconds  BIGINT;
DECLARE @pages    BIGINT;
DECLARE @growth   BIGINT;

-- Test A: LOAD TABLE

SET @start_at = CURRENT TIMESTAMP;
SET @pages    = DB_EXTENDED_PROPERTY ( 'FileSize', 'translog' );
MESSAGE STRING ( CURRENT TIMESTAMP, ' Test A starting; @@VERSION = ', @@VERSION ) TO CONSOLE;
LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT';
SET @seconds = DATEDIFF ( SECOND, @start_at, CURRENT TIMESTAMP );
SET @growth  = DB_EXTENDED_PROPERTY ( 'FileSize', 'translog' ) - @pages;
MESSAGE STRING ( CURRENT TIMESTAMP, ' Test A seconds = ', @seconds, ', log pages growth = ', @growth ) TO CONSOLE;
END;

I. 01/04 08:38:26. 2019-01-04 08:38:26.035 Test A starting; @@VERSION = 17.0.9.4882
I. 01/04 08:38:26. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:38
I. 01/04 08:38:26. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:38
I. 01/04 08:39:05. 2019-01-04 08:39:05.598 Test A starting; @@VERSION = 17.0.9.4882
I. 01/04 08:39:05. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:39
I. 01/04 08:39:05. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:39
I. 01/04 08:41:03. 2019-01-04 08:41:03.742 Test A seconds = 118, log pages growth = 0
I. 01/04 08:41:05. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:41
I. 01/04 08:41:05. Cache size adjusted to 2270460K
I. 01/04 08:41:10. Cache size adjusted to 2256544K
I. 01/04 08:41:26. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 08:41

LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT' WITH ROW LOGGING;
BEGIN
DECLARE @start_at TIMESTAMP;
DECLARE @seconds  BIGINT;
DECLARE @pages    BIGINT;
DECLARE @growth   BIGINT;

-- Test B: LOAD TABLE WITH ROW LOGGING

SET @start_at = CURRENT TIMESTAMP;
SET @pages    = DB_EXTENDED_PROPERTY ( 'FileSize', 'translog' );
MESSAGE STRING ( CURRENT TIMESTAMP, ' Test B starting; @@VERSION = ', @@VERSION ) TO CONSOLE;
LOAD TABLE ttt FROM 'C:\\DATA\\TEST\\ttt.TXT' WITH ROW LOGGING;
SET @seconds = DATEDIFF ( SECOND, @start_at, CURRENT TIMESTAMP );
SET @growth  = DB_EXTENDED_PROPERTY ( 'FileSize', 'translog' ) - @pages;
MESSAGE STRING ( CURRENT TIMESTAMP, ' Test B seconds = ', @seconds, ', log pages growth = ', @growth ) TO CONSOLE;
END;

I. 01/04 09:41:29. 2019-01-04 09:41:29.472 Test B starting; @@VERSION = 17.0.9.4882
I. 01/04 09:41:29. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:41
I. 01/04 09:41:29. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:41
I. 01/04 09:45:03. Starting checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:45
I. 01/04 09:45:03. 2019-01-04 09:45:03.724 Test B seconds = 214, log pages growth = 481295
I. 01/04 09:45:04. Cache size adjusted to 2267588K
I. 01/04 09:45:19. Finished checkpoint of "ddd17" (ddd17.db) at Fri Jan 04 2019 09:45
I. 01/04 09:46:04. Cache size adjusted to 2255732K



Wednesday, November 14, 2018

Foxhound Consulting Done Remotely

SQL Anywhere performance and tuning consulting by a real person (me) is now available remotely, via email. Here's how it works:

  1. You run the Foxhound 4 Database Monitor to gather samples from your SQL Anywhere database,

  2. then you send me a copy (or a subset) of your Foxhound database

  3. which I look at for one hour

  4. and then I send you my observations and suggestions.



Yes, the first hour of consulting is free... and you can send me your entire foxhound4.db file if it fits in an email or Dropbox.

But . . . if your Foxhound database is huge . . .


If your foxhound4.db file is too big to send by email, you can use Foxhound's Adhoc Query feature to unload a smaller subset which will fit in an email:
  1. Download this SQL script to make this process easier: unload_samples.txt

  2. and save it in a folder like C:\ProgramData\RisingRoad\Foxhound4.

  3. Modify Foxhound to allow the script to run. To do this, add the dbsrv -gl all option to one of the Foxhound shortcut command files.

    For example, here's how to add -gl all to the $start_foxhound4_default_browser.bat file in C:\ProgramData\RisingRoad\Foxhound4:
    ...
    "!SDIR!\!BIN!\dbspawn.exe"^
      -f "!SDIR!\!BIN!\!SPGM!"^
      -c 25p^
      -ch 50p^
      -cr-^
      -gk all^
    -gl all^
      -gn 220^
      -gna 0^
      -n foxhound4^
      -o foxhound4_debug.txt^
      -oe foxhound4_debug_startup.txt^
      -on 1M^
      -sb 0^
      -ufd restart^
      -x tcpip^
      -xd^
      -xs http(port=8080;maxsize=0;to=600;kto=600)^
      foxhound4.db^
      -n f ...

  4. Run: Foxhound4 - Stop Foxhound Engine

  5. Run: Foxhound4 - Start Foxhound via default browser to put -gl all into effect.

  6. Run: Foxhound4 - Adhoc Query Foxhound Database via ISQL

  7. Run the unload_samples.txt script to gather a subset.

    For example, the following statements unload all the samples recorded since May 1 and write the files into C:\temp:
    READ unload_samples.txt;
    CALL unload_samples ( '2018-05-01' ); -- for more examples look inside unload_samples.txt
    

  8. Attach the files from C:\temp to a covering email and send it to me: Breck.Carter@gmail.com

That's it! ...you'll hear back from me soon.


Monday, October 22, 2018

Let's Be Blunt

SQL Anywhere really does do things the way they should be done. That's why the title of this blog has changed.

Another thing that's changed is the row of page links at the top: "Community" is gone because ( in order to serve you better? :) the "SAP SQL Anywhere Community" has been deleted.

The Forum link is still there. That's where the real community has always been, and remains.

Two SQL Anywhere Help links remain: DCX because that's the only way to see Version 16 and earlier, and the SAP Help Portal because it works fairly well and is a lot faster than DCX.

Plus, there are two Foxhound links: one to the Help where all the "Performance Tips" live, and one to the White Paper that shows off some of the features.

The rest are gone because they were just clutter: use the Portal search instead.