Wednesday, August 22, 2012

Grimm Tales From The Doc Face

Ever since the Technical Documents page grew in size to include 1,127 links to other pages, the Google Custom Search Engine has performed very poorly.

For example, try searching for ROW_NUMBER or ASSERTION on the Technical Documents page: Google won't return nearly as many pages as exist... it won't even return some pages with those words in the title.

Sooooo... after waiting and waiting and waiting for the Google CSE to catch up, a different approach is being tried:

  • A standalone page has been created, separate from this blog, holding just the document links and nothing else,

  • a new Google Custom Search Engine has been created using one single URL (the standalone page) rather than 1,127 separate document URLs,

  • with the following Google "My search engines" settings:
    Tales From The Doc Face
       Control Panel
             Included sites
                Add sites
                   Include sites individually
                      What to include: 
                         Dynamically extract links from this page and add them to my search engine
                            Include all pages this page links to

That last point is important: if "Include all pages this page links to" implies "and only those pages" then maybe the new approach will work. The last thing anyone wants is for Google to include all the other stuff those pages link to; if you want to search the whole internet, you can use Google Classic.
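
To make the "and only those pages" reading concrete, here is a minimal Python sketch of one-level link extraction: it collects the links that appear on a single page and never follows them any further. This is only an illustration of the hoped-for behavior, not a description of what Google actually does internally. The names LinkCollector, direct_links and SAMPLE_PAGE, and the example.com URLs, are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag found on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                absolute = urljoin(self.base_url, value)
                # Keep first-seen order and skip duplicates.
                if absolute not in self.links:
                    self.links.append(absolute)


def direct_links(html, base_url):
    """Return the links on this page only; linked pages are never fetched."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical stand-in for the standalone links page.
SAMPLE_PAGE = """
<ul>
  <li><a href="http://example.com/docs/row_number.html">ROW_NUMBER</a></li>
  <li><a href="/docs/assertion.html">ASSERTION</a></li>
  <li><a href="http://example.com/docs/row_number.html">ROW_NUMBER again</a></li>
</ul>
"""

links = direct_links(SAMPLE_PAGE, "http://example.com/links.html")
for link in links:
    print(link)
```

Because nothing in the sketch ever downloads the linked pages, whatever those pages link to stays out of the list, which is exactly the behavior the new search engine needs.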

Good luck, everyone!

When the new Technical Documents page starts to behave better than the old Technical Documents page, as far as searching is concerned, it will replace the old page.
