A Google Bug?
Posted February 22nd, 2005 @ 12:17am by Erik J. Barzeski
Tonight I spent an hour redoing links on The Sand Trap because Google (and MSN, and Yahoo) seem to have a bug: they don't recognize the ".golf" extension.
The search results you do get (currently) on MSN and Yahoo are either from the forum (.php) or from category archives (e.g. "http://thesandtrap.com/archive/pga/") where the index file is pre-determined.
After all, the ".golf" extension is a bit of a ruse, because the underlying files were just PHP/HTML. The site now uses .php extensions, and I had to:
- Find and replace .golf" with .php" (for links).
- Download the mt_tbping table, replace ".golf" with ".php", and restore the table.
- Reset MovableType to use .php everywhere instead of .golf.
- Craft a mod_rewrite rule to handle old-style links and direct them to the new ones.
- Rebuild the entire site a few times.
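As a rough illustration of the first and fourth steps, here is a minimal Python sketch. It is not the find-and-replace or the mod_rewrite rule actually used on the site, which are not shown in this post; the function names are hypothetical. It assumes links end in .golf" inside a quoted attribute and that old-style request paths end in .golf.

```python
import re
from typing import Optional

OLD_EXT = ".golf"
NEW_EXT = ".php"


def rewrite_links(html: str) -> str:
    """Replace .golf" with .php" so only extensions that close a quoted link change."""
    return html.replace(OLD_EXT + '"', NEW_EXT + '"')


def redirect_target(path: str) -> Optional[str]:
    """Return the new-style path for an old-style request path, or None if unchanged.

    This mirrors what a rewrite rule for old links would need to do:
    match paths ending in .golf and send them to the same path ending in .php.
    """
    match = re.fullmatch(r"(.+)\.golf", path)
    if match is None:
        return None
    return match.group(1) + NEW_EXT
```

Including the closing quote in the search string keeps the replacement away from ordinary prose that happens to mention ".golf" outside a link.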
Now, perhaps, The Sand Trap can be listed in Google, MSN, and Yahoo.
Posted 22 Feb 2005 at 12:23am #
I always wondered why you were using .golf. While you're at it, you should change Freshly Squeezed Software over from .fss 🙂
Posted 22 Feb 2005 at 12:27am #
The .fss extension works fine for some reason, as evidenced by this search (and others).
Posted 22 Feb 2005 at 12:50am #
Perhaps it's because it needs to be a three-letter extension, a restriction imposed on Windows (at least I think extensions can only be three letters long on Windows -- I don't regularly use it). Maybe Google is simply overlooking the fact that the filename extension might be longer?
-- Simone
Posted 22 Feb 2005 at 12:53am #
Simone, like ".html"? 😛
Posted 22 Feb 2005 at 7:46am #
"Nobody", the .golf examples you found were arguments not extensions to files.
http://www.thesandtrap.com actually does have an extension which is http://www.thesandtrap.com/index.php or index.golf as it was.
It is kind of bizarre that Google is not picking it up, though. I always found the extension neat.
Which raises the question: why use .golf in the first place? Was it simply to make the website more unique?
Posted 22 Feb 2005 at 8:19am #
Yes, Mat, just to make it more unique. No other reason.
The comment from "nobody" was removed because he didn't follow the "leave a real name and email address or risk deletion" rule.
Posted 22 Feb 2005 at 9:39am #
Hmm -- I've used custom extensions before (up to 6 characters), with no ill effects on Google rankings. In my case, the files with the custom extensions were Smarty templates, which were then caught by an Apache AddHandler directive and directed to a PHP parser.
A couple of possibilities spring to mind: firstly, if your customised pages weren't identifying themselves correctly in the HTTP Content-Type header, Googlebot may have fallen back on the extension, not recognised it and therefore decided that it couldn't index the page. Alternatively, because 'golf' is a dictionary word rather than, say, 'fss', maybe the bot thinks you're trying to force more relevance onto that keyword and penalises you as a result?
Admittedly, I have no idea whether either of those would (or could) be the case...
Posted 22 Feb 2005 at 10:41am #
The content type was always fine. <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
Posted 22 Feb 2005 at 11:41am #
I was thinking more at the HTTP level -- i.e., rather than including a META element that says, 'treat this page as if you'd received a Content-Type: header of this value', actually ensuring that the server includes the correct MIME type in its initial response.
I'd imagine that Googlebot's coding would have been smart enough to use the contents of the http-equiv element, though, so you're probably right on that score...
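The difference between the two places a content type can be declared can be checked with a short Python sketch. It is only an illustration: the helper names are hypothetical, and nothing here reflects how Googlebot actually behaves. One function reads the META http-equiv value out of the page source, another asks the server for the real Content-Type response header.

```python
from html.parser import HTMLParser
from typing import Optional
import urllib.request


def media_type(value: Optional[str]) -> Optional[str]:
    """Strip parameters such as charset and return just the MIME type."""
    if not value:
        return None
    return value.split(";", 1)[0].strip().lower()


class _MetaContentType(HTMLParser):
    """Collects the content attribute of <meta http-equiv="content-type">."""

    def __init__(self) -> None:
        super().__init__()
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and equiv == "content-type":
            self.value = attributes.get("content")


def meta_content_type(html: str) -> Optional[str]:
    """Return the type declared inside the document, if any."""
    parser = _MetaContentType()
    parser.feed(html)
    return parser.value


def header_content_type(url: str) -> Optional[str]:
    """Return the Content-Type header the server actually sends (makes a HEAD request)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type")
```

If `media_type(header_content_type(url))` and `media_type(meta_content_type(page_source))` both come back as `text/html`, the page is declaring itself consistently at both levels.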
Posted 22 Feb 2005 at 11:49am #
The HTTP response was also fine, Scott.
Posted 22 Feb 2005 at 3:29pm #
Why don't you use MultiViews (put "Options +MultiViews" in .htaccess or Apache configuration), to make extensions unnecessary altogether? Your files are still stored with a recognizable extension in the filesystem, but the server will accept requests for files where the extension is left out. The client, of course, never actually needs an extension -- it knows the Content-Type, which it'll still receive, and that's all it cares about.
I use MultiViews on most of my sites, and I've found that Google recognizes extensionless files with no problem. And in any case, I think it looks nicer, even more than a custom extension does.
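For anyone unfamiliar with the idea, here is a simplified Python sketch of the lookup. It is not Apache's code, and the function name is made up for illustration. When the requested name doesn't exist as a file, the server looks in the same directory for files named after the request plus some extension. Apache then negotiates which candidate to serve; this sketch stops at listing them.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def multiviews_candidates(requested: str, files: Iterable[str]) -> List[str]:
    """List files in the requested directory named '<requested name>.<something>'."""
    request_path = PurePosixPath(requested)
    prefix = request_path.name + "."
    return sorted(
        candidate
        for candidate in files
        if PurePosixPath(candidate).parent == request_path.parent
        and PurePosixPath(candidate).name.startswith(prefix)
    )
```

So a request for /archive/about would match /archive/about.php on disk, and the visitor never sees the extension.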
Posted 22 Feb 2005 at 3:33pm #
Adam, short answer? Because I see no point whatsoever in not having extensions.
Posted 23 Feb 2005 at 3:42am #
Cool URIs don't change (nor have extensions) 🙂
Posted 23 Nov 2005 at 1:04pm #
Google, for whatever reason, continues to ignore The Sand Trap. The site used to have .golf extensions (php [i.e. html] files served with the proper text/html MIME type and content-types), but it's had .php extensions for about two weeks now....