If you’re struggling to establish a bona fide presence on the internet (which I freely admit I am) then reading this article will probably make you feel very annoyed.
And sadly I think it might be coming true.
Several years ago, when Google was smaller and I had virtually no visitors, I could still count on having my entire site indexed. That was exactly why I spent so much time telling friends about the amazing search engine which seemed to be able to store a copy of every page on the internet!
But now I find that maybe half my blog entries are not properly indexed, and when something isn’t indexed the chances of someone finding it drop to zero. I don’t mind low ranking, but not being indexed at all…? That really pisses me off.
Hardly anyone links directly to my articles. I don’t know why that is exactly, but as long as I’m searchable I don’t really mind. But when I’m not searchable I may as well not exist! Everyone knows that people find stuff online by searching, not browsing, and Google is the principal search tool for at least 50% of English-speaking users.
I hate the idea that I could be searching for a specific set of terms, and Google might overlook a page (which may have been accessible for years) which may be the only match. Part of the point of a search engine is to expose information which might not be found by other means.
Details
A few months ago the problem was that Google didn’t even seem to bother crawling my whole site, but I think a bit of extra guidance via META tags got it to skip the monthly archives and index individual entries instead (since they don’t store in-page anchor points, searches that arrive on large archive pages seem pretty pointless).
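For anyone wanting to try the same thing, here is a rough sketch of the idea in Python. It is not my actual template code: the page type names ("entry", "monthly_archive") and the robots_meta helper are made up for illustration. The only real part is the standard robots META tag, where "noindex,follow" asks a crawler to leave a page out of the index but still follow its links.

```python
from html import escape

# Hypothetical page types for a blog; these names are illustrative only.
ROBOTS_RULES = {
    "entry": "index,follow",              # individual posts: index them
    "monthly_archive": "noindex,follow",  # archives: skip, but follow links to entries
}


def robots_meta(page_type: str) -> str:
    """Return a robots META tag for the given page type.

    Unknown page types fall back to the permissive default.
    """
    content = ROBOTS_RULES.get(page_type, "index,follow")
    return f'<meta name="robots" content="{escape(content)}">'


if __name__ == "__main__":
    print(robots_meta("monthly_archive"))
    print(robots_meta("entry"))
```

The tag goes in the head of each page, so the archive template and the entry template would each emit their own version. A crawler is free to ignore it, so treat this as a hint rather than a guarantee.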
Now Google claims to have read every entry (i.e. if I pick a random entry and check their cache, there will be a copy) but there seems to be a divergence between what they cache and what they index… e.g. this entry features the fairly distinctive phrase "third world police robot" and yet Google returns nothing for that term (ironically, simply writing this here will probably rectify that particular case).
Other examples as of this writing:
"Telescope Discovers Message From God"
"always wanted a vectrex"
"wrangle into shape" software
"incident of the shifting shirt"
"renowned curator" "Once a guy stood all day shaking bugs from his hair"
It’s not like this is all fantastic content, but these are my words and this is the only place you’ll find them— only you won’t find them because they’re not being indexed properly.
__________
* UPDATE June 20, 2006: It looks like I picked a bad time to do my testing. All the phrases above are now properly indexed, and out of ten other posts chosen at random only two appeared to be un-findable. Also, it’s not just my site; when I did the tests two days ago I got no results for any of the search terms above, whereas I now get multiple results for "always wanted a vectrex".