Google Forgets the Old Web
Categories: Modern, Not an expert

A senior Google Search staffer recently claimed that Google does not downgrade older pages in search results. For the record, it was Tim Bray in 2018 who demonstrated that Google was not returning the only page containing a given string if that page was more than 5 to 10 years old. He could find that same page with DuckDuckGo. As he put it, Google is losing its memory. He documented the same problem again in February 2022. Marco Fioretti also found in 2018 that Google was refusing to return some old pages.

Google’s amnesia, Alzheimer’s, memory loss, or forgetting of old pages is one of the many reasons why I have not had much to do with them since 2013. They are just not good partners in the kind of web that I want to be part of. Google has become less and less transparent about how their search works, despite maintaining a Google Search Central, so people just have to speculate whether (for example) changes in response to the 2016 US election hurt independent sites. The kind of thing which Tim Bray documented is not ambiguous: if there is precisely one page with a string, then a search engine should return that page.

PS. I am not amused that I have to create a post like this on my own site because search engines are poor at digging up the original. DuckDuckGo prefers a Hacker News discussion of the blog post by Fioretti to the actual blog posts which Fioretti discusses.

PPS. As far as I know, Google, Yandex, and Bing are still the only publicly available attempts to index the whole web (‘public’ in the sense that anyone can make a query based on their index). And something like 90% of web searches are powered by Google. So the decisions they make are very important.

Edit 2024-02-03: Google has shut down its Cache, another tool for accessing sites which have temporarily or permanently gone down or been hacked.

(scheduled 3 January 2024)

