The Google Freshness Factor
If you have ever used Google's URL Removal Tool, you may have noticed
that the page count for the website in question does not change; the
pages simply stop showing up in results. Why is this? Because the URL
Removal Tool does not delete these URLs; it only filters them out. So
even though the pages appear to have been removed, they are evidently
still in the database somewhere.
There is a patent application at the US Patent and Trademark Office
from Monika Henzinger, published in July 2005, that describes a method
for determining a document's "freshness." By analogy with other
ranking terms commonly associated with Google (namely PageRank and
TrustRank), forum posters are now referring to the concept as
"FreshRank."
The abstract of this patent application states that one of the problems
of determining the freshness of a document indexed in a search engine
is that the "last-modified" attribute (the date a server reports in the
HTTP Last-Modified header) isn't always correct. Some webmasters
figured out that they could change the modified date without changing
the page, and a pattern of abuse developed. That doesn't fool Google,
because what Google looks for is content that has actually been
modified. Exactly how Google determines how old or "fresh" a document
is remains somewhat of a secret. Lately, in the estimation of many,
Google has done a very poor job of determining which websites offer the
freshest content relative to its relevance.
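To make the difference between a claimed modification date and an actual
content change concrete, here is a minimal Python sketch. It is not
Google's method, which is not public; the names fingerprint and
FreshnessTracker are made up for illustration, and the tag stripping is
deliberately crude. The idea is simply to hash the visible text and only
treat a page as changed when that hash changes, whatever date the server
reports.

```python
import hashlib
import re
from datetime import datetime, timezone


def fingerprint(html: str) -> str:
    """Hash the visible text only, ignoring markup, case and spacing."""
    text = re.sub(r"<[^>]+>", " ", html)      # crude tag stripping (assumption)
    text = " ".join(text.lower().split())     # normalize case and whitespace
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class FreshnessTracker:
    """Hypothetical helper: remembers when each URL's content last changed."""

    def __init__(self):
        self._seen = {}  # url -> (fingerprint, time of last real change)

    def observe(self, url, html, crawl_time, claimed_last_modified=None):
        # claimed_last_modified is accepted but deliberately ignored:
        # it is whatever the server chose to report.
        fp = fingerprint(html)
        previous = self._seen.get(url)
        if previous is None or previous[0] != fp:
            self._seen[url] = (fp, crawl_time)
        return self._seen[url][1]


if __name__ == "__main__":
    tracker = FreshnessTracker()
    first = datetime(2005, 7, 1, tzinfo=timezone.utc)
    later = datetime(2005, 8, 1, tzinfo=timezone.utc)
    tracker.observe("http://example.com/", "<p>Hello world</p>", first)
    # Same text, new claimed date: the recorded change time stays put.
    print(tracker.observe("http://example.com/", "<p>Hello world</p>",
                          later, claimed_last_modified=later))
```

A page whose date is bumped without any edit keeps its old change time
in this sketch, which is the behavior the paragraph above attributes to
Google in general terms.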
This brings a pertinent question to mind: how much does the freshness
factor count in determining relevancy? Some observers have concluded
that it doesn't necessarily matter to Google how fresh a document is,
especially if that document has many inbound links pointing to it.
Henzinger is attempting to patent a more explicit measure of freshness,
arguing that search engines need a more reliable way of detecting
updated content, since not all of them use the "last-modified"
attribute anyway.
Unfortunately, since the duplicate content penalty was implemented,
we've been seeing problems with how the freshness of documents is
judged. With Google in particular, the filter employed to weed out
duplicate content doesn't appear to take the actual origin of the
content into consideration. For many, this is becoming a great point
of frustration. Given the technological advances that Google has
brought into the public realm within the last decade, it seems
impractical and almost ridiculous that it would leave out the ability
to determine the source of fresh content. Yahoo and MSN do not appear
to have this particular problem, so why does Google?
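As a rough illustration of what "taking the origin into consideration"
could mean, the Python sketch below groups identical copies of a text
and keeps the copy that was seen first. This is only one plausible
heuristic, not a description of any search engine's filter; the Copy
record, its first_seen field, and the function names are assumptions
made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Copy:
    url: str
    text: str
    first_seen: date  # hypothetical: when a crawler first saw this copy


def normalize(text: str) -> str:
    """Treat copies as duplicates if they differ only in case or spacing."""
    return " ".join(text.lower().split())


def pick_originals(copies):
    """Group exact duplicates and keep the earliest-seen URL of each group."""
    groups = defaultdict(list)
    for c in copies:
        groups[normalize(c.text)].append(c)
    return {min(g, key=lambda c: c.first_seen).url for g in groups.values()}


if __name__ == "__main__":
    copies = [
        Copy("http://scraper.example/a", "Fresh  Article", date(2005, 9, 1)),
        Copy("http://author.example/a", "fresh article", date(2005, 8, 1)),
    ]
    print(pick_originals(copies))
```

The weakness is easy to see: "first seen" depends on crawl order, so a
frequently crawled scraper site could be credited ahead of the real
author, which is much like the frustration described above.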
Another problem recently posed for the freshness factor is Google's
own Removal Tool. Experiences with this tool have been mostly
unpleasant, where it has been useful at all. The Google removal tool
has often been mentioned "as a cure against many diseases," such as
duplicate content or temporary redirects. While I have used it from
time to time, I have done so with caution, and never on a commercial
website; only on my personal website or blog. Some of the side effects
observed are definitely worth mentioning here, and I know I'm not
alone in seeing them.
The period for which Google keeps these URLs out of its index is
anywhere between three and six months. I say three to six months
because, even though the Google documentation tells us 180 days, in my
personal experience it has been more like 90 days. Regardless of the
period, rest assured, the pages are actually still there. How do I
know this? Two reasons: one, as I mentioned before, the number of
pages listed stays the same as before the pages were removed; two,
after the removal period, they show right back up in the index, as if
they'd never left.
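The two observations above are what you would expect from an index
that hides pages rather than deleting them. The Python sketch below is
a toy model of that behavior under my own assumptions; the Index class
and its methods are invented for illustration and say nothing about how
Google actually stores its data.

```python
from datetime import date, timedelta


class Index:
    """Toy index: removal hides a URL for a while but never deletes it."""

    def __init__(self, urls):
        self.pages = set(urls)   # stored documents; removal never shrinks this
        self.hidden_until = {}   # url -> date on which the suppression expires

    def request_removal(self, url, today, days=180):
        # 180 days is the documented figure; the observed period may differ.
        self.hidden_until[url] = today + timedelta(days=days)

    def page_count(self):
        return len(self.pages)

    def visible(self, today):
        # A page shows up if it was never hidden or its suppression expired.
        return {u for u in self.pages
                if self.hidden_until.get(u, today) <= today}


if __name__ == "__main__":
    idx = Index(["/a", "/b", "/c"])
    start = date(2005, 1, 1)
    idx.request_removal("/b", start)
    print(idx.page_count(), sorted(idx.visible(start + timedelta(days=30))))
    print(sorted(idx.visible(start + timedelta(days=180))))
```

In this model the page count never drops, and the hidden page returns
on its own once the suppression date passes.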
About the Author
Steve Buchanan writes articles on many topics, including Article Submission Services, Yamaha Generators and Snow Blowers