Friday, August 19, 2005

Google scraper ban

I posted something similar to this on the WMW forum and then decided it belongs here on my blog. This was a thread about collateral damage from the 7-29 Google scraper ban.

Andilinks has a lot of original content, both on pages full of text written entirely by me and as commentary about various listed links. However, I was also using excerpts and meta descriptions from the linked sites to describe them, and I must have passed some threshold for that. I was seeding pages with these descriptions and re-editing them later. I will certainly admit that the AdSense revenue was very good, and that did provide an incentive to produce more pages, especially high-traffic pages.

There are far too many pages like this on my site for me to change them legitimately in a timely fashion, and using text-scrambling software would degrade the descriptions and make my site more like the scrapers themselves. But I may write a script to swap out certain synonyms wholesale.
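
A minimal sketch of what such a wholesale synonym swap might look like in Python. The word list here is purely hypothetical (the real one would have to be picked by hand), and the function name swap_synonyms is just an illustration, not an existing script:

```python
import re

# Hypothetical synonym table; the real word list would be chosen by hand.
SYNONYMS = {
    "website": "site",
    "information": "details",
    "large": "big",
}

# One pattern that matches any listed word, whole words only.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in SYNONYMS) + r")\b",
    re.IGNORECASE,
)


def _replace(match):
    word = match.group(0)
    replacement = SYNONYMS[word.lower()]
    # Keep a leading capital so sentence starts still read correctly.
    if word[0].isupper():
        replacement = replacement.capitalize()
    return replacement


def swap_synonyms(text):
    """Return text with every listed word replaced by its synonym."""
    return _PATTERN.sub(_replace, text)
```

Matching on whole words keeps the swap from mangling longer words that merely contain a listed one, which matters if the point is to avoid degrading the descriptions.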

I am disallowing googlebot from those pages for now (they still do well with Y and M, but I fear these may have their own scraper crackdown soon as well). I am keeping my fingers crossed.
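
For illustration, a robots.txt that blocks only googlebot from one part of a site can be checked with Python's standard urllib.robotparser before it goes live. This is only a sketch: the /links/ path and the example.com URLs are assumptions, not the actual Andilinks layout, and the helper name allowed is made up here:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /links/ path is an assumption.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /links/

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/links/page.html"
    for bot in ("Googlebot", "Slurp", "msnbot"):
        print(bot, allowed(bot, page))
```

With rules like these, googlebot is kept out of the listed directory while slurp and msnbot fall through to the catch-all record and can still crawl it.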

In this recent recovery process I accidentally set up more 301 redirects than the server could handle, and probably confused slurp and msnbot too...

But I have almost four years' work invested in the database and am working day and night to recover this. If you stop hearing from me, you'll know I 'died trying.'