Author: jeh
Description:
Screen shot of Jim Carrey's search listing
This week, in two separate incidents, high-profile Wikipedia articles were vandalized and reverted within minutes. During the short window of vandalism, Googlebot cached the page, and the slanderous material was then displayed at the top of the search results for a full day. Because Wikipedia has so many articles and they rank so highly in search results, this is a growing problem with the potential to damage the lives of article subjects and embarrass Wikipedia.
See attached screen shot, and these discussions:
- http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Another_unfortunate_Google_grab
- http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Google_search_reveals_what_happens_if_vandalism_isn.27t_reverted_quickly...
- http://searchengineland.com/070516-164154.php
One resolution strategy is to use an allowable form of cloaking, called "content delivery." We could apply the semi-protection criteria (not semi-protection itself) to the article history to determine the last version that was saved by a "good" user. This version could be accessed with an additional URL parameter, such as ?version=lastgood. When a search engine bot, such as Googlebot, shows up and identifies itself (through the User-Agent field in the HTTP request header), a conditional redirect programmed via .htaccess appends "?version=lastgood" to the URL, thus serving a slightly older but more reliable copy of the page. This would avoid further embarrassment to Wikipedia and help prevent harm to the subjects of articles.
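To make the proposal concrete, here is a minimal Python sketch of the two pieces of logic: picking the last "good" revision from an article history, and appending the parameter for requests whose User-Agent looks like a search bot. It is only an illustration under stated assumptions, not MediaWiki code or an .htaccess rule. The Revision fields, the function names, the bot token list, and the 4-day account-age threshold are all hypothetical stand-ins for whatever the real semi-protection criteria and bot list would be.

```python
from dataclasses import dataclass
from typing import Optional, Sequence
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: a hand-maintained list of substrings identifying search bots.
BOT_TOKENS = ("googlebot",)


@dataclass
class Revision:
    """Hypothetical, simplified view of one entry in an article history."""
    rev_id: int
    is_registered: bool      # False for anonymous (IP) edits
    account_age_days: int    # age of the editing account at save time


def is_trusted(rev: Revision, min_age_days: int = 4) -> bool:
    # Assumption: "good" means a registered account older than a threshold,
    # loosely modelled on the semi-protection criteria.
    return rev.is_registered and rev.account_age_days >= min_age_days


def last_good_revision(history: Sequence[Revision],
                       min_age_days: int = 4) -> Optional[Revision]:
    """Return the newest trusted revision; history is ordered oldest first."""
    for rev in reversed(history):
        if is_trusted(rev, min_age_days):
            return rev
    return None  # no trusted revision exists


def is_search_bot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def url_for_request(url: str, user_agent: str) -> str:
    """Append version=lastgood for search bots; leave other requests alone."""
    if not is_search_bot(user_agent):
        return url
    parts = urlsplit(url)
    # Drop any existing "version" parameter so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "version"]
    query.append(("version", "lastgood"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

In this sketch an ordinary reader still gets the current revision, while a request identifying itself as Googlebot is pointed at the most recent revision saved by a trusted account, even if a newer anonymous edit exists.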
Over at the Wikipedia Administrators' noticeboard, it was suggested that I file a bug report. If you need further help with this, feel free to contact me. I am a professional SEO and web developer and can donate my services.
Version: unspecified
Severity: major
URL: http://en.wikipedia.org
Attached: