Author: info
Description:
- In Google, search en.wikipedia.org for 'privacy':
http://www.google.com/search?hl=en&q=site%3Aen.wikipedia.org%20privacy
Results:
153,000,000 results!
This is because the word "Privacy" appears in the footer of every page, so Google matches every page.
Expected:
Turn off indexing of common areas such as the footer. I added some notes on how to do this at
http://www.mediawiki.org/wiki/How_best_to_search_or_spider_mediawiki_systems and
http://en.wikipedia.org/wiki/Robots_Exclusion_Standard#Directives_within_a_page .
For Google, the key is to wrap the area in <!--googleoff: index--> ... <!--googleon: index-->;
old spiders use <NOINDEX>.
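
A rough sketch of the wrapping idea, in Python for illustration only (MediaWiki's skins are PHP, and this is not how the skin code is actually structured). The helper name wrap_noindex and the footer markup are made up for the example. Note that the googleoff/googleon comments are documented for the Google Search Appliance; that the public web crawler honors them is an assumption here.

```python
# Comment directives quoted in this report.
GOOGLE_OFF = "<!--googleoff: index-->"
GOOGLE_ON = "<!--googleon: index-->"


def wrap_noindex(fragment: str) -> str:
    """Surround an HTML fragment with googleoff/googleon comments.

    Hypothetical helper: it only concatenates strings and does not
    parse or validate the HTML it is given.
    """
    return f"{GOOGLE_OFF}{fragment}{GOOGLE_ON}"


# Hypothetical footer markup, standing in for the real skin output.
footer = '<div id="footer"><a href="/wiki/Privacy_policy">Privacy policy</a></div>'
wrapped = wrap_noindex(footer)
print(wrapped)
```

Only the shared page chrome (footer, sidebar) would be wrapped this way; the article body stays outside the comments so that a page which really is about privacy is still indexed for that word.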
You could counter-argue that if a word appears on a page and the user pastes it
into a search engine, then the engine MUST find that page. But I think the
value of eliminating all those spurious search results outweighs this.
Version: unspecified
Severity: normal
URL: http://www.mediawiki.org/wiki/How_best_to_search_or_spider_mediawiki_systems