
Articles for deletion included in Google search results
Closed, Invalid (Public)

Description

Reported in OTRS: https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom;TicketID=9741176

With T67760: site.editpage may have Parameter 'md5', our robots.txt should disallow search engines from indexing subpages of Articles for deletion. However, it seems that these pages have recently been indexed under URLs such as:
https://en.wikipedia.org/wiki/Wikipedia%3AArticles_for_deletion%2FDetensor_Therapy

The robots.txt entry only disallows subpages whose URL contains a literal slash (/), not those using the URL-encoded form (%2F).
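To illustrate the mismatch, here is a minimal sketch that assumes a crawler compares the URL path against each Disallow prefix literally, without decoding %2F first. The rule list and the `is_disallowed` helper are hypothetical and only for illustration; they are not copied from the live robots.txt, and real crawlers may normalize paths differently.

```python
from urllib.parse import urlsplit

# Assumed Disallow prefixes, for illustration only.
ASSUMED_RULES = [
    "/wiki/Wikipedia:Articles_for_deletion/",
    "/wiki/Wikipedia%3AArticles_for_deletion/",
]


def is_disallowed(url, rules):
    """Return True if the URL path starts with any Disallow prefix.

    The comparison is a literal string prefix match: percent-encoded
    characters such as %2F are NOT decoded before comparing.
    """
    path = urlsplit(url).path  # urlsplit leaves percent-encoding untouched
    return any(path.startswith(rule) for rule in rules)


plain = "https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Detensor_Therapy"
encoded = "https://en.wikipedia.org/wiki/Wikipedia%3AArticles_for_deletion%2FDetensor_Therapy"

print(is_disallowed(plain, ASSUMED_RULES))    # literal slash matches a rule
print(is_disallowed(encoded, ASSUMED_RULES))  # %2F form slips past both rules

# Adding a prefix that ends in %2F would cover the encoded form as well.
extended = ASSUMED_RULES + ["/wiki/Wikipedia%3AArticles_for_deletion%2F"]
print(is_disallowed(encoded, extended))
```

Under this assumption, the encoded URL is only blocked once a rule ending in %2F is present, which is why the slash-only entry is not enough.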

See (example):
https://www.google.de/?gws_rd=ssl#q=inurl:Articles_for_Deletion%2F+site:en.wikipedia.org


Event Timeline

Thanks for the report, Florian.

There's some discussion about this going on here: https://en.wikipedia.org/wiki/MediaWiki_talk:Robots.txt#Google_thinks_it.27s_cute.2C_we_need_to_blacklist_Wikipedia.253AArticles_for_deletion.252F

Given that robots.txt has typically been in the control of the local communities, I don't think there's anything for Discovery to do here.

Florian added a subscriber: Legoktm.

Oh damn, I wasn't aware of this feature :P OK, as @Legoktm is already involved in the discussion and is working on a solution with the community, I don't think we need a second place to discuss this. That's why I'm marking this as invalid :)