
Create MediaWiki:Robots.txt
Closed, Declined (Public)

Description

There should be a regularly editable robots.txt message, so that setting it does not depend on developers and admins can edit it whenever needed.

.htaccess could then have a line like
RewriteRule robots.txt /w/index.php?title=MediaWiki:Robots.txt&action=raw&ctype=text/plain
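
Note that the rule above uses `robots.txt` as an unanchored regular expression, so it would also match other paths that happen to contain that text. A minimal Python sketch of the difference, assuming the pattern is tested against the request path without a leading slash (as in per-directory .htaccess context). `rewrite` is a hypothetical helper for illustration, not part of Apache or MediaWiki:

```python
import re

# Target taken from the rule in the description.
TARGET = "/w/index.php?title=MediaWiki:Robots.txt&action=raw&ctype=text/plain"

LOOSE = r"robots.txt"       # pattern as written: unanchored, "." matches any character
STRICT = r"^robots\.txt$"   # anchored, with the dot escaped


def rewrite(path, pattern):
    """Hypothetical helper: return TARGET if the pattern matches anywhere
    in the path (a regex search), otherwise return the path unchanged."""
    return TARGET if re.search(pattern, path) else path


if __name__ == "__main__":
    for candidate in ("robots.txt", "old/robots_txt.html"):
        print(candidate, "->", rewrite(candidate, LOOSE), "|", rewrite(candidate, STRICT))
```

With the anchored pattern only the exact path `robots.txt` is rewritten; with the loose one, `old/robots_txt.html` is rewritten too.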


Version: unspecified
Severity: enhancement

Details

Reference
bz13249

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 10:07 PM
bzimport set Reference to bz13249.
bzimport added a subscriber: Unknown Object (MLST).

Firstly, if you are already doing something as ugly as rewriting robots.txt to a message in the way you describe, there is no point in adding anything to the interface: all you need to do is paste the text of the current robots.txt into that page. Nothing inside MediaWiki is needed for that.

Secondly, text/plain is not a valid &ctype= value. &ctype= only accepts text/x-wiki, text/css, and text/javascript, so the page will actually be served as text/x-wiki, not text/plain, and that may cause issues with some browsers.
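
The fallback described here can be sketched as follows. This only illustrates the behavior the comment describes and is not MediaWiki's actual code; `ALLOWED_CTYPES` and `effective_ctype` are hypothetical names:

```python
# Content types the comment lists as accepted for &ctype= (per this thread).
ALLOWED_CTYPES = {"text/x-wiki", "text/css", "text/javascript"}


def effective_ctype(requested):
    """Return the requested type if it is on the allow-list,
    otherwise fall back to text/x-wiki."""
    return requested if requested in ALLOWED_CTYPES else "text/x-wiki"


if __name__ == "__main__":
    for ctype in ("text/plain", "text/css"):
        print(ctype, "->", effective_ctype(ctype))
```

Under that assumption, a request with `ctype=text/plain` ends up labelled text/x-wiki, which is the mismatch the comment warns about.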

robots.txt is pretty low-level and honestly I'd just not be comfortable with leaving things open to this degree.

There has been some discussion now and again of adding a relatively easy way to set the meta robots tags for individual pages; perhaps as a special form of protection. Offhand I can't find relevant bugzilla entries to point this one to, but probably they're in there somewhere. :)

We'd be much more likely to go ahead and implement that sort of thing than a raw robots.txt editor.

wiki.bugzilla wrote:

(In reply to comment #2)

see bug 8068 and bug 9415, and, as an example request: bug 10648

(In reply to comment #2)

We'd be much more likely to go ahead and implement that sort of thing than a
raw robots.txt editor.

OK, so I tried to change the summary to something that expresses the issue more generally.

Maybe a NOINDEX magic word could work then?

Rob Church says that on a publicly viewable wiki there should be no reason for a user to keep his user page from being indexed.

Even so, IMHO, a NOINDEX that perhaps also fed a generated [[Special:Robots.txt]], which someone could point robots.txt at with a quick rewrite (it would work with either mod_rewrite or even an alias), would be a nicer way of keeping Wikimedia's RfA and AfD pages out of search engines than filing a bunch of Bugzilla requests to add them to the core robots.txt.
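
A rough sketch of what such a generated robots.txt could look like, assuming the wiki can supply a list of page titles flagged NOINDEX. The function name, the `/wiki/` path prefix, and the example titles are assumptions for illustration only; nothing like this is confirmed to exist in MediaWiki:

```python
from urllib.parse import quote


def build_robots_txt(noindex_titles, prefix="/wiki/"):
    """Build robots.txt text that disallows every given page title.

    Titles are de-duplicated and sorted, spaces become underscores,
    and other unsafe characters are percent-encoded.
    """
    lines = ["User-agent: *"]
    for title in sorted(set(noindex_titles)):
        path = prefix + quote(title.replace(" ", "_"), safe="/:")
        lines.append("Disallow: " + path)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example titles.
    print(build_robots_txt([
        "Wikipedia:Requests for adminship",
        "Wikipedia:Articles for deletion",
    ]))
```

Because Disallow rules match by path prefix, listing a parent page this way would also cover its subpages.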

wiki.bugzilla wrote:

(In reply to comment #4)

OK, so I tried to change the summary for something expressing the issue more
generally.

Please note the links in comment #3 and see bug 9415 for such a request.
(please do not duplicate bugs; I changed back the summary therefore)

Maybe NOINDEX magic word could work then?

Again, please see comment #3 and bug 8068 for that suggestion.

Reopened bug 8068 for further consideration. Resolving this one WONTFIX for the original issue.

This was resolved on Wikimedia wikis by bug 15601 (cf. [[wikitech:robots.txt]]).