
Concealing search results that are listed in robots.txt
Closed, Declined (Public)

Description

Author: brian.mcneil

Description:
The objective is for pages that are in a namespace searched by default, and that fall into one of the directories listed in robots.txt, to be suppressed by default in the returned results.

Ideally the hidden results would remain optionally accessible to logged-in users through a link at the head or foot of the results page whenever results have been hidden, possibly "Show hidden results" followed by a "Why hidden?" link that explains what is going on.
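A rough sketch of the requested behaviour, written in Python purely for illustration (it is not MediaWiki code). It assumes the text of robots.txt is somehow available to the search code, which a later comment on this task points out is not the case for the wiki software. The rule set, the page paths, and the helper names `split_results` and `render` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file's rules are not given in this task.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Prepared/
"""


def split_results(paths, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Split result paths into (shown, hidden) using robots.txt Disallow rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    shown, hidden = [], []
    for path in paths:
        # can_fetch() returns False when a Disallow rule matches the path.
        (shown if parser.can_fetch(user_agent, path) else hidden).append(path)
    return shown, hidden


def render(paths, show_hidden=False):
    """Return the lines of a results page, hiding disallowed paths by default."""
    shown, hidden = split_results(paths)
    lines = list(shown)
    if show_hidden:
        lines.extend(hidden)
    elif hidden:
        # Placeholder text standing in for the two proposed links.
        lines.append(f"{len(hidden)} hidden result(s). Show hidden results | Why hidden?")
    return lines
```

Calling `render(paths)` would give the default view, and `render(paths, show_hidden=True)` the view a logged-in user would get after following the proposed "Show hidden results" link.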

This has been discussed on foundation-l and, for English Wikinews, would suppress the display of prepared obituaries (e.g., go to http://en.wikinews.org and search for "Carter").


Version: unspecified
Severity: enhancement

Details

Reference
bz13439

Related Objects

Status    Subtype  Assigned  Task
Open               None
Declined           None

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 10:06 PM
bzimport set Reference to bz13439.
bzimport added a subscriber: Unknown Object (MLST).

Note that an existing way to handle that case would be to segregate such "prepared stories" in a namespace which is not searched by default.

ayg wrote:

This is not reasonably possible as stated. The wiki software does not know whether a robots.txt exists or where it might be located, let alone what's in it. I would be inclined to say WONTFIX on the basis that any fix for the problem as stated would be a horrible and fragile hack that could be much better implemented in other ways. I would be doubly inclined to say so on the basis that robots.txt is not meant to hide things from everyone, only from robots, and robots don't read or index search pages anyway, so the relevance seems to be near-zero.

In particular, it seems like the best solution to your quandary is just to use a namespace that's not searched by default.