Tracking category for __NOINDEX__
Closed, Resolved (Public)

Description

Author: happy_melon

Description:
For transparency, pages using the NOINDEX and INDEX behavior switches should be auto-categorised into a tracking category a la [[Category:Hidden categories]] for HIDDENCAT. Ideally, this should only occur when the switch is actually having an *effect*, i.e., only where the switch is allowed by $wgNamespaceRobotPolicies and $wgArticleRobotPolicies. This would achieve the double purpose of allowing users to see if the switch is having an effect, and allowing the use of the switches to be monitored.


Version: unspecified
Severity: minor

bzimport added a project: MediaWiki-Categories. Via Conduit · Nov 21 2014, 10:29 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz16979.
bzimport created this task. Via Legacy · Jan 11 2009, 10:25 PM
bzimport added a comment. Via Conduit · Jan 13 2009, 11:23 AM

jopiswezggzmw wrote:

Proposed patch

I've taken a stab at trying to do this. INDEX includes the page in [[Category:Indexed pages]] and NOINDEX includes the page in [[Category:Non-indexed pages]].

attachment Parser.patch ignored as obsolete

bzimport added a comment. Via Conduit · Jan 13 2009, 11:28 AM

jopiswezggzmw wrote:

Define category names

attachment enMessage.patch ignored as obsolete

bzimport added a comment. Via Conduit · Jan 13 2009, 11:47 AM

happy_melon wrote:

This categorises even if the action of __INDEX__/__NOINDEX__ is disabled by $wgExemptFromUserRobotsControl, doesn't it? From the fixmes in OutputPage.php, it looks like the whole thing could do with an overhaul.

bzimport added a comment. Via Conduit · Jan 13 2009, 11:35 PM

jopiswezggzmw wrote:

Proposed patch

New patch checks if the namespace is ExemptFromUserRobotsControl.

If it is, then Parser.php neither adds the category nor calls setIndexPolicy, which (I think) makes the check in OutputPage.php redundant.

attachment new.patch ignored as obsolete

bzimport added a comment. Via Conduit · Jan 14 2009, 1:21 AM

jopiswezggzmw wrote:

Factoring in ArticleRobotPolicies

I could kill two birds with one stone here.

It should now work like this:

  1. Check if the page has a policy defined in $wgArticleRobotPolicies - if it does, no code will be executed, so the page will not be added to the category and the new Index/Noindex policy will not be set.
  2. If not, then check $wgExemptFromUserRobotsControl - if the namespace is listed there, the policy will not be set.
  3. If not, check if NOINDEX/INDEX tags are in use.
  4. If so, then add the page to the appropriate category and set the policy.
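
The order of checks listed above can be sketched as follows. This is only an illustrative model in Python, not the attached patch or MediaWiki's PHP code: the function name `apply_switch`, the `Result` type and the sample setting values are assumptions made up for the example. The category names are the ones proposed earlier in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for the settings named in the thread.
ARTICLE_ROBOT_POLICIES = {"Main Page": "index,follow"}  # like $wgArticleRobotPolicies
EXEMPT_NAMESPACES = {0}  # like $wgExemptFromUserRobotsControl (sample value)


@dataclass
class Result:
    category: Optional[str]  # tracking category to add, if any
    policy: Optional[str]    # robot policy set by the switch, if any


def apply_switch(title: str, namespace: int, switch: Optional[str]) -> Result:
    """Decide what an INDEX/NOINDEX switch does on a page (switch may be None)."""
    # 1. A per-article policy wins: the switch is ignored entirely.
    if title in ARTICLE_ROBOT_POLICIES:
        return Result(None, None)
    # 2. Namespaces exempt from user control also ignore the switch.
    if namespace in EXEMPT_NAMESPACES:
        return Result(None, None)
    # 3. Nothing to do if the page uses neither switch.
    if switch is None:
        return Result(None, None)
    # 4. The switch is effective: categorise the page and set the policy.
    if switch == "NOINDEX":
        return Result("Non-indexed pages", "noindex")
    return Result("Indexed pages", "index")
```

With these sample values, `apply_switch("Sandbox", 2, "NOINDEX")` categorises the page, while the same switch on "Main Page" or on any page in namespace 0 is ignored and adds no category.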

This is the first time I've really played around with MediaWiki's code so I don't know if it will work as intended but this should also solve the problem of NOINDEX/INDEX overriding a policy set in $wgArticleRobotPolicies.

attachment tryingagain.patch ignored as obsolete

bzimport added a comment. Via Conduit · Jan 20 2009, 11:41 AM

jopiswezggzmw wrote:

Proposed patch v4

Tidied up the code a little.

attachment tc_noindex_v4.patch ignored as obsolete

bzimport added a comment. Via Conduit · May 12 2009, 2:20 AM

jopiswezggzmw wrote:

New patch

Some improvements

Attached: patch.diff

Catrope added a comment. Via Conduit · May 12 2009, 9:09 AM

Wouldn't it be a better idea to track this stuff in the page_props table, like we do with HIDDENCAT ?

bzimport added a comment. Via Conduit · May 12 2009, 10:04 AM

jopiswezggzmw wrote:

HIDDENCAT also adds the page to [[Category:Hidden categories]].

bzimport added a comment. Via Conduit · May 12 2009, 5:00 PM

happy.melon.wiki wrote:

*YES*. This is the *perfect* solution. The situation is very similar: it's a 'property' that applies to individual pages and can be stored coherently in the page_props table, and the db query can be done in OutputPage.php rather than the parser. Is [[Category:Hidden categories]] populated 'normally', with links in the categorylinks table? Or is it generated entirely from page_props? There's probably no reason why a [[Category:Noindexed pages]] can't be dynamically generated; it would additionally allow the categorisation to be split into NOINDEX tags that are functional (are suppressing indexing) and those that are not (i.e. are being overridden by other policies). This would make resolving bug14900 very much easier, as well. Great idea, Roan!
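
A rough sketch of that idea, assuming a page_props-style table with page, property-name and value columns: the listing is built by querying for a "noindex" property, then split by whether another policy overrides the switch. This is a Python/SQLite model for illustration only, not MediaWiki code; the helper names, the sample rows and the `overridden_pages` input are all made up for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table shaped like page_props, with sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT)"
    )
    db.executemany(
        "INSERT INTO page_props VALUES (?, ?, ?)",
        [
            (1, "noindex", ""),    # page 1 uses NOINDEX
            (2, "noindex", ""),    # page 2 uses NOINDEX
            (3, "hiddencat", ""),  # page 3 uses HIDDENCAT only
        ],
    )
    return db


def noindexed_pages(db: sqlite3.Connection, overridden_pages: set):
    """Return (functional, overridden) lists of page ids carrying NOINDEX.

    overridden_pages is a hypothetical set of page ids whose switch is
    overridden by some other robot policy.
    """
    rows = db.execute(
        "SELECT pp_page FROM page_props WHERE pp_propname = ? ORDER BY pp_page",
        ("noindex",),
    )
    functional, overridden = [], []
    for (page_id,) in rows:
        (overridden if page_id in overridden_pages else functional).append(page_id)
    return functional, overridden
```

Because the listing is computed at read time, a change in site policy would move pages between the two lists without re-parsing them, which is the attraction of generating the category dynamically.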

bzimport added a comment. Via Conduit · May 12 2009, 8:52 PM

jopiswezggzmw wrote:

[[Category:Hidden categories]] is populated using the categorylinks table. My patch resolves bug14900 anyway (if the page is in $wgArticleRobotPolicies then NOINDEX/INDEX have no effect) but perhaps using page_props would be better.

bzimport added a comment. Via Conduit · Sep 20 2009, 10:14 PM

happy.melon.wiki wrote:

done in r56688.
