
Disable magic word INDEX sitewide on eswiki
Closed, Resolved (Public)

Description

Hello. The Spanish Wikipedia community has decided to disable the magic word INDEX sitewide.

https://es.wikipedia.org/wiki/Wikipedia:Votaciones/2024/Desactivación_de_la_palabra_mágica_INDEX

Kind regards.

Event Timeline

Superpes15 subscribed.

@SRuizR Let me explain quickly: it is not possible to disable only INDEX. Disabling the magic words affects both NOINDEX and INDEX, and it is done per namespace, not sitewide. Imho you should choose the namespaces that you don't want indexed; we can then set the robot policy to "noindex, nofollow" for those namespaces and disable the use of NOINDEX and INDEX there. If you want to block only INDEX sitewide, you could try using a filter for that instead. So, if you know the namespaces that you want to set as noindex, we can proceed on those namespaces :)

@Superpes15 The vote was clear. The result was in favor of disabling the magic word INDEX sitewide. It was also clarified that the magic word NOINDEX was out of scope for that vote. I therefore suppose that disabling NOINDEX would violate community consensus, so the configuration change cannot be made.

If that is the case, you can proceed to decline the task and I would enforce the community decision via AbuseFilter.

Hello @SRuizR,

Thank you for creating this task. My name is Martin Urbanec and I am one of the system administrators responsible for performing site configuration changes. As of now, MediaWiki does not support restricting only the INDEX magic word – such a configuration setting simply does not exist, and as such, it cannot be changed. This means it is only possible to disable both the INDEX and NOINDEX magic words. However, this setting is a per-namespace one, which means we can disallow those magic words in one namespace, but leave them working in another. For more details, please see the docs at MediaWiki.org and documentation for the $wgExemptFromUserRobotsControl configuration setting.

In my opinion, this limitation makes sense, because only one of those two keywords makes sense to use on any given page. Every namespace has a default indexing policy (either index, or do not index), and (except when disabled) the indexing policy can be overridden on a per-page basis via either the INDEX magic word (if the default indexing policy is "do not index") or NOINDEX (if the default indexing policy is "index"). Using the other magic word makes limited sense, because whatever behavior is set by the other keyword is necessarily already the default behavior.

Specifically: At eswiki, the User namespace has a default policy of "do not index". This means that adding __NOINDEX__ to https://es.wikipedia.org/wiki/Usuario:Martin_Urbanec would have no effect: that page is already excluded from search engine indexing, so the magic word changes nothing. Similarly, adding __INDEX__ to e.g. https://es.wikipedia.org/wiki/Praga would result in no change, because the Main namespace has a default index policy of "index", meaning the Praga article is already indexed regardless of whether __INDEX__ is applied.
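The behavior described above can be modeled with a few lines of Python. This is only a simplified sketch of the rules as explained in this comment, not MediaWiki's actual code: the function name `effective_policy` and the `DEFAULT_POLICY` table are hypothetical, and only the two defaults stated above (Main is "index", User is "noindex") are included.

```python
from typing import Optional

# Defaults taken from the comment above; the table itself is a hypothetical model.
DEFAULT_POLICY = {"Main": "index", "User": "noindex"}


def effective_policy(
    namespace: str,
    magic_word: Optional[str],
    exempt: frozenset = frozenset(),
) -> str:
    """Return 'index' or 'noindex' for a page (simplified model).

    `exempt` plays the role of the per-namespace setting that makes
    both magic words inert in the listed namespaces.
    """
    default = DEFAULT_POLICY[namespace]
    # No magic word, or magic words ignored here: the default applies.
    if magic_word is None or namespace in exempt:
        return default
    if magic_word not in ("INDEX", "NOINDEX"):
        raise ValueError(f"unknown magic word: {magic_word}")
    return "index" if magic_word == "INDEX" else "noindex"


print(effective_policy("User", "NOINDEX"))  # same as the default
print(effective_policy("Main", "INDEX"))    # same as the default
print(effective_policy("User", "INDEX"))    # overrides the default
print(effective_policy("User", "INDEX", exempt=frozenset({"User"})))
```

In this model, a magic word only changes the outcome when it contradicts the namespace default, and it stops doing even that once the namespace is exempt.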

Because of what I described above, the INDEX magic word (which you requested to disable sitewide) only makes sense in the following namespaces, as those namespaces all have a default index policy of "do not index":

  • Talk:
  • User:
  • User talk:
  • Project talk:
  • File talk:
  • MediaWiki talk:
  • Template talk:
  • Help talk:
  • Category talk:
  • Portal discussion:
  • Wikiproyecto discussion:
  • Anexo discussion:
  • Module discussion:
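As a rough illustration of how such a list follows from the per-namespace defaults, the sketch below filters a policy table for namespaces whose default is "noindex". The table is a small hypothetical excerpt (only a few of the namespaces named in this thread), and the helper name is made up for this example.

```python
# Hypothetical excerpt of per-namespace default policies, using only
# namespaces whose defaults are stated in this thread.
default_policy = {
    "Main": "index",
    "Talk": "noindex",
    "User": "noindex",
    "User talk": "noindex",
    "Template talk": "noindex",
}


def namespaces_where_index_matters(policies: dict) -> list:
    """INDEX can only change the outcome where the default is 'noindex'."""
    return sorted(ns for ns, policy in policies.items() if policy == "noindex")


print(namespaces_where_index_matters(default_policy))
```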

In those namespaces, only the INDEX magic word makes sense to use. While the NOINDEX magic word can technically / theoretically be used in those namespaces, it does not actually do anything (unless and until the default index policy for the namespace changes to "index", at which point the NOINDEX magic word would preserve the previous situation).

Based on the on-wiki discussion and the Phabricator discussion, I suggest disabling both the INDEX and NOINDEX magic words in only the namespaces I listed above (which are the namespaces that are excluded from indexing by default, i.e. namespaces where the INDEX magic word has some effect). While I understand this is not exactly what the community asked for at the link above, based on what I know, it does serve the goal of making it impossible for users to force indexing when the default namespace indexing policy is set to "do not index". Even though it would disable the NOINDEX magic word as well (in the namespaces listed), using this magic word is a no-op in those namespaces, and it would of course continue working in all other namespaces.
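To check the claim that this suggestion serves the same goal as the original request, the following sketch compares the two rules over every combination of namespace and magic word. It is a toy model under stated assumptions: `resolve`, `requested`, and `proposed` are hypothetical names, the "requested" rule (ignore only INDEX everywhere) is not something MediaWiki supports, and the defaults table is a small excerpt.

```python
from itertools import product

# Hypothetical excerpt of per-namespace defaults from this thread.
DEFAULTS = {"Main": "index", "User": "noindex", "Talk": "noindex"}
WORDS = (None, "INDEX", "NOINDEX")


def resolve(namespace, word, ignored):
    """Effective policy; ignored(namespace, word) says whether the word is inert."""
    if word is None or ignored(namespace, word):
        return DEFAULTS[namespace]
    return "index" if word == "INDEX" else "noindex"


def requested(namespace, word):
    # Community request: ignore only INDEX, in every namespace.
    return word == "INDEX"


def proposed(namespace, word):
    # Suggestion above: ignore both words, but only where the default is "noindex".
    return DEFAULTS[namespace] == "noindex"


mismatches = [
    (ns, word)
    for ns, word in product(DEFAULTS, WORDS)
    if resolve(ns, word, requested) != resolve(ns, word, proposed)
]
print(mismatches)
```

In this model the two rules never disagree. That only holds while the namespace defaults stay as they are; if a listed namespace's default later changed to "index", the two rules would diverge for pages using NOINDEX there.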

If this suggestion does not suit your community's needs, it would be great if you could describe (in non-technical terms) what those needs are. If I know what problems need to be solved, I'll be able to suggest a technical solution that meets those needs. It is fairly difficult to work on a Phabricator task that only asks for a certain technical solution (as happened here), since the suggested technical solution is often not technically possible (while there usually are different solutions that can serve the same goal). Please understand that I'm here to help you and others meet their needs through configuration changes, but for that to be possible, I first need to understand the needs. Hope this makes sense. As always, if you have any questions about how all this works, please feel free to ask!

Sincerely,
Martin Urbanec

@Urbanecm Hello. What you say makes a lot of sense. It is completely equivalent to what the community is requesting and it suits the community needs.

All right, let's disable INDEX and NOINDEX in the namespaces you listed.

Kind regards.

Indeed I don't see where my proposal (which is the same as the one made in more depth by Martin) violated the community consensus!

If you're happy with this (disabling both magic words in these namespaces), I can do the patch, thanks!

Indeed I don't see where my proposal (which is the same as the one made in more depth by Martin) violated the community consensus!

Yeah, I just didn't really understand it and didn't think about it.

You can proceed to do the patch.

Thank you very much.

Change 994254 had a related patch set uploaded (by Superpes15; author: Superpes15):

[operations/mediawiki-config@master] [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl

https://gerrit.wikimedia.org/r/994254

No problem, I was not very clear, so it was my fault! I'll schedule the patch for the deployment as soon as possible :)

Change 994254 merged by jenkins-bot:

[operations/mediawiki-config@master] [eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl

https://gerrit.wikimedia.org/r/994254

Mentioned in SAL (#wikimedia-operations) [2024-01-30T22:00:43Z] <cjming@deploy2002> Started scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]]

Mentioned in SAL (#wikimedia-operations) [2024-01-30T22:02:15Z] <cjming@deploy2002> cjming and superpes: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-01-30T22:09:07Z] <cjming@deploy2002> Finished scap: Backport for [[gerrit:994254|[eswiki] Add 13 namespaces to $wgExemptFromUserRobotsControl (T355033)]] (duration: 08m 24s)