
Page and Index namespaces from ProofreadPage extension no longer considered content namespaces since deploy of 1.44.0-wmf.21
Closed, Resolved (Public)

Description

On Wikisource wikis, the ProofreadPage extension adds the appropriate namespaces directly to $wgContentNamespaces via its onSetupAfterCache hook handler. This has been in place since 2015.

Due to some change in loading order with the 1.44.0-wmf.21 deployment, the NamespaceInfo class is now initialized before the ProofreadPage hook handler can run, so NamespaceInfo takes its copy of ContentNamespaces before the namespaces from ProofreadPage have been added. As a result, the Page and Index namespaces are no longer considered content namespaces on these wikis.
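The ordering problem described above can be modelled in a few lines. This is only a toy sketch in Python, not MediaWiki code: `NamespaceInfoModel`, `setup_after_cache_handler`, and `boot` are hypothetical names, and the IDs 250 and 252 are placeholders standing in for the Page and Index namespaces.

```python
# Toy model of the ordering problem; all names are illustrative, not MediaWiki APIs.

class NamespaceInfoModel:
    """Copies the content-namespace list once, at construction time."""

    def __init__(self, content_namespaces):
        # Snapshot: later changes to the caller's list are not seen here.
        self._content_namespaces = list(content_namespaces)

    def is_content(self, ns_id):
        return ns_id in self._content_namespaces


def setup_after_cache_handler(config):
    """Stand-in for the extension's hook handler that adds namespaces."""
    # 250 and 252 are placeholder IDs for Page and Index.
    config["ContentNamespaces"].extend([250, 252])


def boot(service_first):
    """Run the two startup steps in either order and return the service."""
    config = {"ContentNamespaces": [0]}
    if service_first:
        # Assumed wmf.21-like order: the service is built before the handler runs.
        info = NamespaceInfoModel(config["ContentNamespaces"])
        setup_after_cache_handler(config)
    else:
        # Assumed earlier order: the handler runs before the service is built.
        setup_after_cache_handler(config)
        info = NamespaceInfoModel(config["ContentNamespaces"])
    return info


PAGE_NS = 250
print(boot(service_first=False).is_content(PAGE_NS))  # True
print(boot(service_first=True).is_content(PAGE_NS))   # False
```

In the second call the config list does end up containing the extra namespaces, but the service only ever consults its own earlier copy, which matches the "old" data symptom reported here.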

Event Timeline

I'm not familiar enough with Wikisource and ProofreadPage to know what effects this has on the wiki itself. From the search perspective, this means much of the content of those pages is now difficult to search for, and our systems are busy moving the pages between search indexes so they can be found again.

Some debugging info showing NamespaceInfo has "old" data without the extra namespaces: https://phabricator.wikimedia.org/P74261

Stack trace of when NamespaceInfo gets initialized in wmf.20: https://phabricator.wikimedia.org/P74264
Stack trace of when NamespaceInfo gets initialized in wmf.21: https://phabricator.wikimedia.org/P74262

Order of execution was verified by adding echo statements to the "right" places on mwdebug1002 and starting up mwscript shell.php.

Rolling back out of caution since searchability is affected.

Tgr subscribed.

Some version of T288819: NamespaceInfo service missing namespaces if initialized too early, probably.
I should do a survey of how many other extensions use SetupAfterCache for risky config changes.

> I should do a survey of how many other extensions use SetupAfterCache for risky config changes.

There's WikibaseCirrusSearch, which updates $wgCirrusSearchExtraIndexSettings. That's not ideal, but probably not many services depend on Cirrus. The other usages seem fine.

> I'm not familiar enough with Wikisource and ProofreadPage to know what effects this has on the wiki itself.

For Index it'll be a bit annoying for contributors, but Page is our content namespace. Mainspace on most ProofreadPage wikis only contains a bit of metadata and the extension tags needed to transclude all those Page:-namespace wikipages. In other words, this issue is likely to have made almost all content on almost all Wikisources unsearchable.

jnuche triaged this task as Unbreak Now! priority. Mar 20 2025, 8:07 AM

Change #1129771 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/ProofreadPage@master] Use MediaWikiServices for early config changes

https://gerrit.wikimedia.org/r/1129771
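
The patch itself is not reproduced in this thread. Going only by its title, the general idea appears to be to apply the config change at service-container initialization, before any service can take its snapshot. The following is a toy Python sketch of that idea under that assumption. `Container`, `on_init`, `define`, and `add_proofread_namespaces` are hypothetical names, the IDs 250 and 252 are placeholders, and none of this is MediaWiki's real service wiring.

```python
# Toy container illustrating "early" config changes; not MediaWiki's real API.

class Container:
    def __init__(self, config):
        self.config = config
        self._early_callbacks = []
        self._factories = {}
        self._instances = {}
        self._initialized = False

    def on_init(self, callback):
        """Register a callback that runs before any service is built."""
        self._early_callbacks.append(callback)

    def define(self, name, factory):
        """Register a factory that builds a service from the config."""
        self._factories[name] = factory

    def get(self, name):
        if not self._initialized:
            # Run early config changes exactly once, ahead of the first service.
            self._initialized = True
            for callback in self._early_callbacks:
                callback(self.config)
        if name not in self._instances:
            self._instances[name] = self._factories[name](self.config)
        return self._instances[name]


def add_proofread_namespaces(config):
    # 250 and 252 are placeholder IDs for Page and Index.
    config["ContentNamespaces"].extend([250, 252])


container = Container({"ContentNamespaces": [0]})
# The service snapshots the list when it is first requested.
container.define("NamespaceInfo", lambda cfg: frozenset(cfg["ContentNamespaces"]))
container.on_init(add_proofread_namespaces)

print(250 in container.get("NamespaceInfo"))  # True
```

Because the callback is tied to container initialization rather than to a later setup step, it no longer matters which caller happens to request the service first.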

Change #1129789 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/extensions/ProofreadPage@wmf/1.44.0-wmf.21] Use MediaWikiServices for early config changes

https://gerrit.wikimedia.org/r/1129789

Change #1129771 merged by jenkins-bot:

[mediawiki/extensions/ProofreadPage@master] Use MediaWikiServices for early config changes

https://gerrit.wikimedia.org/r/1129771

Change #1129789 merged by jenkins-bot:

[mediawiki/extensions/ProofreadPage@wmf/1.44.0-wmf.21] Use MediaWikiServices for early config changes

https://gerrit.wikimedia.org/r/1129789

Mentioned in SAL (#wikimedia-operations) [2025-03-20T11:41:16Z] <tgr@deploy2002> Started scap sync-world: Backport for [[gerrit:1129789|Use MediaWikiServices for early config changes (T288819 T389430)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-20T11:48:12Z] <tgr@deploy2002> tgr: Backport for [[gerrit:1129789|Use MediaWikiServices for early config changes (T288819 T389430)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-20T12:10:51Z] <tgr@deploy2002> Finished scap sync-world: Backport for [[gerrit:1129789|Use MediaWikiServices for early config changes (T288819 T389430)]] (duration: 29m 34s)

Tgr claimed this task.

> Some debugging info showing NamespaceInfo has "old" data without the extra namespaces: https://phabricator.wikimedia.org/P74261

Re-did this and got the correct result, so I think this is fixed.