
2023-01-17 Wikimedia full site outage
Closed, Resolved (Public)

Description

MediaWiki internal error.

Original exception: [253c2bfb-28cf-4967-b741-6d412143ae44] 2023-01-17 18:33:40: Fatal exception of type "ConfigException"

Exception caught inside exception handler.

Set $wgShowExceptionDetails = true; at the bottom of LocalSettings.php to show detailed debugging information.

exception.trace
from /srv/mediawiki/php-1.40.0-wmf.18/includes/config/GlobalVarConfig.php(59)
#0 /srv/mediawiki/php-1.40.0-wmf.18/extensions/DiscussionTools/includes/Hooks/HookUtils.php(340): GlobalVarConfig->get(string)
#1 /srv/mediawiki/php-1.40.0-wmf.18/extensions/DiscussionTools/includes/Hooks/HookUtils.php(414): MediaWiki\Extension\DiscussionTools\Hooks\HookUtils::isAvailableForTitle(Title, string)
#2 /srv/mediawiki/php-1.40.0-wmf.18/extensions/DiscussionTools/includes/Hooks/PageHooks.php(284): MediaWiki\Extension\DiscussionTools\Hooks\HookUtils::isFeatureEnabledForOutput(OutputPage, string)
#3 /srv/mediawiki/php-1.40.0-wmf.18/includes/HookContainer/HookContainer.php(160): MediaWiki\Extension\DiscussionTools\Hooks\PageHooks->onOutputPageBeforeHTML(OutputPage, string)
#4 /srv/mediawiki/php-1.40.0-wmf.18/includes/HookContainer/HookRunner.php(2651): MediaWiki\HookContainer\HookContainer->run(string, array)
#5 /srv/mediawiki/php-1.40.0-wmf.18/includes/OutputPage.php(2247): MediaWiki\HookContainer\HookRunner->onOutputPageBeforeHTML(OutputPage, string)
#6 /srv/mediawiki/php-1.40.0-wmf.18/includes/OutputPage.php(2259): OutputPage->addParserOutputText(ParserOutput, array)
#7 /srv/mediawiki/php-1.40.0-wmf.18/includes/page/Article.php(828): OutputPage->addParserOutput(ParserOutput, array)
#8 /srv/mediawiki/php-1.40.0-wmf.18/includes/page/Article.php(730): Article->doOutputFromRenderStatus(MediaWiki\Revision\RevisionStoreRecord, Status, OutputPage, array)
#9 /srv/mediawiki/php-1.40.0-wmf.18/includes/page/Article.php(533): Article->generateContentOutput(User, ParserOptions, integer, OutputPage, array)
#10 /srv/mediawiki/php-1.40.0-wmf.18/includes/actions/ViewAction.php(78): Article->view()
#11 /srv/mediawiki/php-1.40.0-wmf.18/includes/MediaWiki.php(551): ViewAction->show()
#12 /srv/mediawiki/php-1.40.0-wmf.18/includes/MediaWiki.php(328): MediaWiki->performAction(Article, Title)
#13 /srv/mediawiki/php-1.40.0-wmf.18/includes/MediaWiki.php(916): MediaWiki->performRequest()
#14 /srv/mediawiki/php-1.40.0-wmf.18/includes/MediaWiki.php(571): MediaWiki->main()
#15 /srv/mediawiki/php-1.40.0-wmf.18/index.php(50): MediaWiki->run()
#16 /srv/mediawiki/php-1.40.0-wmf.18/index.php(46): wfIndexMain()
#17 /srv/mediawiki/w/index.php(3): require(string)
#18 {main}

Event Timeline

Bugreporter triaged this task as Unbreak Now! priority. (Jan 17 2023, 6:34 PM)

This seems to affect all "normal" namespaces; special pages (e.g. contribs) are still served normally.

I got this too, on Commons and Wikisource.

Quite inconsistent. A page may return to normal when refreshed, only to go down again 20 seconds later.

Affects at least enwikt, nlwikt, enwiki, enwikivoyage, nlwiki, dewiki, dewikisource, mediawikiwiki, metawiki and foundationwiki. Probably just everything.

Occasionally a page comes through, might be cache though.

Affects trwiki, trwikivoyage, Meta...

Affected as well

Edit: appears to be fixed

nowiki, wikidata and metawiki were affected, but are fixed as far as I can tell … Well done, whoever fixed it!

Commons and English Wikisource are OK now. Thanks.

Zabe renamed this task from ConfigException to 2023-01-17 Wikimedia full site outage. (Jan 17 2023, 7:27 PM)

Summary of the problem for the curious, hopefully more accessible than the incident report:

We were deploying a modification to DiscussionTools that removed a configuration option (879103). It consisted of two distinct changes in two files: one removed the code that read the config option, and the other removed the definition of the config option.

The code was correct and worked as expected in initial testing, but it revealed a problem in the tools we use for deploying code: the two changes were not deployed at the same time, but with a significant delay between them. During that delay, the code still attempted to read a config option that was no longer defined, which caused the "ConfigException" that everyone saw.
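As a rough illustration of the failure described above, here is a minimal Python sketch (the real code is PHP in MediaWiki). The class names, the `serve` helper and the option name `ExampleFeatureFlag` are all hypothetical and only stand in for the reader code and the option definition. Of the four possible combinations of code and config, only "old code with new config" fails.

```python
class ConfigException(Exception):
    """Raised when code asks for a config option that is not defined."""


class Config:
    """Tiny stand-in for a config object that fails on undefined options."""

    def __init__(self, options):
        self._options = dict(options)

    def get(self, name):
        if name not in self._options:
            raise ConfigException(f"undefined config option: {name}")
        return self._options[name]


# Hypothetical option name, used only for illustration.
OPTION = "ExampleFeatureFlag"


def old_code(config):
    # Before the change: the code still reads the option.
    return config.get(OPTION)


def new_code(config):
    # After the change: the code no longer reads the option at all.
    return True


old_config = Config({OPTION: True})  # option still defined
new_config = Config({})              # definition removed


def serve(code, config):
    """Run one 'request' and report whether it succeeded."""
    try:
        code(config)
        return "ok"
    except ConfigException:
        return "ConfigException"


results = {
    ("old code", "old config"): serve(old_code, old_config),
    ("new code", "new config"): serve(new_code, new_config),
    ("new code", "old config"): serve(new_code, old_config),
    ("old code", "new config"): serve(old_code, new_config),
}

for combination, outcome in results.items():
    print(combination, "->", outcome)
```

The two consistent combinations (all old, all new) work, which matches the change passing initial testing; the error only appears on a server that has received one file but not yet the other.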

Because the delay differed between servers, the sites remained accessible to some users, whose requests happened to reach servers running a consistent version of the code (either entirely before or entirely after the change).
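The intermittent behaviour can be sketched with a small simulation. This is only a toy model: the server names, the sync times and the assumption that requests are spread uniformly at random are made up for illustration and are not taken from the incident.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    config_synced_at: float  # seconds; when the removed definition arrived
    code_synced_at: float    # seconds; when the code that reads it was removed

    def is_consistent(self, now):
        has_new_config = now >= self.config_synced_at
        has_new_code = now >= self.code_synced_at
        # Only "new config + old code" fails; every other mix still works.
        return not (has_new_config and not has_new_code)


# Hypothetical fleet: every server gets the config change at t=0,
# but the matching code change arrives after a different delay on each.
fleet = [
    Server("server-a", 0, 5),
    Server("server-b", 0, 20),
    Server("server-c", 0, 0),
    Server("server-d", 0, 60),
]


def broken_servers(fleet, now):
    """Names of servers that would throw at time `now`."""
    return [s.name for s in fleet if not s.is_consistent(now)]


def failure_rate(fleet, now, requests, seed=0):
    """Fraction of randomly routed requests that hit a broken server."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(requests)
        if not rng.choice(fleet).is_consistent(now)
    )
    return failures / requests


for now in (-1, 10, 30, 100):
    print(now, broken_servers(fleet, now), failure_rate(fleet, now, 1000))
```

In this model a refresh can succeed or fail depending on which server answers, and the failure rate drops to zero once the slowest server has caught up.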

The same modification to DiscussionTools was deployed again, with a small change to avoid the issue, a few minutes ago (880916).
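The task does not say what the small change in 880916 was. Purely as an illustration, one common way to make this kind of removal safe when files are not deployed atomically is to split it into two separately deployed steps: first remove the code that reads the option, and only once that has reached every server, remove the definition. The sketch below (hypothetical helper names, Python rather than PHP) enumerates the code/config mixes a server can be in during each approach.

```python
from itertools import product


def request_ok(code, config):
    # Old code reads the option, so it needs the old config that defines it.
    return not (code == "old" and config == "new")


def reachable_states(code_versions, config_versions):
    """All (code, config) mixes a server may be in while files are syncing."""
    return set(product(code_versions, config_versions))


def all_ok(states):
    return all(request_ok(code, config) for code, config in states)


# Single deploy of both files: any mix of old and new can occur.
one_step = reachable_states({"old", "new"}, {"old", "new"})

# Two-step deploy (an assumption, not necessarily what 880916 did):
# step 1 changes only the code, step 2 changes only the config.
step_1 = reachable_states({"old", "new"}, {"old"})
step_2 = reachable_states({"new"}, {"old", "new"})

print("single deploy safe:", all_ok(one_step))
print("step 1 safe:", all_ok(step_1))
print("step 2 safe:", all_ok(step_2))
```

The single deploy can reach the failing "old code, new config" state, while neither step of the split deploy can, regardless of how long each file takes to arrive.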