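Uncaught ConfigException: Failed to load configuration from etcd: in /srv/mediawiki/php-1.35.0-wmf.38/includes/config/EtcdConfig.php:202
Closed, Resolved · Public · PRODUCTION ERROR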

Assigned To

Authored By

	• mmodell
	Jul 1 2020, 6:16 PM

Description

Error

MediaWiki version: 1.35.0-wmf.38

message

Uncaught ConfigException: Failed to load configuration from etcd:  in /srv/mediawiki/php-1.35.0-wmf.38/includes/config/EtcdConfig.php:202

Impact

spurious but worrying production error logspam

Notes

This should not be possible

The only occurrences over the past 7 days took place around 2020-07-01T18:00:00, in 1.35.0-wmf.38 and 1.35.0-wmf.39.

Details

Request ID: d0022155-0fb7-4b4c-89cf-659e7b605b33
Request URL: https://en.wikipedia.org/wiki/Lipoprotein

Stack Trace

exception.trace

#0 /srv/mediawiki/php-1.35.0-wmf.38/includes/config/EtcdConfig.php(124): EtcdConfig->load()
#1 /srv/mediawiki/wmf-config/CommonSettings.php(132): EtcdConfig->getModifiedIndex()
#2 /srv/mediawiki/php-1.35.0-wmf.38/LocalSettings.php(4): require('/srv/mediawiki/...')
#3 /srv/mediawiki/php-1.35.0-wmf.38/includes/Setup.php(143): require_once('/srv/mediawiki/...')
#4 /srv/mediawiki/php-1.35.0-wmf.38/includes/WebStart.php(89): require_once('/srv/mediawiki/...')
#5 /srv/mediawiki/php-1.35.0-wmf.38/index.php(44): require('/srv/mediawiki/...')
#6 /srv/mediawiki/w/index.php(3): require('/srv/mediawiki/...')
#7 {main}
  thrown

Related Objects

Mentioned In: T230037: Create warmup procedure for MediaWiki app servers

Event Timeline

• mmodell created this task. Jul 1 2020, 6:16 PM

Restricted Application added a subscriber: Aklapper. Jul 1 2020, 6:16 PM

taavi subscribed. Jul 1 2020, 6:19 PM

• mmodell updated the task description. Jul 1 2020, 6:38 PM

• mmodell updated the task description.

• mmodell moved this task from Untriaged to Jul 2020 on the Wikimedia-production-error board. Jul 1 2020, 6:48 PM

CDanis subscribed. Jul 1 2020, 6:48 PM

		if ( $loop->invoke() !== WaitConditionLoop::CONDITION_REACHED ) {
			// No cached value exists and etcd query failed; throw an error
			throw new ConfigException( "Failed to load configuration from etcd: $error" );
		}

WaitConditionLoop runs with a timeout, so this exception can also be thrown when the timeout is reached.

It seems no $error was set when the exception was thrown, which is possible in the timeout case.

Maybe the result of invoke() should be checked more strictly, distinguishing WaitConditionLoop::CONDITION_TIMED_OUT, CONDITION_FAILED, and CONDITION_ABORTED.
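
A rough sketch of what that stricter check could look like, so the message is never empty after the colon. This is not the actual EtcdConfig code: describeEtcdLoadFailure() is a hypothetical helper, and the constants are local stand-ins (names taken from the comment above, values assumed) so the snippet runs without MediaWiki.

```php
<?php
// Stand-in status codes so the sketch runs without MediaWiki. The names mirror
// the WaitConditionLoop constants mentioned above; the values are assumptions.
const CONDITION_REACHED = 1;
const CONDITION_FAILED = -1;
const CONDITION_TIMED_OUT = -2;
const CONDITION_ABORTED = -3;

/**
 * Hypothetical helper: build an exception message that always names a reason,
 * even when the loop ended without anything setting $error.
 */
function describeEtcdLoadFailure( int $status, ?string $error ): string {
	if ( $error !== null && $error !== '' ) {
		// An explicit error from the etcd query wins over the generic status text.
		$reason = $error;
	} elseif ( $status === CONDITION_TIMED_OUT ) {
		$reason = 'timed out waiting for etcd';
	} elseif ( $status === CONDITION_ABORTED ) {
		$reason = 'wait loop was aborted';
	} elseif ( $status === CONDITION_FAILED ) {
		$reason = 'wait loop failed without an error message';
	} else {
		$reason = "unexpected loop status $status";
	}
	return "Failed to load configuration from etcd: $reason";
}

// A timeout with no $error set, as in the logged exception.
echo describeEtcdLoadFailure( CONDITION_TIMED_OUT, null ), "\n";
```

In load(), the throw could then pass the value returned by $loop->invoke() together with $error to such a helper, assuming the status is kept in a local variable first.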

CDanis added a subscriber: aaron. Jul 1 2020, 7:06 PM

Krinkle mentioned this in T230037: Create warmup procedure for MediaWiki app servers. Jul 16 2020, 11:32 PM

I guess it's normal that etcd can time out in rare cases. The request fatals in that case and leaves the user stranded with a system error page. In the future, for cases where we fail with HTTP 5xx and know that the failure happened early and the request is safe to restart, perhaps we can handle that automatically in our infrastructure, but that's a separate task. For now, given we saw one in over a month, that seems expected.

We have the general Scap and Icinga alerts to detect any spikes in this and other fatals.
