
Cleanup active-DC based MW config code and make it more robust and easy to change
Closed, Resolved, Public

Description

Config is a general issue as well. MediaWiki has a lot of config deciding which DB/Swift to talk to. If stale config is still running in prod because of a network partition (breaking scap and so on), we can still have problems. One option is to have config switches keyed on "what DC is active".
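
A minimal sketch of such a switch, written in Python for illustration only (the real config lives in operations/mediawiki-config and is PHP). The `SERVICES` map, the `services_for` helper and all hostnames are hypothetical placeholders, not the actual production values:

```python
# Hypothetical per-datacenter service map. Hostnames are placeholders
# invented for this sketch, not real production hosts.
SERVICES = {
    "eqiad": {
        "redis_master": "redis.eqiad.example",
        "search": "search.eqiad.example",
    },
    "codfw": {
        "redis_master": "redis.codfw.example",
        "search": "search.codfw.example",
    },
}


def services_for(active_dc):
    """Return the service hosts for the datacenter named by the switch.

    Every DC-dependent setting is derived from this one value, so a
    switchover means changing a single name rather than many host entries.
    """
    try:
        return SERVICES[active_dc]
    except KeyError:
        # Fail loudly on an unknown name instead of silently falling
        # back to a possibly stale default.
        raise ValueError(f"unknown datacenter: {active_dc!r}") from None
```

With this shape, `services_for("codfw")["redis_master"]` picks the codfw entry, and a misspelled DC name raises instead of quietly pointing at the wrong hosts.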

The active DC name could be pulled from a file, eventually managed by etcd (instead of the less reliable scap). We could have the apaches go read-only if that file can no longer be kept in sync with etcd.
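
The file-based idea can be sketched roughly as follows, again in Python for illustration only. The JSON layout (`active_dc`, `synced_at`), the `load_active_dc` helper and the 60-second threshold are all assumptions made for this sketch; the task does not specify a file format or a staleness rule:

```python
import json
import time
from pathlib import Path

# Hypothetical staleness threshold: how old the last successful sync
# may be before the host stops trusting the file.
MAX_AGE_SECONDS = 60


def load_active_dc(path, now=None, max_age=MAX_AGE_SECONDS):
    """Return (active_dc, read_only) from a locally synced state file.

    Assumed file content (hypothetical format):
        {"active_dc": "codfw", "synced_at": 1700000000}
    where synced_at is the Unix time of the last successful sync.
    """
    now = time.time() if now is None else now
    try:
        state = json.loads(Path(path).read_text())
        active_dc = state["active_dc"]
        synced_at = float(state["synced_at"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing or malformed file: the active DC is unknown, so the
        # safe choice is to go read-only.
        return None, True
    # A stale file means the sync has lost contact; keep serving reads
    # but refuse writes until the file is refreshed.
    read_only = (now - synced_at) > max_age
    return active_dc, read_only
```

The point of returning the read-only flag alongside the name is that a partitioned host degrades to reads rather than writing to whatever it last believed was the master.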

Details

Related Gerrit Patches:
operations/mediawiki-config : master : Use local resources in codfw for parsoid, url-downloader and mathoid
operations/mediawiki-config : master : Use ProductionServices for the jobqueue configuration
operations/mediawiki-config : master : Use wmfMasterDatacenter for picking the master redis config
operations/mediawiki-config : master : Add references to wmfServices for Cirrusearch.
operations/mediawiki-config : master : Define service entries for InitialiseSettings
operations/mediawiki-config : master : Reduce poolcounter configuration complexity
operations/mediawiki-config : master : Rationalize definition of service hosts

Event Timeline

aaron created this task. Sep 30 2015, 5:40 PM
aaron claimed this task.
aaron raised the priority of this task from to Normal.
aaron updated the task description.
aaron added subscribers: Krinkle, jcrespo, Glaisher and 13 others.
aaron renamed this task from "Cleanup active-DC based MW code and make it more robust and easy to change" to "Cleanup active-DC based MW config code and make it more robust and easy to change". Oct 5 2015, 6:30 PM
aaron updated the task description.
aaron set Security to None.
Joe added a comment. Jan 20 2016, 4:47 PM

Other things will probably need the same kind of switching:

First thing that comes to mind is Elasticsearch

Joe added a comment. Jan 20 2016, 4:47 PM

Also: various redis configs, I guess

Change 266509 had a related patch set uploaded (by Giuseppe Lavagetto):
Rationalize definition of service hosts

https://gerrit.wikimedia.org/r/266509

Change 266510 had a related patch set uploaded (by Giuseppe Lavagetto):
Define Production service entries for InitialiseSettings

https://gerrit.wikimedia.org/r/266510

Change 266511 had a related patch set uploaded (by Giuseppe Lavagetto):
Reduce poolcounter configuration complexity

https://gerrit.wikimedia.org/r/266511

Change 266512 had a related patch set uploaded (by Giuseppe Lavagetto):
Add references to wmfServices for Cirrusearch.

https://gerrit.wikimedia.org/r/266512

Change 266513 had a related patch set uploaded (by Giuseppe Lavagetto):
Use wmfMasterDatacenter for picking the master redis config

https://gerrit.wikimedia.org/r/266513

Change 266509 merged by jenkins-bot:
Rationalize definition of service hosts

https://gerrit.wikimedia.org/r/266509

mark raised the priority of this task from Normal to High. Feb 10 2016, 3:52 PM

Change 266510 merged by jenkins-bot:
Define service entries for InitialiseSettings

https://gerrit.wikimedia.org/r/266510

Change 266511 merged by jenkins-bot:
Reduce poolcounter configuration complexity

https://gerrit.wikimedia.org/r/266511

Change 266512 merged by jenkins-bot:
Add references to wmfServices for Cirrusearch.

https://gerrit.wikimedia.org/r/266512

Change 266513 merged by jenkins-bot:
Use wmfMasterDatacenter for picking the master redis config

https://gerrit.wikimedia.org/r/266513

Joe claimed this task. Mar 7 2016, 10:17 AM
Joe moved this task from Backlog to In Progress on the codfw-rollout-Jan-Mar-2016 board.

Change 279350 had a related patch set uploaded (by Giuseppe Lavagetto):
Use ProductionServices for the jobqueue configuration

https://gerrit.wikimedia.org/r/279350

Change 279355 had a related patch set uploaded (by Giuseppe Lavagetto):
Use local resources in codfw for parsoid, url-downloader and mathoid

https://gerrit.wikimedia.org/r/279355

Krinkle removed a subscriber: Krinkle. Mar 30 2016, 2:41 AM

Change 279350 merged by jenkins-bot:
Use ProductionServices for the jobqueue configuration

https://gerrit.wikimedia.org/r/279350

Change 279355 merged by jenkins-bot:
Use local resources in codfw for parsoid, url-downloader and mathoid

https://gerrit.wikimedia.org/r/279355

Joe closed this task as Resolved. Apr 11 2016, 3:30 PM