
Dismantle most of the old jobqueue infrastructure
Closed, Resolved · Public

Description

Given we're managing just 0.04 jobs/second, it's time for a final blow to the supporting infrastructure.

Specifically:

  • Reduce the number of jobqueue workers to 2 per server: one "basic" and one for "gwt". They're more than enough.
  • Remove servers from the redis cluster: specifically, let's remove all but rdb1005/6 in eqiad and rdb2005/6 in codfw. The other servers in eqiad will be decommissioned, while the ones in codfw will be returned to spares.
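
As a rough illustration of the redis plan above, the sketch below sorts a host inventory into "keep", "decommission" and "return to spares". The keep list and the per-datacenter fate come from the description; the function name `plan_removals`, the inventory layout, and the non-kept hostnames in the example are hypothetical, since the full cluster membership is not listed in this task.

```python
from collections import defaultdict

# Servers to keep, as named in the task description.
KEEP = {"rdb1005", "rdb1006", "rdb2005", "rdb2006"}

# What happens to removed servers in each datacenter, per the description.
FATE = {"eqiad": "decommission", "codfw": "return to spares"}


def plan_removals(inventory):
    """Group hostnames by the action that applies to them.

    `inventory` maps a datacenter name to an iterable of hostnames.
    (Hypothetical helper, not part of any existing tooling.)
    """
    plan = defaultdict(list)
    for dc, hosts in inventory.items():
        for host in sorted(hosts):
            action = "keep" if host in KEEP else FATE[dc]
            plan[action].append(host)
    return dict(plan)


# Hypothetical inventory for illustration only: the "other-*" hosts are
# placeholders, not real server names.
example_inventory = {
    "eqiad": ["eqiad-other-1", "rdb1005", "rdb1006"],
    "codfw": ["codfw-other-1", "rdb2005", "rdb2006"],
}
example_plan = plan_removals(example_inventory)
```

With the placeholder inventory, every host outside the keep list lands in exactly one of the two removal buckets, depending on its datacenter.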

Event Timeline

Change 439943 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] jobrunner: reduce the number of old runners

https://gerrit.wikimedia.org/r/439943

Change 439944 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] jobrunner: reduce to one redis server per datacenter

https://gerrit.wikimedia.org/r/439944

Change 439945 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/mediawiki-config@master] Reduce the jobqueue redis to use just one server per dc

https://gerrit.wikimedia.org/r/439945

Change 439943 merged by Giuseppe Lavagetto:
[operations/puppet@production] jobrunner: reduce the number of old runners

https://gerrit.wikimedia.org/r/439943

Change 439944 merged by Giuseppe Lavagetto:
[operations/puppet@production] jobrunner: reduce to one redis server per datacenter

https://gerrit.wikimedia.org/r/439944

Change 439945 merged by jenkins-bot:
[operations/mediawiki-config@master] Reduce the jobqueue redis to use just one server per dc

https://gerrit.wikimedia.org/r/439945

Mentioned in SAL (#wikimedia-operations) [2018-06-13T07:59:34Z] <oblivian@deploy1001> Synchronized wmf-config/ProductionServices.php: Remove unused redis shards from the jobqueue T197003 (duration: 00m 58s)

Vvjjkkii renamed this task from Dismantle most of the old jobqueue infrastructure to 06aaaaaaaa. Jul 1 2018, 1:05 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
CommunityTechBot renamed this task from 06aaaaaaaa to Dismantle most of the old jobqueue infrastructure. Jul 2 2018, 2:05 PM
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added subscribers: gerritbot, Aklapper.
ArielGlenn triaged this task as Medium priority. Sep 3 2018, 8:23 AM
Joe claimed this task.