Increase concurrency of the cirrusCheckerJob
Closed, Resolved, Public

Description

Since we introduced a new elastic cluster to write to (cloudelastic), the CheckerJob now has more work to do.
We tuned the cloudelastic cluster to keep up with the synchronous writes, but we are still behind on the checker job.

Event Timeline

Change 532357 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/change-propagation/jobqueue-deploy@master] [TEMP] cirusSearchCheckerJob: Increase concurrency to 20

https://gerrit.wikimedia.org/r/532357

Change 532357 merged by Mobrovac:
[mediawiki/services/change-propagation/jobqueue-deploy@master] [TEMP] cirusSearchCheckerJob: Increase concurrency to 20

https://gerrit.wikimedia.org/r/532357

Mentioned in SAL (#wikimedia-operations) [2019-08-26T11:55:43Z] <mobrovac@deploy1001> Started deploy [cpjobqueue/deploy@e742ecf]: Increase the concurrency of cirusSearchCheckerJobs to 20 - T231194

Mentioned in SAL (#wikimedia-operations) [2019-08-26T11:57:13Z] <mobrovac@deploy1001> Finished deploy [cpjobqueue/deploy@e742ecf]: Increase the concurrency of cirusSearchCheckerJobs to 20 - T231194 (duration: 01m 31s)

mobrovac changed the task status from Open to Stalled. (Aug 26 2019, 12:10 PM)

Stalling until the backlog has been cleared. Once that happens, we'll likely want to decrease the concurrency slightly.
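
The reasoning behind raising the concurrency and lowering it again later can be sketched with a simple rate model. This is only an illustration: the function names and all numbers are hypothetical, and it assumes every worker processes jobs at the same constant rate, which real job queues only approximate.

```python
import math


def drain_time_seconds(backlog, arrival_rate, per_worker_rate, concurrency):
    """Estimate how long a backlog takes to clear (hypothetical model).

    backlog:         number of queued jobs
    arrival_rate:    new jobs per second
    per_worker_rate: jobs per second one worker can process
    concurrency:     number of jobs processed in parallel
    """
    net_rate = per_worker_rate * concurrency - arrival_rate
    if net_rate <= 0:
        # Consumers cannot keep up, so the backlog never clears.
        return math.inf
    return backlog / net_rate


def min_steady_concurrency(arrival_rate, per_worker_rate):
    """Smallest concurrency at which the backlog stops growing."""
    return math.ceil(arrival_rate / per_worker_rate)


if __name__ == "__main__":
    # Made-up numbers: 1000 queued jobs, 10 new jobs/s, 1 job/s per worker.
    print(drain_time_seconds(1000, 10, 1, 20))
    print(min_steady_concurrency(10, 1))
```

Under this model, extra concurrency only matters while there is a backlog to drain. Once the queue is empty, any value at or slightly above the steady-state minimum is enough, which is why a temporary increase can be scaled back afterwards.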

Change 532556 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/change-propagation/jobqueue-deploy@master] cirrusSearchLinksUpdate: Increase concurrency to 150

https://gerrit.wikimedia.org/r/532556

Change 532556 merged by Mobrovac:
[mediawiki/services/change-propagation/jobqueue-deploy@master] cirrusSearchLinksUpdate: Increase concurrency to 150

https://gerrit.wikimedia.org/r/532556

Mentioned in SAL (#wikimedia-operations) [2019-08-27T09:09:56Z] <mobrovac@deploy1001> Started deploy [cpjobqueue/deploy@c2bc1a3]: Increase cirrusSearchLinksUpdate concurrency to 150 - T231194

Mentioned in SAL (#wikimedia-operations) [2019-08-27T09:11:05Z] <mobrovac@deploy1001> Finished deploy [cpjobqueue/deploy@c2bc1a3]: Increase cirrusSearchLinksUpdate concurrency to 150 - T231194 (duration: 01m 09s)

Change 532694 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] [cirrus] Stop generating new cirrusSearchChecker jobs

https://gerrit.wikimedia.org/r/532694

Change 532694 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Stop generating new cirrusSearchChecker jobs

https://gerrit.wikimedia.org/r/532694

Mentioned in SAL (#wikimedia-operations) [2019-08-27T11:49:24Z] <dcausse@deploy1001> Synchronized wmf-config/CirrusSearch-production.php: T231194 [cirrus] Stop generating new cirrusSearchChecker jobs (duration: 00m 45s)

Change 533842 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] [cirrus] Reenable sanity checks

https://gerrit.wikimedia.org/r/533842

Change 533842 merged by jenkins-bot:
[operations/mediawiki-config@master] [cirrus] Reenable sanity checks

https://gerrit.wikimedia.org/r/533842

Mentioned in SAL (#wikimedia-operations) [2019-09-04T11:49:34Z] <dcausse@deploy1001> Synchronized wmf-config/CirrusSearch-production.php: T231194: [cirrus] Reenable sanity checks (duration: 00m 56s)

Thanks @mobrovac, we re-enabled the sanitizer and the topics were stable. Closing.