
Write some version of foreachwiki(indblist) that respects replag and/or has some --delay parameter between wikis
Open, Medium, Public

Description

What it says on the tin. Sometimes it would be nice to run foreachwiki(indblist) but slow down a lot in between each entry. Mostly useful for running MW scripts that are DB-intensive.

(I have no idea what to tag this with)

Event Timeline

demon triaged this task as Medium priority. Feb 21 2018, 2:12 AM
demon created this task.
mmodell subscribed.

(I have no idea what to tag this with)

ftfy

@demon As far as I know, individual maintenance scripts (even if run via foreachwiki) already take care of wait-for-replag, and in case of major writes, they typically also have a configurable sleep cycle, which can be passed down from foreachwiki. Is this not working well and/or what are you looking for specifically?

If a script has wait-for-slaves support, that would be between individual passes of that script for that wiki. I'm talking about a script that would do a single heavy query for a wiki, then exit. I'd like to wait before doing the next one.

The motivation was T187845, where we do some very expensive SELECT COUNT(*) ... calls. I thought it'd be nice to let things cool off before moving on, so that I'm not hammering a slave with a ton of expensive queries for a low-priority batch operation.

But if you think I'm optimizing prematurely here, we can decline.
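
To make the request concrete, here is a minimal sketch of a per-wiki loop with a delay between wikis. It is written in Python for illustration only, not in the bash the real wrapper uses. The names `read_dblist` and `for_each_wiki`, the `delay` parameter, the dblist layout (one database name per line, `#` for comments), and passing the wiki name as the last argument are all assumptions of the sketch, not the actual foreachwiki interface.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, List


def read_dblist(lines: Iterable[str]) -> List[str]:
    """Return wiki database names, skipping blank lines and '#' comments."""
    wikis = []
    for line in lines:
        name = line.split("#", 1)[0].strip()
        if name:
            wikis.append(name)
    return wikis


def run_command(command: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(command).returncode


def for_each_wiki(
    wikis: Iterable[str],
    command: List[str],
    delay: float = 0.0,
    run: Callable[[List[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Run `command` once per wiki, pausing `delay` seconds between wikis.

    The wiki name is appended as the last argument (an assumption of this
    sketch). Returns a mapping of wiki name to exit code.
    """
    wiki_list = list(wikis)
    results: Dict[str, int] = {}
    for index, wiki in enumerate(wiki_list):
        results[wiki] = run(command + [wiki])
        # Let the replica cool off, but do not sleep after the last wiki.
        if delay > 0 and index < len(wiki_list) - 1:
            sleep(delay)
    return results
```

The `run` and `sleep` parameters are injectable only so the loop can be exercised without starting processes or actually waiting. A fixed delay is the simplest option; it does not look at replication lag at all.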

If a script has wait-for-slaves support, that would be between individual passes of that script for that wiki. I'm talking about a script that would do a single heavy query for a wiki, then exit. I'd like to wait before doing the next one.

On T176754 I've built a maintenance script that runs in a loop and, at the end of each iteration, waits for replication to catch up. The decision whether to continue with the next iteration or exit the script is made after waiting for lag to settle, so you'd end up waiting between runs for different wikis as well. Why I'm explaining that here: as I understood it, all maintenance scripts should work that way. But probably not all do, so it seems more realistic to have this solved in foreachwiki(indblist) than to find and fix all maintenance scripts.
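
The loop shape described above could be sketched roughly as follows. This is Python for illustration, not MediaWiki's PHP maintenance code; `wait_for_replication`, `run_in_batches`, `get_lag`, and the thresholds are hypothetical names and values chosen for the sketch, and `get_lag` stands in for whatever reports replica lag in seconds.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def wait_for_replication(
    get_lag: Callable[[], float],
    max_lag: float = 5.0,
    poll_interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until replica lag is at most `max_lag` seconds.

    Returns True once lag has settled, or False if `timeout` seconds of
    polling pass first.
    """
    waited = 0.0
    while get_lag() > max_lag:
        if waited >= timeout:
            return False
        sleep(poll_interval)
        waited += poll_interval
    return True


def run_in_batches(
    batches: Iterable[T],
    process_batch: Callable[[T], None],
    get_lag: Callable[[], float],
    **wait_options,
) -> int:
    """Process each batch, waiting for replication at the end of each iteration."""
    done = 0
    for batch in batches:
        process_batch(batch)
        done += 1
        # Wait first, then decide whether to continue or exit. The wait
        # therefore also happens after the final batch, which is what puts
        # a pause between this wiki and the next one.
        wait_for_replication(get_lag, **wait_options)
    return done
```

Because the wait comes before the continue-or-exit decision, a script that runs a single heavy query and exits would still wait once before returning control to the wrapper.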

[..] maintenance script that runs in a loop and at the end of each iteration waits for replication to catch up. The decision whether to continue with the next iteration or exit the script would be done after waiting for lag to settle, [..] all maintenance scripts should work that way. But probably not all do, it seems more realistic to have this solved in foreachwiki(indblist) than to find and fix all maintenance scripts.

Actually, I think it would be preferable to solve this within our maintenance scripts. It should be fairly easy to do, it keeps MediaWiki-related logic contained within the code base, and it automatically improves all uses of maintenance scripts instead of only these specific wrappers used in wmf-production. These bash-based wrappers should be as simple as possible. Adding MediaWiki-specific logic to them, plus logic to connect to MySQL, would be fairly complicated, and might also require extra maintenance in a separate repository (puppet instead of mediawiki/core), which we'd likely forget about and sometimes break.