Ideally it would use some "majority of non-zero-load slaves" logic rather than requiring all slaves to be up to date.
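As a rough illustration of that idea, a wait condition could pass once most slaves that carry real traffic (non-zero load) have caught up, ignoring zero-load ones. This is a hypothetical sketch, not MediaWiki's actual API; `majority_caught_up` and its inputs are invented for the example.

```python
def majority_caught_up(slaves, max_lag):
    """slaves: list of (load, lag_seconds) tuples for each replica."""
    # Only consider slaves that actually receive traffic (load > 0).
    active = [lag for load, lag in slaves if load > 0]
    if not active:
        return True  # nothing to wait for
    caught_up = sum(1 for lag in active if lag <= max_lag)
    # Strict majority of the loaded slaves must be within max_lag.
    return caught_up * 2 > len(active)

# Three loaded slaves, plus a zero-load dump slave that is far behind:
slaves = [(10, 0.2), (10, 0.5), (5, 45.0), (0, 3600.0)]
print(majority_caught_up(slaves, max_lag=1.0))  # → True: 2 of 3 loaded slaves are current
```

The dump slave is excluded entirely, and the one lagged loaded slave no longer blocks progress; whether that trade-off is acceptable is exactly what the rest of this thread debates.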
This is a more complex issue than it seems, aaron. Because we have "groups", if all slaves are up to date except one, say the recentchanges one, which is behind, we now have a degradation of service. Ideally, in that case, that service would jump to a secondary server, or to the main-traffic one; in practice, I do not see that happening (especially due to special partitioning and buffer pool status).
At present, because of recent issues, we have set up the dump slave (the one most prone to falling behind) with load 0, and all the others with load at least 1 so they are taken into account for lag, even though they shouldn't receive main traffic.
This is not an easy task, and most of it has to do with slave groups, which are at once a great idea and a threat to availability.
I don't understand how we can implement the task as described. It's intentional that write-heavy maintenance scripts go at the speed of the slowest slave. If you only wait for a majority then you could have 50% of slaves permanently lagged, potentially by days or weeks.
There are lots of ways to go about this. If there is just one server lagging, then it might be better to redirect traffic away from it rather than slow down for it (or stop the world if it is totally broken). If the lagged servers are only in vslow/dump, it might make sense to tolerate more lag; this helps with spikes of jobs, though not with permanently high overall load. It gets complicated fairly quickly, though... I can't think of anything simple.
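The branching policy described above could be sketched roughly as follows. The function name, thresholds, and the simple "fewer than half lagged" rule are all illustrative assumptions, not anything implemented in MediaWiki:

```python
def plan(lags, lag_limit=5.0):
    """Decide what to do given a mapping of host -> replication lag (seconds)."""
    lagged = [host for host, lag in lags.items() if lag > lag_limit]
    if not lagged:
        return ("proceed", [])
    if len(lagged) < len(lags) / 2:
        # A minority of stragglers: redirect traffic away from them.
        return ("depool", lagged)
    # Widespread lag: depooling would overload the rest, so slow writers down.
    return ("throttle", lagged)

print(plan({"db1": 0.1, "db2": 0.3, "db3": 42.0}))
# → ('depool', ['db3'])
```

This captures the "one slow server vs. everything slow" distinction, but not the vslow/dump group special-casing, which would need per-group thresholds.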
Talked about this at the offsite. We decided it probably doesn't make sense to exclude a lagged slave only for the purposes of LoadBalancer::waitForAll(). Instead, these slaves should possibly be depooled entirely, for example via MediaWiki's rdbms/LoadMonitor state in APC, or via a proxy of sorts.
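A minimal sketch of the "depool entirely" direction: a pool that tracks per-host lag (in practice this would be refreshed from a shared cache such as APC) and simply skips lagged hosts when choosing a connection. The class name, thresholds, and interface are invented for illustration and do not mirror the actual rdbms/LoadMonitor API:

```python
import random

class LagAwarePool:
    def __init__(self, hosts, lag_limit=10.0):
        self.hosts = hosts                    # host -> weight
        self.lag_limit = lag_limit
        self.lags = {h: 0.0 for h in hosts}   # host -> last measured lag

    def record_lag(self, host, lag):
        # In the real setup this would come from a periodic monitor
        # writing into a shared cache (e.g. APC), not a direct call.
        self.lags[host] = lag

    def pick(self):
        # Depool lagged or zero-weight hosts before weighted selection.
        pooled = {h: w for h, w in self.hosts.items()
                  if self.lags[h] <= self.lag_limit and w > 0}
        if not pooled:
            # All replicas lagged: fail fast rather than serve stale data.
            raise RuntimeError("no usable replicas")
        hosts, weights = zip(*pooled.items())
        return random.choices(hosts, weights=weights)[0]

pool = LagAwarePool({"db1": 10, "db2": 10, "db3": 5})
pool.record_lag("db3", 120.0)         # db3 falls behind and is skipped
print(pool.pick() in ("db1", "db2"))  # → True
```

The same depooling effect could instead live in a proxy in front of the databases, as the comment suggests; the trade-off is where the lag measurements are collected and how quickly depooling reacts.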
Merged into T180918.