This is a UBN! (Unbreak Now!) task that needs to be resolved ASAP. Made per @Lydia_Pintscher.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Invalid | | None | T108944 [Epic] Improve change dispatching |
| Resolved | | hoo | T179060 Dispatchers occasionally seem to "freeze" for certain wikis |
| Resolved | | Addshore | T178624 enwiki dispatch lag is 8 hours |
Event Timeline
This is fine again, @Addshore ran extra dispatchers yesterday.
I investigated the problem, but couldn't find anything useful… if this happens again while I'm around, I might be able to dig into this. Nevertheless I opened T178652: Wikidata dispatchers should use a LockManager with a short TTL which may or may not be related.
We had dispatch lag again today (enwiki lagged especially).
I ran an extra dispatcher with some of the script parameters tweaked (a shorter --sleep time, since it was sleeping a lot while there was still a lot to dispatch).
Also, I think --lock-grace-interval could be set to a smaller value, and possibly --dispatch-interval as well.
Sadly --lock-grace-interval doesn't work with the LockManager logging backend. See also T178652: Wikidata dispatchers should use a LockManager with a short TTL.
@hoo ok :/
Adjusting the sleep time helped when I ran the extra dispatcher, and I think a shorter dispatch-interval also helped somewhat.
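To illustrate why a shorter sleep can help while there is still a backlog, here is a rough back-of-the-envelope model. It is only a sketch: the function name, the parameters, and all the numbers are hypothetical and are not taken from the dispatch script or from production measurements.

```python
def changes_per_hour(batch_size, pass_seconds, sleep_seconds):
    """Upper bound on changes one dispatcher can push per hour.

    Toy model: every cycle is one dispatch pass followed by one sleep.
    The parameters are hypothetical knobs, not the real script's options.
    """
    cycle_seconds = pass_seconds + sleep_seconds
    return batch_size * 3600 / cycle_seconds


# Hypothetical numbers: 1000 changes per pass, 5 s of work per pass.
slow = changes_per_hour(1000, 5, 10)  # long sleep between passes
fast = changes_per_hour(1000, 5, 1)   # shorter sleep between passes
print(slow, fast)
```

In this model the sleep time dominates the cycle whenever it is longer than the pass itself, so cutting it raises the ceiling on throughput even though nothing about the pass got faster.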
So, while fixing the 8 hours of lag by running extra dispatchers, I inadvertently locked all clients out of having changes dispatched for 1 hour.
I don't think the underlying issue of the 8 hours of lag for enwiki was caused by the long lock time, but the long lock time should also be addressed.
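For context on why a short lock TTL (as proposed in T178652) would limit this kind of lockout, here is a minimal in-memory sketch. The `TtlLock` class and the dispatcher names are made up for illustration; this is not the LockManager API and not how the dispatchers actually lock.

```python
import time


class TtlLock:
    """Toy in-memory lock whose hold expires after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so the example can fake time
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, who):
        now = self.clock()
        if self.owner is not None and now < self.expires_at:
            return False  # lock is still held and has not expired
        self.owner = who
        self.expires_at = now + self.ttl
        return True

    def release(self, who):
        if self.owner == who:
            self.owner = None


# Fake clock so the example does not need to wait in real time.
now = [0.0]
lock = TtlLock(ttl=60, clock=lambda: now[0])

first = lock.acquire("dispatcher-1")      # takes the lock
blocked = lock.acquire("dispatcher-2")    # refused while the lock is live
now[0] = 61.0                             # dispatcher-1 never released it
recovered = lock.acquire("dispatcher-2")  # TTL has passed, lock is free
print(first, blocked, recovered)
```

With a TTL of an hour instead of a minute, the second dispatcher in this sketch would stay blocked for the full hour if the first one never released the lock.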
Change 386591 had a related patch set uploaded (by Hoo man; owner: Hoo man):
[operations/puppet@production] Log Wikidata dispatchers on terbium